Closed koisland closed 7 months ago
In the HGSVC3 assemblies, for certain chromosomes, filtering dna-brnn output incorrectly omits smaller, correct HOR arrays.
The last two samples shown below (HG00096 and HG03732) were omitted because, after running bedminmax.py, their lengths were less than 1,000,000 bp.
(base) [koisland@sarlacc dna_brnn]$ cat chrY_H*_contigs.fwd.ALR.bed chrY_N*_contigs.fwd.ALR.bed
NA19239_chrY_haplotype1-0000027 *10010673 10027694 2 17021
NA19239_chrY_haplotype1-0000027 10142915 10147965 2 5050
NA19239_chrY_haplotype1-0000027 10150015 10154765 2 4750
NA19239_chrY_haplotype1-0000027 10155962 10160265 2 4303
NA19239_chrY_haplotype1-0000027 10162015 10176265 2 14250
NA19239_chrY_haplotype1-0000027 10183415 11016615 2 833200
NA19239_chrY_haplotype1-0000027 11017015 *11021215 2 4200
(base) [koisland@sarlacc dna_brnn]$ cat chrY_H*_contigs.rev.ALR.bed chrY_N*_contigs.rev.ALR.bed
HG00096_chrY_haplotype1-0000033 *23777138 23794155 2 17017
HG00096_chrY_haplotype1-0000033 23656906 23661955 2 5049
HG00096_chrY_haplotype1-0000033 23650105 23654805 2 4700
HG00096_chrY_haplotype1-0000033 23644919 23648905 2 3986
HG00096_chrY_haplotype1-0000033 23628905 23643205 2 14300
HG00096_chrY_haplotype1-0000033 23294605 23621805 2 327200
HG00096_chrY_haplotype1-0000033 23290005 *23294205 2 4200
HG03732_chrY_haplotype1-0000027 *41879184 41896234 2 17050
HG03732_chrY_haplotype1-0000027 41758934 41764034 2 5100
HG03732_chrY_haplotype1-0000027 41752134 41756884 2 4750
HG03732_chrY_haplotype1-0000027 41746634 41750934 2 4300
HG03732_chrY_haplotype1-0000027 41730634 41744884 2 14250
HG03732_chrY_haplotype1-0000027 41171934 41723485 2 551551
HG03732_chrY_haplotype1-0000027 41167333 *41171534 2 4201
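To make the omission concrete, here is a minimal sketch of the length check. It assumes bedminmax.py collapses each contig to its smallest start and largest end (the starred coordinates above), and that the filter keeps spans of at least 1,000,000 bp. The names `min_max_span` and `MIN_LEN` are hypothetical and are not taken from the actual workflow.

```python
# Assumed threshold, taken from the issue text; the real filter may differ.
MIN_LEN = 1_000_000

# (contig, start, end) for the outermost intervals of each contig above.
# Only these matter for a min/max collapse, so the inner intervals are omitted.
intervals = [
    ("NA19239_chrY_haplotype1-0000027", 10010673, 10027694),
    ("NA19239_chrY_haplotype1-0000027", 11017015, 11021215),
    ("HG00096_chrY_haplotype1-0000033", 23777138, 23794155),
    ("HG00096_chrY_haplotype1-0000033", 23290005, 23294205),
    ("HG03732_chrY_haplotype1-0000027", 41879184, 41896234),
    ("HG03732_chrY_haplotype1-0000027", 41167333, 41171534),
]


def min_max_span(rows):
    """Collapse all intervals of a contig to (min start, max end)."""
    spans = {}
    for contig, start, end in rows:
        lo, hi = spans.get(contig, (start, end))
        spans[contig] = (min(lo, start), max(hi, end))
    return spans


spans = min_max_span(intervals)
lengths = {contig: hi - lo for contig, (lo, hi) in spans.items()}
kept = [contig for contig, length in lengths.items() if length >= MIN_LEN]

for contig, length in lengths.items():
    status = "kept" if contig in kept else "omitted"
    print(f"{contig}\t{length}\t{status}")
```

With these coordinates the collapsed spans are 1,010,542 bp (NA19239), 504,150 bp (HG00096) and 728,901 bp (HG03732), so only NA19239 clears the threshold, which matches the behavior reported here.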