Open Carldeboer opened 7 years ago
Just encountered the same issue. I think the problem is the chromosome count: I have an hg38 bigwig file from UCSC that contains all the alternative chromosomes, like chr4_KI270925v1_alt. bx-python's get_as_array() "sees" only chromosomes 1 to 5 and 10 to 22. Looking more carefully, it turns out that if the chromosomes are ordered alphabetically, bx-python "sees" exactly the first 256: it finds everything up to "chr5_GL383532v1_alt" (the 256th chromosome) and nothing from "chr5_GL949742v1_alt" (the 257th chromosome) onward.
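To check whether a particular file would be affected, the cutoff described above can be sketched in plain Python. This is only an illustration of the observation, not of bx-python's internals: it assumes the chromosome names have already been listed by some other means, that "alphabetically ordered" means ordinary string sorting, and that the limit is exactly 256 as observed. split_visible and the example names are hypothetical.

```python
def split_visible(chrom_names, limit=256):
    """Split names into (visible, hidden) under the observed 256-name cutoff.

    Assumption: names are ordered by plain string comparison and only the
    first `limit` of them are found. This mirrors the behaviour reported
    above; it does not call or reproduce any bx-python code.
    """
    ordered = sorted(set(chrom_names))
    return ordered[:limit], ordered[limit:]


# Made-up names standing in for a real chromosome list.
names = ["chr%d" % i for i in range(1, 23)]
names += ["chrUn_%03d" % i for i in range(300)]

visible, hidden = split_visible(names)
print(len(visible), "names within the cutoff;", len(hidden), "beyond it")
```

If a chromosome that get_as_array() fails to find shows up in the hidden list for the real file, that would be consistent with the 256-chromosome explanation.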
I have a program that uses bx-python to extract data from a bigwig file for specific BED loci. For some reason, certain chromosomes are sometimes skipped as having no data (BigWigFile.get_as_array(chrom, startPos, endPos) returns None). Looking at the contents of the bigwig file manually or in IGV shows that there is in fact data for the "missing" chromosomes. This is an example peak that is being skipped:
chr6 157355 157505 chr6:157355-157505
Same region on IGV:
Looking at this region on the command line:
The following code is enough to reproduce the issue:
and produces this output:
Note that None appears to be returned only when a chromosome is not found; when the chromosome is found but there is no data for the region, it returns a list of NaNs.
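The distinction between the two cases can be made explicit with a small check. This is a minimal sketch based only on the behaviour described above (None for a chromosome that is not found, all NaNs for a region without data); classify_result is a hypothetical helper and not part of bx-python.

```python
import math


def classify_result(values):
    """Label the result of a get_as_array()-style call.

    Assumption (from the behaviour reported above): None means the
    chromosome was not found, while a sequence made up entirely of NaNs
    means the chromosome was found but the region holds no data.
    """
    if values is None:
        return "chromosome not found"
    if all(math.isnan(v) for v in values):
        return "no data in region"
    return "has data"


# Stand-in values; in practice these would come from get_as_array().
nan = float("nan")
print(classify_result(None))
print(classify_result([nan, nan, nan]))
print(classify_result([nan, 0.5, nan]))
```

In a script that loops over BED loci, logging this label for each peak makes it easy to tell peaks skipped because the chromosome lookup failed from peaks that genuinely have no signal.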
I can provide the BW if that would be helpful. I need somewhere to put it (150MB).