csw / bioruby-maf

MAF parser for BioRuby
MIT License

Bio-maf not working with new MAF-files #117

Open ghost opened 9 years ago

ghost commented 9 years ago

The new MAF files are not split by chromosome, so when I try to look up e.g. hg38.chr1 I get the message

/usr/local/lib/ruby/gems/2.1.0/gems/bio-maf-1.0.1/lib/bio/maf/index.rb:266:in `chrom_index': No index available for chromosome hg38.chr1! (RuntimeError)

However, if I try to look up hg38.chr8 it works, presumably because that is the very first chromosome in the maf-file.

endrebakkenstovner@dhcp-033193 ~/pipeline_sirnas> head -3 data/hg38.7way.maf
##maf version=1 scoring=blastz
a score=96727.000000
s hg38.chr8     60297 1234 + 145138636 AAAGACTTCTTGTCTTTATTTTGTTCCCATGCCTACCTTTTAGCCATAATACAACAGAAblablabla

This happens with all other MAF parsers too, so if you update yours you will have the only one that actually works.

ghost commented 9 years ago

Thanks to an earlier issue here, I've come to realize that the problem is likely not that the file contains multiple chromosomes, but rather that the chromosomes are scattered all over the file.

ack -oh "hg38.chr[0-9]*" data/hg38.7way.maf | uniq

results in output like

...
hg38.chr9
hg38.chr17
hg38.chr1
hg38.chr17
hg38.chr11
hg38.chr17
hg38.chr
hg38.chr11
hg38.chr1
hg38.chr4
hg38.chr11
hg38.chr17
hg38.chr15
...
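
For anyone who wants to check their own file without ack, here is a rough Ruby sketch of the same test. It does not use bio-maf at all. It assumes the reference sequence is the first "s" line after each "a" line, and the method name reference_runs is just something made up for this example. It counts how many separate contiguous runs each reference sequence forms, so a count above 1 means that chromosome is scattered.

```ruby
# Count how many separate contiguous runs each reference sequence forms.
# Assumption: the reference is the first "s" line following each "a" line.
# `lines` can be any enumerable of strings, e.g. File.foreach(path).
def reference_runs(lines)
  runs = Hash.new(0)
  previous = nil
  expect_reference = false

  lines.each do |line|
    if line =~ /\Aa(\s|\z)/
      # A new alignment block starts; its first "s" line is the reference.
      expect_reference = true
    elsif expect_reference && line.start_with?("s ")
      source = line.split[1] # e.g. "hg38.chr8"
      runs[source] += 1 unless source == previous
      previous = source
      expect_reference = false
    end
  end

  runs
end

# Hypothetical usage with the file from this issue:
# reference_runs(File.foreach("data/hg38.7way.maf")).each do |source, n|
#   puts "#{source}\t#{n}"
# end
```

If every sequence comes back with a count of 1, the file is grouped by chromosome and the scattering described above is probably not the cause.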