brentp / goleft

goleft is a collection of bioinformatics tools distributed under MIT license in a single static binary

indexcov: Handles at most 31 input files (bigger inputs result in nasty error message) #37

Closed kpalin closed 6 years ago

kpalin commented 6 years ago

31 samples work fine but with 32 the output ends in:

2017/11/06 10:57:13 indexcov: running on 32 indexes
panic: runtime error: index out of range

goroutine 1 [running]:
github.com/brentp/goleft/indexcov.(*Index).NormalizedDepth(0xc4200b6820, 0x19, 0x0, 0xc488f646c0, 0x46)
	/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:113 +0x1c6
github.com/brentp/goleft/indexcov.run(0xc4200f2840, 0x56, 0x56, 0xc4200d4300, 0x20, 0x20, 0xc4200d6600, 0x20, 0x20, 0xc48801d710, ...)
	/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:520 +0x8fb
github.com/brentp/goleft/indexcov.Main()
	/home/brentp/go/src/github.com/brentp/goleft/indexcov/indexcov.go:364 +0x529
main.main()
	/home/brentp/go/src/github.com/brentp/goleft/cmd/goleft/goleft.go:68 +0x17f

brentp commented 6 years ago

sorry for the troubles. lots of people are running indexcov on hundreds of samples, so it must be something odd, likely with your 32nd sample. Can you narrow it down to a single sample and send me the problematic crai?

brentp commented 6 years ago

can you let me know what organism this is as well? or send the fai file?

brentp commented 6 years ago

looks like you have 2 million sequences in there!

kpalin commented 6 years ago

Sent you a sample crai. The DNA is mostly human but the reference includes pretty much all known viral genomes.

brentp commented 6 years ago

is this one fixed by the update from yesterday?

brentp commented 6 years ago

ok. I see that it's not. I am looking into this now. Can you reindex the cram with the most recent samtools and verify you still see this? there were some bugs in the cram indexing that affected indexcov until recently.

brentp commented 6 years ago

I found the offending lines in your crai. I have posted a question to samtools mailing list to ask if it is valid, but I think not. So, I think it is either due to a bug in cram indexing or some other issue outside of indexcov. But we'll see what the response is.

brentp commented 6 years ago

See response from samtools developer here: https://sourceforge.net/p/samtools/mailman/message/36106841/

Can you make sure your cram is sorted and re-index?

kpalin commented 6 years ago

OK. Re-sorting and re-indexing with samtools 1.6 made the error go away in this case. I still have a few other problematic crai files to check and need to figure out why the cram wasn't sorted. Will get back to you if something major arises.