J35P312 / TIDDIT

TIDDIT - structural variant calling

IndexError: index out of bounds with hs38DH.fa aligned bam #9

Closed countdigi closed 6 years ago

countdigi commented 6 years ago

I'm running into the following exception when processing a bam aligned against UM's GRCh38 reference genome fasta (ftp://share.sph.umich.edu/gotcloud/ref/hs38DH-db142-v1.tgz):

    Traceback (most recent call last):
      File "/TIDDIT/TIDDIT.py", line 54, in <module>
        TIDDIT_clustering.cluster(args)
      File "/TIDDIT/TIDDIT_clustering.py", line 681, in cluster
        header,chromosomes,library_stats=signals(args,coverage_data)
      File "/TIDDIT/TIDDIT_clustering.py", line 164, in signals
        coverage_data[chrB][int(math.floor(posB/100)),2]+=1
    IndexError: index 18 is out of bounds for axis 0 with size 18
    time: 7:52.05

[hg38.bam.gz](https://github.com/J35P312/TIDDIT/files/2048193/hg38.bam.gz)
[hs38DH.bam.gz](https://github.com/J35P312/TIDDIT/files/2048194/hs38DH.bam.gz)
[TST01_R1.fastq.gz](https://github.com/J35P312/TIDDIT/files/2048195/TST01_R1.fastq.gz)
[TST01_R2.fastq.gz](https://github.com/J35P312/TIDDIT/files/2048196/TST01_R2.fastq.gz)
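
For readers unfamiliar with this kind of error: the failing line in the traceback bins a position into 100 bp windows (`posB/100`) and uses the result as an array index. The sketch below shows one generic way such an index can land one past the end of an array. It is not TIDDIT's code and not a diagnosis of this issue; the contig length, the read position, and the two sizing functions are hypothetical, and plain lists stand in for the arrays.

```python
import math

BIN_SIZE = 100  # bin width, taken from the posB/100 expression in the traceback


def n_bins_floor(contig_length, bin_size=BIN_SIZE):
    # Hypothetical sizing that drops the final partial bin.
    return contig_length // bin_size


def n_bins_ceil(contig_length, bin_size=BIN_SIZE):
    # Sizing that always covers the last position of the contig.
    return int(math.ceil(contig_length / float(bin_size)))


def bin_index(pos, bin_size=BIN_SIZE):
    # Same binning expression as in the traceback.
    return int(math.floor(pos / float(bin_size)))


contig_length = 1850  # hypothetical short contig
pos = 1820            # hypothetical read position near the contig end

too_small = [0] * n_bins_floor(contig_length)    # 18 bins, valid indices 0..17
large_enough = [0] * n_bins_ceil(contig_length)  # 19 bins, valid indices 0..18

idx = bin_index(pos)  # 18
try:
    too_small[idx] += 1
except IndexError as err:
    print("floor-sized array failed:", err)

large_enough[idx] += 1  # succeeds because the partial last bin exists
```

A position inside the last partial bin of a contig is enough to trigger the error when the array is sized by rounding down, which matches the "index 18 ... size 18" pattern in the message.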

I ran a bisection batch script and narrowed this down to a small fastq pair.
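
The bisection approach mentioned above can be sketched roughly as follows. This is an illustration, not the script that was actually used: `bisect_failing_records` and the `fails` callback are hypothetical names, and the sketch assumes a small subset of records is enough to reproduce the failure. For a paired fastq input, each record would be one read pair, and `fails` would write the subset to disk, align it, and run the tool.

```python
def bisect_failing_records(records, fails):
    """Shrink `records` to a smaller subset for which `fails` is still True.

    `fails(subset)` must return True when the subset reproduces the failure.
    Assumes the full input fails; raises ValueError otherwise.
    """
    if not fails(records):
        raise ValueError("the full input does not fail")
    while len(records) > 1:
        mid = len(records) // 2
        first, second = records[:mid], records[mid:]
        if fails(first):
            records = first
        elif fails(second):
            records = second
        else:
            # The failure needs records from both halves; stop shrinking.
            break
    return records


# Toy usage: pretend record 637 is the one that triggers the crash.
records = list(range(1000))
culprit = bisect_failing_records(records, lambda subset: 637 in subset)
print(culprit)
```

Each round halves the input, so even a large fastq pair is reduced in a logarithmic number of runs, provided the failure is reproducible.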

The standard UCSC hg38.fa is working with TIDDIT for comparison.

Does anything stick out at you or any suggestions how I can help further investigate?

I am providing the small fastq pair along with the bams aligned against UM's hs38DH and against hg38. You should only see the above exception with the one aligned against hs38DH (UM's fasta).

Thanks! Kevin

J35P312 commented 6 years ago

Hello there! I will have a look at this! I ran tiddit using the hs38DH bam file, but it did not crash. I used the following command:

    python TIDDIT.py --sv --bam /home/jesper/Downloads/hs38DH.bam --ref /home/jesper/external_data/references/hs38DH.fa

Could you send me the tiddit command you are using? Also, double-check that you are using the latest version of tiddit (2.2.2, I think)! Also make sure that you have write permissions and that your hard drive is not full (I got a similar error once when I had too little space on my hard drive).

What version of python are you using? I'm using 2.7.14
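
The write-permission and free-space checks suggested above can be done quickly from Python. This is a minimal sketch rather than anything TIDDIT provides: `check_output_dir` and the 1 GiB threshold are hypothetical, and `shutil.disk_usage` requires Python 3.3 or newer, so it would not run under the Python 2.7 mentioned in this thread.

```python
import os
import shutil


def check_output_dir(path, min_free_bytes=1024 ** 3):
    """Report whether `path` is writable and has at least `min_free_bytes` free."""
    writable = os.access(path, os.W_OK)
    free = shutil.disk_usage(path).free
    return {
        "writable": writable,
        "free_bytes": free,
        "enough_space": free >= min_free_bytes,
    }


# Check the current working directory; point this at the output directory instead.
report = check_output_dir(os.getcwd())
print(report)
```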

Good luck! =P

output.vcf.zip

countdigi commented 6 years ago

Apologies. Although I automated the testing, I must have fat-fingered something while translating it for the issue report, as I am getting an OK now as well. I'll get back to you, and I'll make sure to provide the commit SHA and that everything is up to date. Thanks for helping. So is this the new fork from SciLifeLab/TIDDIT? Thanks again! Kevin

J35P312 commented 6 years ago

No problem! And thanks for testing tiddit! The SciLifeLab/TIDDIT fork is kind of a stable fork: I only push updates to it when I have tested things properly (and sometimes I forget to update it as well =P). I use this fork for research projects and development; it is more up to date, but sometimes a bit more buggy. I pushed the latest changes to the SciLifeLab/TIDDIT fork, so now they are both up to date!