WGLab / LIQA

Long-read Isoform Quantification and Analysis

ValueError: invalid contig while quantifying isoform expression #8

Closed saifashraf closed 3 months ago

saifashraf commented 3 years ago

Hello Yu and Kai,

Thanks for making this software freely available. I am trying to quantify isoform expression for a test dataset (BAM file attached) that was aligned with minimap2, then sorted and indexed with samtools, using this command:

    liqa -task quantify -refgene human.refgene -bam /path/to/bam/file/C1.bam -out isoform_expression_estimates_C1 -max_distance 20 -f_weight 1

I got this error:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/liqa_src/quantify.py", line 153, in <module>
        for read in bamFilePysam.fetch(geneChr, geneStart, geneEnd):
      File "pysam/libcalignmentfile.pyx", line 1082, in pysam.libcalignmentfile.AlignmentFile.fetch
      File "pysam/libchtslib.pyx", line 685, in pysam.libchtslib.HTSFile.parse_region
    ValueError: invalid contig 5

I'm wondering where I'm going wrong. It throws the same error when I use the example datasets provided with this package.

Thanks in advance for your help.

C1.zip

huyustats commented 3 years ago

Hi @saifashraf,

Thanks for your interest in using LIQA. After checking your uploaded BAM file, I found that all reads were aligned to the transcriptome instead of the genome. LIQA is designed to quantify isoform expression from reads mapped to the genome. Also, the chromosome names in the BAM file should match those in the human.refgene reference file. Thanks!
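For anyone hitting the same `invalid contig` error, one way to check the point above is to compare the contig names in the BAM header with the chromosome names used in the annotation. The sketch below is only an illustration: `compare_contigs` and `bam_contig_names` are hypothetical helper names, not part of LIQA, and it assumes you have already collected the chromosome names from your refgene file yourself, since the layout of that file is not described in this thread. The only pysam call used is `AlignmentFile.references`.

```python
from typing import Dict, Iterable, List, NamedTuple


class ContigReport(NamedTuple):
    missing: List[str]              # annotation chromosomes absent from the BAM header
    prefix_fixable: Dict[str, str]  # missing name -> BAM contig that differs only by "chr"


def _strip_chr(name: str) -> str:
    """Return the contig name without a leading 'chr' prefix."""
    return name[3:] if name.startswith("chr") else name


def compare_contigs(bam_contigs: Iterable[str],
                    annotation_chroms: Iterable[str]) -> ContigReport:
    """Report annotation chromosomes that the BAM header does not contain."""
    bam = set(bam_contigs)
    by_core = {_strip_chr(c): c for c in bam}
    missing = sorted(set(annotation_chroms) - bam)
    # A missing name is "fixable" if a BAM contig matches once "chr" is ignored.
    fixable = {m: by_core[_strip_chr(m)] for m in missing if _strip_chr(m) in by_core}
    return ContigReport(missing, fixable)


def bam_contig_names(bam_path: str) -> List[str]:
    """Read contig names from a BAM header (requires the third-party pysam package)."""
    import pysam  # imported here so compare_contigs works without pysam installed
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        return list(bam.references)
```

If `missing` is non-empty and `prefix_fixable` is empty, the BAM header probably uses a different naming scheme altogether, which is what you would see when reads were aligned to transcript sequences rather than to the genome. If every missing name appears in `prefix_fixable`, the two files likely differ only by the `chr` prefix.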

rsalz commented 2 years ago

> Hi @saifashraf,
>
> Thanks for your interest in using LIQA. After checking your uploaded BAM file, I found that all reads were aligned to the transcriptome instead of the genome. LIQA is designed to quantify isoform expression from reads mapped to the genome. Also, the chromosome names in the BAM file should match those in the human.refgene reference file. Thanks!

I was actually planning to do the same thing as the OP. What is the reason for disallowing alignments to a transcriptome? As LIQA is currently set up, it does not seem to allow quantification of novel isoform sequences. I would like to take the novel sequences I discovered using TALON and quantify them with LIQA. Do you think I could do this successfully with just the genome alignment and a refgene file containing the novel sequences? (I tried this already, but I am running into the same pysam error that I get when I use the reference annotation, so I'm not sure it will work.)

huyustats commented 2 years ago

Hi @rsalz, Thanks for your interest in using LIQA. Yes, LIQA can quantify novel transcripts as long as they are correctly defined in the refgene file. Could you upload the files (BAM and refgene) you used? I will fix the pysam error. Thanks!

rsalz commented 2 years ago

Hi @huyustats, could you send me your email address? My data is unpublished, so I would like it to be treated as confidential.

huyustats commented 2 years ago

Hi @rsalz, my email is huyu999999@gmail.com. Thanks!