WGLab / LIQA

Long-read Isoform Quantification and Analysis

Quantify error #4

Closed JesseBNL closed 6 months ago

JesseBNL commented 3 years ago

Hi,

Great tool you developed, and congratulations on the paper. I am using this tool to define complex isoform profiles, and so far it works well. However, I have two questions.

First, when I run the following command, I get an error at the end. Even though the output is generated, I would like to know what the error means.

liqa -task quantify -refgene HLA.refgene -bam sorted_gencode.bam -out HLA_expression -max_distance 20 -f_weight 1
HLA-A   3483 reads detected...
HLA-A   40 iterations   Done!
Traceback (most recent call last):
  File "/home/algemeen/anaconda3/lib/python3.7/site-packages/liqa_src/quantify.py", line 130, in <module>
    geneStart = int(tmpgeneinf[3])
ValueError: invalid literal for int() with base 10: ''

Second, how is the correction for reads per gene conducted? Is it possible to obtain absolute numbers per transcript? In total, 3483 transcripts mapped to HLA-A, but in the output file the #1 transcript shows 34.306.754.509.278.400 for ReadPerGene_correct, whereas the #2 shows 2.088.930.342.217.470. What do those values represent?

Thank you in advance.

huyustats commented 3 years ago

Hi @JesseBNL ,

Thank you for your interest in using LIQA. The error message indicates that the starting position of HLA-A is an invalid number: the value passed to int() was an empty string. There might be an error in the HLA.refgene file, so please double-check it. Also, to interpret the results, could you share the output file with me? Thanks!

JesseBNL commented 3 years ago

Hi Huyustats,

Thank you for your reply.

As far as I can see, there is no invalid number in the refgene file. I have attached the refgene and output files here. I added the .txt extension to make the upload possible.

HLA.refgene.txt

HLA_expression.txt

agouru55 commented 6 months ago

From the refgene.txt file, it appears that the first line is being treated as part of the EM matrix rather than as a header, which would explain the invalid number. Make sure that line is handled as a header and not as a matrix row, then see whether the expression file changes at all.
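One way to narrow this down is to scan the file for lines whose start-position field cannot be parsed as an integer. The snippet below is a minimal sketch, not LIQA code: the function name `find_bad_start_fields` is made up, and it assumes the file is tab-separated with the gene start in field index 3 (the index comes from `tmpgeneinf[3]` in the traceback; the delimiter is a guess). The sample rows are invented placeholders, not taken from HLA.refgene.

```python
def find_bad_start_fields(lines, field_index=3, sep="\t"):
    """Return (line_number, reason) pairs for lines whose field at
    field_index is missing or cannot be parsed by int().

    Assumption: one record per line, fields separated by `sep`.
    """
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        fields = raw.rstrip("\n").split(sep)
        if len(fields) <= field_index:
            # Too few columns, e.g. a blank line or a header line.
            problems.append((lineno, f"only {len(fields)} field(s)"))
            continue
        try:
            int(fields[field_index])
        except ValueError:
            # Same failure mode as the traceback: int('') or int('text').
            problems.append((lineno, f"field {field_index} is {fields[field_index]!r}"))
    return problems


if __name__ == "__main__":
    # Placeholder rows for illustration only.
    sample = [
        "geneA\tchrN\t+\t100\t200\n",
        "geneA\tchrN\t+\t\t200\n",   # empty start field
        "\n",                        # blank line
    ]
    for lineno, reason in find_bad_start_fields(sample):
        print(f"line {lineno}: {reason}")
```

To check a real file, pass an open file handle, for example `find_bad_start_fields(open("HLA.refgene"))`. If only line 1 is reported, that would be consistent with a header or otherwise non-data first line being parsed as a record.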