WGLab / LIQA

Long-read Isoform Quantification and Analysis

single isoform genes are excluded #15

Closed nhartwic closed 3 months ago

nhartwic commented 2 years ago

I get that LIQA was designed for differential isoform abundance analysis, but is there an actual technical reason single-isoform genes are excluded from quantification? As is, including them would be a really nice feature for people who want to do both a DE analysis AND an isoform-level analysis.

huyustats commented 2 years ago

Thank you for your interest in using LIQA and for the valuable suggestions! We are currently working on the DE detection part, and single-isoform gene expression will be quantified. Thanks!

alanlamsiu commented 2 years ago

Hi~ I just gave LIQA a try recently. As @nhartwic mentioned, gene names with only one isoform are not found in the .refgene file and thus are not included in the quantification output. Additionally, since LIQA groups genes by gene_name, two genes can share the same gene_name while having different gene_ids; GENCODE v31, for example, has this issue. Using gene_name as the ID will therefore merge genes with different gene_ids into a single expression estimate.
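To check whether an annotation is affected by the gene_name collision described above, something like the following could be run on the GTF before preprocessing. This is only a rough sketch and not part of LIQA: the function names `parse_attributes` and `ambiguous_gene_names` are made up for illustration, and it assumes a GENCODE-style attribute column with quoted `gene_id` and `gene_name` values.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Parse the 9th GTF column into a dict (quoted attributes only)."""
    return dict(ATTR_RE.findall(field))


def ambiguous_gene_names(gtf_lines):
    """Return {gene_name: [gene_ids]} for names that map to more than one gene_id."""
    ids_by_name = defaultdict(set)
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        attrs = parse_attributes(cols[8])
        if "gene_id" in attrs and "gene_name" in attrs:
            ids_by_name[attrs["gene_name"]].add(attrs["gene_id"])
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_gene_names.py annotation.gtf
    with open(sys.argv[1]) as handle:
        for name, ids in sorted(ambiguous_gene_names(handle).items()):
            print(name, ",".join(ids), sep="\t")
```

Any gene_name it reports would be grouped together if gene_name is used as the key, so those are the genes whose expression levels are worth double-checking.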

huyustats commented 2 years ago

@alanlamsiu Thank you for your feedback. I am updating the code to output read counts and expression levels for single-isoform genes.

AyushSemwal commented 1 year ago

Has the code been updated for this issue?

baraaorabi commented 1 year ago

Should there be any downstream issues if these lines were changed?

https://github.com/WGLab/LIQA/blob/8e098567a0d0d0d9e9318cf80e044441b51bd93a/liqa_src/PreProcess_gtf.pl#L133

https://github.com/WGLab/LIQA/blob/8e098567a0d0d0d9e9318cf80e044441b51bd93a/liqa_src/PreProcess.pl#L81

huyustats commented 1 year ago

Hi, thanks! As long as single-isoform genes are included in the refgene file, the change won't affect downstream analysis. Thanks!

baraaorabi commented 1 year ago

But will it produce stats for these single-isoform genes? I want to use LIQA, but I want to get counts for single-isoform genes as well.

huyustats commented 1 year ago

Yes, it will produce stats for single-isoform genes when you set size to 0 instead of 1. Thanks!

baraaorabi commented 1 year ago

@huyustats I created a PR (https://github.com/WGLab/LIQA/pull/29) that adds the minimum number of isoforms per gene as a command-line parameter. I tested it, and when set to 2 it produces the same results as before (expected behaviour). Do you mind checking it?
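For anyone skimming the thread, the idea is roughly the following. This is a simplified Python sketch of the filtering logic, not the code in the PR: the flag name `--min-isoforms` and the helper functions are hypothetical, and the gene and transcript IDs are made-up placeholders.

```python
import argparse
from collections import defaultdict


def group_isoforms(records):
    """Group (gene_id, transcript_id) pairs into {gene_id: set of transcript_ids}."""
    isoforms = defaultdict(set)
    for gene_id, transcript_id in records:
        isoforms[gene_id].add(transcript_id)
    return isoforms


def filter_genes(isoforms, min_isoforms):
    """Keep only genes that have at least `min_isoforms` distinct isoforms."""
    return {
        gene: sorted(transcripts)
        for gene, transcripts in isoforms.items()
        if len(transcripts) >= min_isoforms
    }


def build_parser():
    parser = argparse.ArgumentParser(description="Isoform-count filter sketch")
    # Hypothetical flag; default 2 mimics the old behaviour of dropping
    # single-isoform genes, 1 keeps them.
    parser.add_argument("--min-isoforms", type=int, default=2)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Placeholder records standing in for parsed annotation rows.
    records = [("geneA", "tx1"), ("geneA", "tx2"), ("geneB", "tx3")]
    kept = filter_genes(group_isoforms(records), args.min_isoforms)
    for gene, transcripts in sorted(kept.items()):
        print(gene, ",".join(transcripts), sep="\t")
```

With the default of 2 only multi-isoform genes survive the filter, which matches the previous output; passing 1 keeps single-isoform genes as well.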

huyustats commented 1 year ago

Thank you! I will check it.