mortazavilab / lapa

Alternative polyadenylation detection from diverse data sources such as 3'-seq, long reads, and short reads.
https://www.biorxiv.org/content/10.1101/2022.11.08.515683v1

Refining transcript 3' and 5' ends with LAPA #18

Open MustafaElshani opened 1 year ago

MustafaElshani commented 1 year ago

Hi @MuhammedHasan

I see that you have 'lapa.correction.Transcript'. How would I use it?

To my understanding, LAPA works on 'gene_id' by default. How would I change the analysis so that it looks at 'transcript_id' instead?

As always, thank you. Mustafa

MuhammedHasan commented 1 year ago

Hi @MustafaElshani

LAPA can create a GTF file with novel TES and TSS when integrated with TALON.

Refer to the https://github.com/mortazavilab/talon repo first and create a GTF from your long reads.

Then you can refine the 5' and 3' ends of your transcripts with LAPA: https://lapa.readthedocs.io/en/latest/cli.html#lapa-correct-talon

MustafaElshani commented 1 year ago

Hi @MuhammedHasan

I was planning to go through TALON. So, if I understand correctly, LAPA as is, without TALON, does not do APA site usage analysis at the transcript level, only at the gene level?

Mustafa

MustafaElshani commented 6 months ago

Hi, I have now run TALON successfully on the data, generating the talon.gtf, and then I ran the following:

!lapa_correct_talon --links ./C_tss_tes_linked_ALL.csv \
                    --read_annot TALON/C_talon_read_annot.tsv \
                    --gtf_input TALON/C_talon.gtf \
                    --abundance_input TALON/C_talon_abundance_filtered.tsv \
                    --gtf_output C_talon_corrected.gtf \
                    --abundance_output C_talon_abundance_corrected.tsv

This generates the corrected abundance file and the corrected GTF. From this point, how would you go about analysing the APA?
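One possible starting point, offered only as a sketch and not as LAPA's own method, is to read the corrected GTF and list the distinct transcript 3' ends per gene. Genes with more than one distinct end are candidates for alternative polyadenylation. The function names below are hypothetical helpers, and the sketch assumes standard tab-separated GTF rows with "transcript" features that carry gene_id and transcript_id attributes.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def transcript_ends_by_gene(gtf_lines):
    """Return {gene_id: {transcript_id: 3' end coordinate}} from GTF lines.

    Hypothetical helper: only rows whose feature column is "transcript"
    are used; comments, blank lines, and other features are skipped.
    """
    ends = defaultdict(dict)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        start, end, strand = int(fields[3]), int(fields[4]), fields[6]
        attrs = dict(ATTR_RE.findall(fields[8]))
        # The 3' end is the end coordinate on "+" and the start on "-".
        tes = end if strand == "+" else start
        ends[attrs["gene_id"]][attrs["transcript_id"]] = tes
    return ends


def genes_with_multiple_ends(ends):
    """Keep genes whose transcripts have more than one distinct 3' end."""
    result = {}
    for gene_id, per_transcript in ends.items():
        distinct = sorted(set(per_transcript.values()))
        if len(distinct) > 1:
            result[gene_id] = distinct
    return result


if __name__ == "__main__":
    # File name taken from the --gtf_output argument in the command above.
    with open("C_talon_corrected.gtf") as handle:
        ends = transcript_ends_by_gene(handle)
    for gene_id, sites in genes_with_multiple_ends(ends).items():
        print(gene_id, sites)
```

Joining these per-transcript ends with the counts in C_talon_abundance_corrected.tsv would then give a usage estimate per 3' end. That join is left out here because it depends on the column layout of the abundance file.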