TAMA Merge Parameters - Githubissues

GenomeRIK / tama

Transcriptome Annotation by Modular Algorithms (for long read RNA sequencing data)

GNU General Public License v3.0

Hi Richard,

Thanks for the suite of tools! I have a question about TAMA Merge parameters. I have carried out nanopore RNA-seq (DCS109 w/ native barcoding) on 18 samples for an immune challenge study. I have 3 conditions (Control, Vibrio, PolyIC) and n=6 biological replicates for each condition. The processing steps I’ve already taken are:

Basecalled with Guppy
Filtered low quality reads (q<10) with nanofilt
Demultiplexed with Guppy
Filtered (supposedly) ‘full-length’ reads with Pychopper
Trimmed polyA tails with CutAdapt
Mapped to genome with Minimap2
Collapsed each sample individually with TAMA_Collapse with the script below:

Now, I would like to merge my samples together to form one large transcriptome that I can take forward and compare to the existing reference. I understand I can use TAMA Merge to do this but I’m unsure about the parameters required. I think I should be using the default TAMA_Merge parameters except I should be using -d merge_dup? Is this correct? Secondly, in my filelist.txt, I think I will use the no_cap flag rather than capped, however, I’m unsure of this as I used no_cap in TAMA_Collapse. Will I lose information by using the no_cap flag again in TAMA_Merge?

My next steps would be to filter out models with low read support (<3), filter out single-exon models with low read support (<50) and then use SQANTI3 to get a comparison between my long-read transcriptome and the existing reference. Does this sound sensible?

Many thanks and sorry for the long question! Oliver

Hi Oliver,

Thank you for using TAMA! And apologies for not responding sooner.

Very cool pipeline you have put together!

The answers to your questions are not exactly simple. It depends a lot on what exactly you want to achieve.

However, in general I would say:

I think I should be using the default TAMA_Merge parameters except I should be using -d merge_dup? Is this correct?

I recommend running TAMA Merge with -d merge_dup -a 100 -x 10 -z 100
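As a rough sketch of how that recommendation could be turned into a full command line, the Python snippet below assembles the argument list. The `-d`, `-a`, `-x` and `-z` values are copied as written above. The `-f` (filelist) and `-p` (output prefix) options, the script name `tama_merge.py`, and the file names are assumptions that are not stated in this thread, so please check them against `tama_merge.py -h` before running anything.

```python
import shlex


def build_tama_merge_command(filelist, prefix, script="tama_merge.py"):
    """Return the argument list for a TAMA Merge run using the values suggested above."""
    return [
        "python", script,
        "-f", filelist,     # assumed option: file listing the BED files to merge
        "-p", prefix,       # assumed option: prefix for the output files
        "-d", "merge_dup",  # values below are copied from the recommendation
        "-a", "100",
        "-x", "10",
        "-z", "100",
    ]


# Hypothetical file names, for illustration only.
cmd = build_tama_merge_command("filelist.txt", "merged_annos")
print(shlex.join(cmd))
```

Building the command as a list makes it straightforward to pass to `subprocess.run(cmd, check=True)` once the options have been verified.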

Secondly, in my filelist.txt, I think I will use the no_cap flag rather than capped, however, I’m unsure of this as I used no_cap in TAMA_Collapse. Will I lose information by using the no_cap flag again in TAMA_Merge?

If you used no_cap in TAMA Collapse then I would use no_cap in TAMA Merge. Your library is likely to have significant degradation, so any isoform-level information that could be identified with the capped mode would not be very certain. Did you happen to run Deg Sig on your data?
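For 18 samples, writing filelist.txt by hand is error-prone, so here is a minimal sketch that generates the lines. It assumes the four-column, tab-separated layout described in the TAMA wiki (BED file, cap flag, merge priority, source name). The `1,1,1` priority, the sample file names, and the choice to use the file stem as the source name are all hypothetical and should be adjusted to the actual data.

```python
from pathlib import Path


def filelist_lines(bed_paths, cap_flag="no_cap", priority="1,1,1"):
    """Build one tab-separated filelist line per BED file.

    Assumed columns: BED path, cap flag, merge priority, source name.
    """
    if cap_flag not in ("capped", "no_cap"):
        raise ValueError(f"unexpected cap flag: {cap_flag}")
    return [
        "\t".join([str(bed), cap_flag, priority, Path(bed).stem])
        for bed in bed_paths
    ]


# Hypothetical file names: 3 conditions x 6 replicates = 18 samples.
beds = [
    f"{condition}_{replicate}.bed"
    for condition in ("control", "vibrio", "polyic")
    for replicate in range(1, 7)
]
lines = filelist_lines(beds)
print("\n".join(lines))
```

Giving every sample the same priority treats all 18 collapsed annotations equally, which seems reasonable when they come from the same protocol.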

My next steps would be to filter out models with low read support (<3), filter out single-exon models with low read support (<50) and then use SQANTI3 to get a comparison between my long-read transcriptome and the existing reference. Does this sound sensible?

This is really up to you. It is a bit more conservative than what I usually do (50 reads is a lot for nanopore), but that's fine if you want more stringency. As for using SQANTI3, I think that's a good idea, although admittedly I have not been keeping up with current SQANTI3 features. You could also try using TAMA Merge to compare with the reference, but that is more hands-on for downstream analysis, whereas SQANTI3 does a great job of automatically running a lot of downstream steps.
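To make the proposed thresholds concrete, here is a small sketch of the filter logic: drop any model with fewer than 3 supporting reads, and drop single-exon models with fewer than 50. It assumes BED12 input, where column 10 is the block (exon) count, and a hypothetical `read_counts` dictionary keyed by the BED name field. How those counts are obtained is not covered in this thread, and the records below are toy data.

```python
def passes_read_support(bed12_line, read_counts, min_reads=3, min_reads_single_exon=50):
    """Return True if a BED12 transcript model meets its read-support threshold."""
    fields = bed12_line.rstrip("\n").split("\t")
    name = fields[3]             # BED column 4: model name
    exon_count = int(fields[9])  # BED12 column 10: blockCount
    support = read_counts.get(name, 0)  # models with no recorded support count as 0
    threshold = min_reads_single_exon if exon_count == 1 else min_reads
    return support >= threshold


# Toy records and counts, for illustration only.
bed_lines = [
    "chr1\t100\t500\tG1;G1.1\t40\t+\t100\t500\t255,0,0\t2\t100,100\t0,300",
    "chr1\t100\t500\tG1;G1.2\t40\t+\t100\t500\t255,0,0\t2\t100,150\t0,250",
    "chr2\t1000\t1800\tG2;G2.1\t40\t-\t1000\t1800\t255,0,0\t1\t800\t0",
    "chr3\t50\t950\tG3;G3.1\t40\t+\t50\t950\t255,0,0\t1\t900\t0",
]
read_counts = {"G1;G1.1": 5, "G1;G1.2": 2, "G2;G2.1": 12, "G3;G3.1": 75}

kept = [
    line.split("\t")[3]
    for line in bed_lines
    if passes_read_support(line, read_counts)
]
print(kept)  # ['G1;G1.1', 'G3;G3.1']
```

Lowering `min_reads_single_exon` is the obvious knob to turn if 50 proves too strict for nanopore depth.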

Hope that helps! Richard

TAMA Merge Parameters #93