GenomeRIK / tama

Transcriptome Annotation by Modular Algorithms (for long read RNA sequencing data)
GNU General Public License v3.0

What kind of fasta should be used for sam file? #7

linglingtingfei closed this issue 5 years ago

linglingtingfei commented 5 years ago

Hi @GenomeRIK, I want to try TAMA to find APA (alternative polyadenylation). As you know, the poly-A tails have been removed from FLNC reads. But I'm confused about which fasta file should be used to produce the sam file: CCS.fasta, FLNC.fasta, or the high-quality fasta. I've already tried other software, such as TAPIS and IDP-APA, but the results do not seem good. I hope you can reply soon; I'd greatly appreciate it.

GenomeRIK commented 5 years ago

Hi linglingtingfei,

Are you asking which fasta file to use for running a mapper like GMAP or Minimap2? If so, it depends a bit on what you are interested in. You could map the FLNC.fasta file or you could map the high-quality fasta file. I suggest trying the FLNC fasta first, as the high-quality fasta file contains clustered reads, which means some of the APA information could be lost during clustering.
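To make the clustering point concrete, here is a toy Python sketch of how collapsing reads into one consensus per cluster can hide alternative 3' ends. The read tuples, cluster IDs, positions, and the "keep the most common end" rule are all made up for illustration; this is not how the actual clustering step works internally.

```python
from collections import defaultdict

# Hypothetical FLNC reads: (read_id, cluster_id, three_prime_end_position).
# All values are invented for illustration only.
flnc_reads = [
    ("read1", "clusterA", 10500),
    ("read2", "clusterA", 10500),
    ("read3", "clusterA", 10920),  # a rarer, alternative poly-A site
    ("read4", "clusterB", 20310),
]


def ends_per_read(reads):
    """Distinct 3' end positions when every read is kept."""
    return sorted({end for _, _, end in reads})


def ends_after_clustering(reads):
    """Distinct 3' end positions when each cluster yields one consensus.

    Toy rule (an assumption): the consensus keeps the most common end.
    """
    by_cluster = defaultdict(list)
    for _, cluster_id, end in reads:
        by_cluster[cluster_id].append(end)
    consensus_ends = {
        cluster_id: max(set(ends), key=ends.count)
        for cluster_id, ends in by_cluster.items()
    }
    return sorted(set(consensus_ends.values()))


print(ends_per_read(flnc_reads))
print(ends_after_clustering(flnc_reads))
```

In this toy data the per-read view keeps three distinct 3' ends, while the per-cluster view keeps only two, because the rarer end in clusterA is absorbed into the consensus.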

Cheers, Richard

GenomeRIK commented 5 years ago

Hi linglingtingfei,

Also, in case that is not what you were asking, could you please clarify?

Thank you, Richard

linglingtingfei commented 5 years ago

Hi @GenomeRIK, thank you for your quick reply. I'm interested in finding APA using TAMA. As the TAMA manual says, a sorted sam file is required, so I first need to map the long-read fasta file to the genome to generate the sam file. For this step, I'm wondering which fasta file should be used (FLNC, CCS, or high-quality reads). Following your suggestion, I will try the FLNC reads.

GenomeRIK commented 5 years ago

Hi linglingtingfei,

Yes, you can use FLNC or high-quality reads for mapping. Just don't use the CCS reads, as they still have adapters and poly-A tails, which mess up the mapping. Mapping doesn't take much time, so you can easily try both FLNC and high-quality mapping. Then just run TAMA Collapse on the sorted sam file and make sure to supply the genome fasta for the Collapse run.
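As a rough sketch of the steps above (map, sort, collapse), the Python snippet below assembles the three command lines and only prints them unless told to run. The file names, the output prefix, the choice of Minimap2 with the `splice` preset, and the `-x no_cap` setting for TAMA Collapse are assumptions for illustration; please check each tool's own documentation for the options that fit your data.

```python
import shlex
import subprocess


def build_commands(genome_fasta, reads_fasta, prefix,
                   tama_collapse="tama_collapse.py"):
    """Build the map -> sort -> collapse commands as argument lists.

    All paths and the prefix are hypothetical placeholders.
    """
    raw_sam = f"{prefix}.sam"
    sorted_sam = f"{prefix}.sorted.sam"
    return [
        # Spliced alignment of FLNC (or high-quality) reads to the genome.
        ["minimap2", "-ax", "splice", "-uf", "--secondary=no",
         "-o", raw_sam, genome_fasta, reads_fasta],
        # TAMA Collapse expects a sorted sam file.
        ["samtools", "sort", "-O", "sam", "-o", sorted_sam, raw_sam],
        # Collapse, supplying the same genome fasta used for mapping.
        # The capped/no_cap choice here is an assumption.
        ["python", tama_collapse, "-s", sorted_sam, "-f", genome_fasta,
         "-p", prefix, "-x", "no_cap"],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


commands = build_commands("genome.fa", "flnc.fasta", "sample_flnc")
run_pipeline(commands)  # dry run: prints the commands without executing
```

To compare the two inputs, the same function can be called a second time with the high-quality fasta and a different prefix.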

I am going to close this issue now but feel free to open a new one if you have a different question. Or if you have more to ask about this particular issue, you can still add comments but you'll have to find the thread in the closed issues page.

Cheers, Richard