Clarification splitting large SAM file to run collapse and then merge

GenomeRIK / tama

Transcriptome Annotation by Modular Algorithms (for long read RNA sequencing data)

GNU General Public License v3.0


Clarification splitting large SAM file to run collapse and then merge #62

Closed: J-Calvelo closed this issue 2 years ago

J-Calvelo commented 2 years ago

Hello. This is probably basic, but from the following line on the wiki I'm not sure whether I should set -a to 0 or run TAMA Merge with default parameters:

You can merge the resulting bed12 files using TAMA Merge (**with no wobble**) and the transcript and gene models will be identical to running TAMA Collapse on the whole SAM file.

Thanks

GenomeRIK commented 2 years ago

Hello,

Could you give me more details about what you are trying to do?

Thank you, Richard

J-Calvelo commented 2 years ago

I'm trying to annotate a genome using TAMA and PacBio Iso-Seq data. After selecting the full-length isoforms I'm left with 1708667 of them, so I need to split them to save memory and then merge the results again.
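For readers following along, here is a minimal sketch of the splitting step described above. It assumes the reads are already mapped and stored as a plain-text SAM file, and that splitting by reference sequence (chromosome or contig) is acceptable because it keeps all reads of a locus in the same chunk. The function name `split_sam_by_reference` and the output layout are hypothetical and are not part of TAMA.

```python
import os


def split_sam_by_reference(sam_path, out_dir):
    """Write one SAM file per reference sequence; return {reference: path}.

    Assumes a modest number of references, since one file handle stays
    open per reference until the input has been read completely.
    """
    os.makedirs(out_dir, exist_ok=True)
    header_lines = []
    handles = {}
    paths = {}
    try:
        with open(sam_path) as sam:
            for line in sam:
                if line.startswith("@"):
                    # Header lines come first; each chunk gets a full copy.
                    header_lines.append(line)
                    continue
                fields = line.split("\t")
                if len(fields) < 3:
                    continue  # skip malformed or empty lines
                ref = fields[2]  # RNAME column of the SAM record
                if ref == "*":
                    continue  # unmapped read, nothing to collapse
                if ref not in handles:
                    safe_name = ref.replace("/", "_")
                    paths[ref] = os.path.join(out_dir, safe_name + ".sam")
                    handles[ref] = open(paths[ref], "w")
                    handles[ref].writelines(header_lines)
                handles[ref].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return paths
```

Each resulting file could then be passed to TAMA Collapse on its own, which is the memory-saving step the wiki line refers to.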

GenomeRIK commented 2 years ago

After which step are you getting 1708667 isoforms?

J-Calvelo commented 2 years ago

After running isoseq3 refine

GenomeRIK commented 2 years ago

What are you trying to run after refine?

lscdanson commented 4 months ago

Hi, I would also like to seek clarification on what you mean by "no wobble". I've split my BAM file for the TAMA Collapse step and am now trying to merge the split files back together using TAMA Merge. Does that mean I should set all three of the -a, -m, and -z options to 0 instead of using the default values?

Many thanks.
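One reading of "no wobble" is that all three thresholds named in the question above are set to 0. The thread does not confirm this, so the sketch below only shows what such a call would look like. The `-a`, `-m`, and `-z` options come from the question itself; the script name `tama_merge.py` and the `-f` (file list) and `-p` (output prefix) options are assumptions recalled from the TAMA Merge usage and should be checked against the tool's own help output. The helper `build_merge_command` is hypothetical.

```python
import shlex


def build_merge_command(filelist, prefix, wobble=0):
    """Return the argument list for a TAMA Merge call with equal thresholds.

    -a, -m and -z are the options named in the thread; -f and -p are
    assumed option names and should be verified before use.
    """
    threshold = str(wobble)
    return [
        "python", "tama_merge.py",
        "-f", filelist,   # assumed: list of bed12 files to merge
        "-p", prefix,     # assumed: prefix for the output files
        "-a", threshold,
        "-m", threshold,
        "-z", threshold,
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(shlex.join(build_merge_command("filelist.txt", "merged")))
```

Printing the command rather than executing it makes it easy to compare against the documented usage before committing to a long run.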

lscdanson commented 4 months ago

Also, if I'm simply merging the split files back together (all no_cap), should I set the "merge_priority" in the filelist.txt to "1,1,1" or "2,1,1"?
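To make the question concrete, here is a minimal sketch that writes a filelist.txt for the per-chunk bed12 files. It assumes a four-column, tab-separated layout (bed file, cap flag, merge priority, source name), which is recalled from the TAMA documentation rather than stated in this thread. The priority value is a placeholder parameter: the thread does not settle whether "1,1,1" or "2,1,1" is correct. The function name `write_filelist` and the `chunk1`, `chunk2` source names are hypothetical.

```python
def write_filelist(bed_paths, out_path, cap_flag="no_cap", priority="1,1,1"):
    """Write one tab-separated row per bed12 file and return the rows.

    Assumed column order: file, cap flag, merge priority, source name.
    Every chunk gets the same cap flag and priority, since the chunks
    are pieces of a single data set.
    """
    rows = []
    for index, bed in enumerate(bed_paths, start=1):
        rows.append([bed, cap_flag, priority, "chunk" + str(index)])
    with open(out_path, "w") as handle:
        for row in rows:
            handle.write("\t".join(row) + "\n")
    return rows
```

Keeping the priority as a single argument means the whole file can be regenerated with the other value once the intended setting is confirmed.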