Help understanding tama_remove_fragment_models.py discard result

Hi Richard,

I have a test BED file with the following two transcripts, which differ only in the size of the last exon (PB.6480.7 is 702 bp longer than PB.6480.5):

Chr1 113145948 113213177 PB.6480;PB.6480.7 40 + 113145948 113145948 255,0,0 7 368,1014,161,107,260,167,1066 0,53232,56316,59800,63135,65470,66163
Chr1 113145948 113212475 PB.6480;PB.6480.5 40 + 113145948 113145948 255,0,0 7 368,1014,161,107,260,167,364 0,53232,56316,59800,63135,65470,66163
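
As a sanity check on the coordinates, here is a minimal Python sketch that compares the two models. It is plain BED12 parsing, not TAMA's code, and `parse_bed12` and `splice_junctions` are hypothetical helper names chosen for illustration:

```python
# Illustrative only: a plain BED12 comparison, not TAMA's internal logic.
BED_LINES = [
    "Chr1 113145948 113213177 PB.6480;PB.6480.7 40 + 113145948 113145948 255,0,0 7 "
    "368,1014,161,107,260,167,1066 0,53232,56316,59800,63135,65470,66163",
    "Chr1 113145948 113212475 PB.6480;PB.6480.5 40 + 113145948 113145948 255,0,0 7 "
    "368,1014,161,107,260,167,364 0,53232,56316,59800,63135,65470,66163",
]


def parse_bed12(line):
    """Return (name, exons), where exons is a list of (start, end) genomic intervals."""
    fields = line.split()
    chrom_start = int(fields[1])
    sizes = [int(x) for x in fields[10].strip(",").split(",")]
    starts = [int(x) for x in fields[11].strip(",").split(",")]
    exons = [(chrom_start + s, chrom_start + s + n) for s, n in zip(starts, sizes)]
    return fields[3], exons


def splice_junctions(exons):
    """Pair the end of each exon with the start of the next one."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


(name_long, exons_long), (name_short, exons_short) = map(parse_bed12, BED_LINES)

same_junctions = splice_junctions(exons_long) == splice_junctions(exons_short)
# Both models are on the + strand, so the start is the 5' end and the end is the 3' end.
start_diff = exons_long[0][0] - exons_short[0][0]
end_diff = exons_long[-1][1] - exons_short[-1][1]

print(name_long, "vs", name_short)
print("identical splice junctions:", same_junctions)
print("difference at transcript start (bp):", start_diff)
print("difference at transcript end (bp):", end_diff)
```

On these two records it reports identical splice junctions, a 0 bp difference at the transcript start and a 702 bp difference at the transcript end.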

When I run tama_remove_fragment_models.py with default parameters, transcript PB.6480.5 is discarded. Could you please explain why? If I understand correctly, tama_remove_fragment_models.py should remove fragment models that differ from the longer model on both the 5' and 3' ends up to a certain length threshold. By default, the exon ends threshold/splice junction threshold is 10 bp and the trans ends wobble threshold is 500 bp. The two transcripts in my BED file differ only at one end (the 3' end), and the 702 bp difference exceeds the 500 bp threshold, so shouldn't both be kept? I have also tried lowering the thresholds, but the result is the same. Could you clarify why this happens?

Many thanks! Iulia Darolti
