jradrion / TEFLoN

TEFLoN uses paired-end Illumina sequence data to discover and genotype transposable elements present in your samples.

Interpreting the breakpoint coordinates #7

pbasting closed this issue 3 years ago

pbasting commented 3 years ago

Hi @jradrion

I'm currently working on revamping the McClintock pipeline and adding new TE detection methods to it, and I'm interested in integrating TEFLoN. I just have a few questions about interpreting the output. Specifically, I want to make sure I am interpreting the breakpoint coordinates correctly.

My interpretation:

5' Breakpoint Evidence: 84 85 86 87 88 89 90 91 92 -- -- -- -- -- --
3' Breakpoint Evidence: -- -- -- -- -- 89 90 91 92 93 94 95 96 97 98


Ultimately, my goal is to convert the predictions in TEFLoN's genotype file to BED format, with each interval covering the TSD of a non-reference insertion. I want to make sure I am interpreting the positions correctly so that I don't inadvertently make the predictions worse.

Thanks in advance for any help you can provide.

Best,

Preston

jradrion commented 3 years ago

Hi @pbasting,

Sounds good. I haven't spent much time with this code in recent years, but I just took another look through parts of it and I am nearly certain that I am giving you the correct responses.

Are the coordinates 0-based or 1-based?

0-based

Do the breakpoint positions indicate the final position in the reference genome before you see evidence of an insertion, or the position where you see the transition from reference to insertion?

It should be the former: the last mapped position adjacent to the start of soft-clipping.

Am I correct in interpreting a prediction with a 5' breakpoint larger than the 3' breakpoint as a non-reference insertion with a TSD?

Yes, it is consistent with the presence of a TSD (of length 4 in your example). The usual caveat applies that it could also be caused by mapping error.
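
For the BED conversion, a minimal Python sketch under the interpretation above might look like the following. It is not part of TEFLoN: the function name `tsd_to_bed` and the chromosome name `2L` are made up for illustration, and it assumes both breakpoints are 0-based and each marks the last mapped reference position next to the soft-clipped portion of the reads. Predictions whose breakpoints do not overlap are returned as `None`, since how to report those is not settled here.

```python
from typing import Optional, Tuple


def tsd_to_bed(chrom: str, bp5: int, bp3: int) -> Optional[Tuple[str, int, int]]:
    """Return a BED (chrom, start, end) interval for the TSD, or None.

    Assumption: bp5 and bp3 are 0-based, and each is the last mapped
    reference position adjacent to the start of soft-clipping.
    """
    if bp5 < bp3:
        # The breakpoints do not overlap, so no TSD is implied.
        return None
    # The TSD covers bp3..bp5 inclusive. BED starts are 0-based and
    # BED ends are exclusive, hence the +1 on the end coordinate.
    return (chrom, bp3, bp5 + 1)


# The example from this thread: 5' breakpoint at 92, 3' breakpoint at 89.
chrom, start, end = tsd_to_bed("2L", 92, 89)
print(f"{chrom}\t{start}\t{end}")   # BED line for the TSD
print("TSD length:", end - start)   # 4, matching the example above
```

Because BED intervals are half-open, `end - start` gives the TSD length directly, which makes it easy to check against the overlap in the breakpoint evidence.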

I hope this helps. Feel free to ask any other questions and I'll do my best to answer.

Jeff

pbasting commented 3 years ago

Thanks Jeff, that's very helpful! I'll let you know if anything else comes up.