Scott-Devine / MELT-LRA

MELT-LRA: Mobile Element Insertion Site Classifier

Adjust insertion coordinates based on orientation of ME insertion. #10

Closed by jonathancrabtree 1 year ago

jonathancrabtree commented 1 year ago

minimap2/PAV always place the 5' TSD inside the inserted sequence, but whether it's inside or outside should depend on the orientation of the insertion. The issue is that the insertion callers work from consensus sequences, not reads, so they can't tell which copy of the TSD is the "original" and which one came with the insertion.
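
To illustrate why a consensus-based caller can't tell the two TSD copies apart, here is a minimal sketch with made-up sequences. The names `left`, `tsd`, `me`, `right` and the helper `apply_insertion` are hypothetical and are not part of minimap2, PAV, or MELT-LRA. Both placements produce exactly the same alternate allele:

```python
def apply_insertion(ref: str, pos: int, ins: str) -> str:
    """Insert `ins` into `ref` immediately before 0-based offset `pos`."""
    return ref[:pos] + ins + ref[pos:]


# Made-up example sequences (assumptions for illustration only).
left = "ACGTACGT"            # reference flank upstream of the target site
tsd = "GATTACA"              # target site, duplicated by the insertion
right = "TTGGCCAA"           # reference flank downstream of the target site
me = "GGCCGGGCGCGGTGGCTCA"   # stand-in for the mobile element sequence

ref = left + tsd + right

# Placement A: insert before the reference TSD; the insert carries a 5' TSD.
alt_a = apply_insertion(ref, len(left), tsd + me)

# Placement B: insert after the reference TSD; the insert carries a 3' TSD.
alt_b = apply_insertion(ref, len(left) + len(tsd), me + tsd)

# The resulting haplotypes are identical, so the assembled sequence alone
# cannot say which TSD copy is "original".
same_allele = alt_a == alt_b
```

Given only the assembled haplotype, the caller has to pick one placement by convention.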

jonathancrabtree commented 1 year ago

In fact, neither insertion position (before or after the existing TSD sequence) is strictly correct, although it does depend somewhat on the orientation. The double-stranded TSD is denatured as part of the ME insertion process, so in the end half (i.e., one strand) of each TSD is new and half is original. Which half/strand is which presumably depends on the orientation of the insertion, but there's no way to represent this in VCF, or in any other file format that can't represent uneven dsDNA breaks. PAV always places the 5' TSD inside the inserted sequence, regardless of insertion orientation, and MELT-RISC will throw an error if it detects PAV doing otherwise. Closing.
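
For illustration, a check of the kind described above could look like the following sketch. It assumes a 0-based insertion offset and a known TSD length. The function name `tsd_is_5prime_of_insert` and the example sequences are hypothetical, and this is not MELT-RISC's actual code.

```python
def tsd_is_5prime_of_insert(ref: str, pos: int, ins: str, tsd_len: int) -> bool:
    """Return True if the first `tsd_len` bases of the inserted sequence
    repeat the reference bases immediately downstream of the insertion
    point, i.e. the 5' TSD copy sits inside the insert."""
    if tsd_len <= 0 or tsd_len > len(ins):
        return False
    return ins[:tsd_len] == ref[pos:pos + tsd_len]


# Made-up sequences (assumptions for illustration only).
flank5, site, flank3 = "ACGTACGT", "GATTACA", "TTGGCCAA"
element = "GGCCGGGCGCGGTGGCTCA"
reference = flank5 + site + flank3

# Convention described in this issue: the insert starts with the TSD.
ok = tsd_is_5prime_of_insert(reference, len(flank5), site + element, len(site))

# The other placement (TSD at the 3' end of the insert) fails the check.
bad = tsd_is_5prime_of_insert(
    reference, len(flank5) + len(site), element + site, len(site)
)

if not ok:
    raise ValueError("5' TSD not found inside the inserted sequence")
```

In this example `ok` is true and `bad` is false, so only the second call would trigger the error if it were checked in the same way.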