Scott-Devine / MELT-LRA

MELT-LRA: Mobile Element Insertion Site Classifier

Compute percent identity using only the portion of the alignment used in the coverage calculation. #8

Closed: jonathancrabtree closed this issue 1 year ago

jonathancrabtree commented 1 year ago

Only the part of the alignment that's used to determine coverage should be used to determine percent identity. See #9.

jonathancrabtree commented 1 year ago

For example, the first two bases of the following ALU alignment would not be considered when computing the percent identity because they overlap with the TSD and are hence excluded from the coverage calculation:

chr22:25792353  |ALU  |+  |100.0%| 95.1%| 97.5%|100.0%| ATGCTGAGAT [AAGAAGTGGATTGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCA....+224bp....CGACCTGGGCGACAGAGCGAGACTCCGTCTCAAAAAAAAAAAAAA] AAGAAGTGGATTGGTGAGATGCTGAGCGAC
                                                                    ^^^^^TSD^^^^^^                                                                            <---polyA---->  ^^^^^TSD^^^^^^
                                                                                [ALU-----------------------------              ---------------------------ALU>
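
A minimal sketch of the idea, not the MELT-LRA implementation: the function name, the gapped-string alignment representation, and the half-open `region_start`/`region_end` coordinates are all assumptions made for illustration. It computes an ungapped percent identity using only the aligned columns whose query position falls inside the region counted toward coverage, so bases that overlap the TSD are skipped.

```python
def percent_identity_in_region(aligned_query, aligned_ref, region_start, region_end):
    """Ungapped percent identity restricted to a query region.

    aligned_query / aligned_ref: equal-length strings with '-' for gaps
    (hypothetical representation). region_start / region_end: 0-based,
    half-open range of ungapped query positions that count toward coverage.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    qpos = 0  # position in the ungapped query sequence
    for q, r in zip(aligned_query, aligned_ref):
        if q == "-":
            continue  # gap in the query consumes no query position
        in_region = region_start <= qpos < region_end
        qpos += 1
        if not in_region or r == "-":
            continue  # outside the coverage region, or a gap in the reference
        compared += 1
        if q == r:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: the first two query bases mismatch and stand in for
# bases that overlap the TSD.
query = "AAGG-CCTT"
ref = "TTGGACCTT"
whole = percent_identity_in_region(query, ref, 0, 8)
trimmed = percent_identity_in_region(query, ref, 2, 8)
```

In this toy case the identity over the whole alignment is 75.0, and it rises to 100.0 once the first two bases are excluded. The same restriction can lower the value when the excluded bases happen to match.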

jonathancrabtree commented 1 year ago

Probably not high priority, because the current output already shows which alignments have a significant difference between the gapped and ungapped percent identity.
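
For reference, one common way to define the two values is sketched below. This is an assumption for illustration and may not match how MELT-LRA defines its own columns: here the "gapped" identity divides matches by all alignment columns, while the "ungapped" identity divides by only the columns where neither sequence has a gap. The function name is hypothetical.

```python
def gapped_and_ungapped_identity(aligned_a, aligned_b):
    """Return (gapped, ungapped) percent identity for two aligned strings.

    Both inputs are equal-length strings using '-' for gaps
    (hypothetical representation, not MELT-LRA's internal format).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    gap_columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            gap_columns += 1
        elif a == b:
            matches += 1

    total_columns = len(aligned_a)
    ungapped_columns = total_columns - gap_columns
    gapped = 100.0 * matches / total_columns if total_columns else 0.0
    ungapped = 100.0 * matches / ungapped_columns if ungapped_columns else 0.0
    return gapped, ungapped


# A two-base gap lowers the gapped value but not the ungapped one.
gapped, ungapped = gapped_and_ungapped_identity("ACGT--AC", "ACGTTTAC")
```

Under these definitions, a large gap between the two numbers flags an alignment with long gaps relative to its matching columns.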

jonathancrabtree commented 1 year ago

Old "nogaps" percent identity = 90.4%, new = 77% (old "total" = 34%):

[Screenshot: Screen Shot 2023-10-20 at 12 12 51 PM]