jeromekelleher / sc2ts-paper

Identify recombinants which we are reasonably certain of, which have not previously been detected #213

Open hyanwong opened 1 week ago

hyanwong commented 1 week ago

With the new system, we are producing fewer recombinants, each of which is hopefully more trustworthy. We should be able to compare the HMM likelihood with the equivalent value when no recombination is allowed (mismatch -> infinity). We can express this as a likelihood ratio (LR) and pick out the recombination nodes that have a high LR but no Pango-X designation: these are the recombinants of which we are reasonably certain but which have not previously been identified.

jeromekelleher commented 1 week ago

Good idea. We don't actually get a likelihood out of the HMM, though, just a path. We can compute the likelihood/probability post hoc, as mu^{num_mismatches} etc.
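
A rough sketch of what that post hoc calculation might look like, working in log space. This is only an illustration: the function names and the `mu` and `rho` values are made up here and are not taken from sc2ts. It also assumes the "etc." means a per-switch recombination probability `rho`, and it ignores the (1 - mu) and (1 - rho) factors for sites without a mismatch or switch.

```python
import math


def path_log_likelihood(num_mismatches, num_switches, mu, rho):
    """Approximate log-probability of a copying path.

    Assumes the simplified form mu^num_mismatches * rho^num_switches,
    i.e. the (1 - mu) and (1 - rho) terms are dropped.
    """
    return num_mismatches * math.log(mu) + num_switches * math.log(rho)


def log_likelihood_ratio(recomb_mismatches, recomb_switches,
                         norecomb_mismatches, mu, rho):
    """Log LR of the recombinant path against the best no-recombination path.

    The no-recombination path has zero switches by definition, so only
    its mismatch count is needed.
    """
    with_recomb = path_log_likelihood(recomb_mismatches, recomb_switches, mu, rho)
    without_recomb = path_log_likelihood(norecomb_mismatches, 0, mu, rho)
    return with_recomb - without_recomb


# Hypothetical example: one switch and one mismatch with recombination,
# versus six mismatches when recombination is disallowed.
# mu and rho are placeholder values, not the ones used in the paper.
example_log_lr = log_likelihood_ratio(1, 1, 6, mu=0.01, rho=0.001)
```

A positive log LR means the recombinant path is favoured. Ranking recombination nodes by this value would then only need the mismatch and switch counts from the two HMM runs.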

hyanwong commented 1 week ago

Great: any way we can get an LR would be fine. I presume that the main way we could be misled is by time travellers. So it may be worth plotting the LR on the X axis against some measure correlated with our suspicion of time travel on the Y axis. I'm not sure what that measure would be, however.