psathyrella / partis

B- and T-cell receptor sequence annotation, simulation, clonal family and germline inference, and affinity prediction
GNU General Public License v3.0

Should we handle VDDJ rearrangements? #226

Open psathyrella opened 7 years ago

psathyrella commented 7 years ago

Might not be too hard, and in cases where there's really two Ds it'll of course make a huge difference to annotation accuracy.

scharch commented 7 years ago

If you decide to implement this, the program should be aware of the order of the D segments on the chromosome to cut down on false positives. That is, you could have D1-1 followed by D1-7, but not vice versa. See Figure 5 of Briney et al., Immunology 2012 for more details.
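A minimal sketch of that ordering check, under the assumption that the number after the hyphen in a human IGHD name (the 7 in IGHD1-7) reflects its 5' to 3' position in the locus. The function names here are hypothetical and not part of partis; names that don't fit the pattern (e.g. orphon genes) are rejected rather than guessed at.

```python
import re

def d_locus_position(gene_name):
    """Parse the positional index from a name like 'IGHD1-7*01' (returns 7).

    Assumption: the number after the hyphen gives the 5'-to-3' locus position.
    """
    match = re.match(r'^IGHD\d+-(\d+)', gene_name)
    if match is None:
        raise ValueError('unexpected D gene name: %s' % gene_name)
    return int(match.group(1))

def is_plausible_d_pair(five_prime_d, three_prime_d):
    """True if the 5' D in the rearrangement sits upstream of the 3' D in the locus."""
    return d_locus_position(five_prime_d) < d_locus_position(three_prime_d)

print(is_plausible_d_pair('IGHD1-1*01', 'IGHD1-7*01'))  # D1-1 followed by D1-7
print(is_plausible_d_pair('IGHD1-7*01', 'IGHD1-1*01'))  # the reverse order
```

A candidate double-D annotation that fails this check could be discarded or down-weighted before it gets any further.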

On Tue, Feb 14, 2017 at 2:47 PM, Duncan Ralph notifications@github.com wrote:

Might not be too hard, and in cases where there's really two Ds it'll of course make a huge difference to annotation accuracy.

psathyrella commented 7 years ago

ooh, excellent, thanks for the tip.

I also just realized this should be easy to implement -- we just need smith-waterman to look for them during parameter caching, and if it finds any it just adds the smooshed-together double D to the germline set that gets passed to the hmm.
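Roughly what "adding the smooshed-together double D to the germline set" could look like, as a sketch only. `add_double_ds`, the `dA+dB` naming scheme, and the toy sequences are all made up for illustration and are not partis code or real germline genes. Note that plain concatenation only represents two Ds that sit directly next to each other.

```python
def add_double_ds(d_germlines, observed_pairs):
    """Return a copy of d_germlines with one concatenated entry per (5' D, 3' D) pair.

    d_germlines: dict mapping D gene name to sequence
    observed_pairs: iterable of (five_prime_name, three_prime_name) tuples,
        e.g. pairs that a smith-waterman pass flagged during parameter caching
    """
    augmented = dict(d_germlines)  # leave the input germline set untouched
    for five_prime, three_prime in observed_pairs:
        name = '%s+%s' % (five_prime, three_prime)  # hypothetical naming scheme
        augmented[name] = d_germlines[five_prime] + d_germlines[three_prime]
    return augmented

# toy sequences, not real germline genes
toy_ds = {'dA': 'ACGTACGGTTCA', 'dB': 'TTGACCAGGATC'}
augmented = add_double_ds(toy_ds, [('dA', 'dB')])
print(sorted(augmented))
```

Only adding pairs that were actually observed keeps the germline set from growing with every possible D combination.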

scharch commented 6 years ago

Would that work? There would typically be N-nucleotides inserted between the two D genes...

psathyrella commented 6 years ago

huh, no, not if there are insertions between the Ds. I don't think there's much chance of adding two Ds to the hmm, that'd be crazy complicated. But the smith-waterman (sw) annotations are only a little less accurate than the hmm ones anyway, so we could just look for double Ds there.
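To make the idea of looking for double Ds at the alignment stage concrete, here is a rough sketch that allows inserted bases between the two Ds. It uses exact longest-common-substring matching from Python's `difflib` as a simple stand-in for a real smith-waterman alignment, so it tolerates no mutations. `find_double_d`, the `min_len` cutoff, and the toy sequences are assumptions for illustration, not partis code.

```python
from difflib import SequenceMatcher

def longest_hit(query, germline_seq):
    """Return (start, end) in query of the longest exact substring shared with germline_seq."""
    matcher = SequenceMatcher(None, query, germline_seq, autojunk=False)
    match = matcher.find_longest_match(0, len(query), 0, len(germline_seq))
    return match.a, match.a + match.size

def find_double_d(vj_gap, d_seqs, min_len=8):
    """Look for two non-overlapping D hits in the sequence between V and J.

    Returns (five_prime_d, insertion, three_prime_d), or None if no pair of
    hits each reaches min_len bases (an arbitrary cutoff chosen for this sketch).
    """
    best = None
    for name1, seq1 in d_seqs.items():
        start1, end1 = longest_hit(vj_gap, seq1)
        if end1 - start1 < min_len:
            continue
        rest = vj_gap[end1:]  # only search 3' of the first hit
        for name2, seq2 in d_seqs.items():
            start2, end2 = longest_hit(rest, seq2)
            if end2 - start2 < min_len:
                continue
            score = (end1 - start1) + (end2 - start2)
            if best is None or score > best[0]:
                # rest[:start2] is whatever sits between the two D hits
                best = (score, name1, rest[:start2], name2)
    return None if best is None else best[1:]

# toy sequences, not real germline genes
toy_ds = {'dA': 'ACGTACGGTTCA', 'dB': 'TTGACCAGGATC'}
gap = 'CC' + toy_ds['dA'] + 'GGG' + toy_ds['dB'] + 'A'
print(find_double_d(gap, toy_ds))
```

The bases between the two hits come back as their own field, so N-nucleotides between the Ds don't prevent the pair from being found.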