Handle with multiple annotations

The Smith-Waterman step returns a list of the best-matching genes for V, D, and J, and in practice the parameters are set so that it returns every remotely plausible match. The HMM is then told to choose from among the n best S-W V matches, m D matches, and k J matches (set with --n-max-per-region n:m:k; the default is 3:5:2). The HMM's best match is written to the output csv, along with a column for each region listing the per-gene support for the other matches as a decimal number between 0 and 1.
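To make the n:m:k selection concrete, here is a minimal sketch of how such a limit could be applied to a ranked list of matches. This is not partis's actual code: the function names, the data layout, and the scores are all made up for illustration, and only the n:m:k format and the 3:5:2 default come from the description above.

```python
def parse_n_max_per_region(spec):
    """Parse an 'n:m:k' string into per-region limits (hypothetical helper)."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError("expected three colon-separated integers, got %r" % spec)
    n_v, n_d, n_j = (int(p) for p in parts)
    return {"v": n_v, "d": n_d, "j": n_j}


def top_matches(sw_matches, limits):
    """Keep the highest-scoring genes per region, up to that region's limit.

    sw_matches maps region -> list of (gene, score) pairs; this layout is an
    assumption for the example, not the format partis uses internally.
    """
    kept = {}
    for region, limit in limits.items():
        ranked = sorted(sw_matches.get(region, []), key=lambda m: m[1], reverse=True)
        kept[region] = [gene for gene, _score in ranked[:limit]]
    return kept


# Made-up Smith-Waterman scores, for illustration only.
sw_matches = {
    "v": [("IGHV1-2*02", 310), ("IGHV1-2*04", 305), ("IGHV1-3*01", 280), ("IGHV1-8*01", 250)],
    "d": [("IGHD3-10*01", 40), ("IGHD3-16*01", 35)],
    "j": [("IGHJ4*02", 90), ("IGHJ4*01", 88), ("IGHJ5*02", 70)],
}

limits = parse_n_max_per_region("3:5:2")
candidates = top_matches(sw_matches, limits)
```

With these inputs, the fourth V match and the third J match are dropped, while both D matches survive because there are fewer than five of them.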

I decided that having multiple gene names for a single sequence was, at least in the context of sequence analysis, an assault against all that is Right and Good in this world, so, yes, I removed one at random from the default germline set in data/germlines/. If you want to make a different choice, you can either modify data/germlines/ or specify a different germline set with --initial-germline-dir.

psathyrella / partis
