We'll need to be able to handle different germline genes, of course, and for that I think that we should take the path of #12 and allow specification of germline genes with a file.
Is there anything else we should be thinking about? K, how do you anticipate dealing with SHM? Even though these are naive sorts, we're sure to get some leak-through. Shall we just throw out sequences with clear mutations in the V- and J-encoded sections?
We can definitely toss the ones with mutations deep inside V/J, but in the junction region things are a little trickier because of annotation uncertainty. Let me get back to this once I get a better look at the data.
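
In the meantime, here is a rough sketch of what the "deep inside V/J" filter could look like. It assumes each read's V and J segments have already been aligned to their germline genes as equal-length strings. The function names, the 10-base margin next to the junction, and the zero-mismatch threshold are all placeholders for illustration, not anything that exists in the codebase.

```python
def count_deep_mismatches(query, germline, junction_side, margin=10):
    """Count mismatches between an aligned segment and its germline gene,
    ignoring `margin` positions adjacent to the junction.

    junction_side is "end" for V (junction at the 3' end of the gene)
    and "start" for J (junction at the 5' end of the gene).
    """
    if len(query) != len(germline):
        raise ValueError("query and germline must be aligned to equal length")
    n = len(query)
    if junction_side == "end":
        start, end = 0, max(n - margin, 0)
    elif junction_side == "start":
        start, end = min(margin, n), n
    else:
        raise ValueError("junction_side must be 'start' or 'end'")
    # Ambiguous base calls (N) in the read are not counted as mutations.
    return sum(
        1
        for q, g in zip(query[start:end], germline[start:end])
        if q != g and q != "N"
    )


def looks_naive(v_query, v_germline, j_query, j_germline,
                margin=10, max_mismatches=0):
    """Keep a read only if V and J are (near-)germline away from the junction."""
    total = (
        count_deep_mismatches(v_query, v_germline, "end", margin)
        + count_deep_mismatches(j_query, j_germline, "start", margin)
    )
    return total <= max_mismatches
```

Anything within `margin` of the junction is simply not scored here, which sidesteps the annotation uncertainty rather than resolving it. The right margin is something to pick after looking at the data.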
@krdav is prepping some naive BCR data.