Improve full-repertoire reconstruction on large samples

psathyrella / partis

B- and T-cell receptor sequence annotation, simulation, clonal family and germline inference, and affinity prediction

GNU General Public License v3.0


Improve full-repertoire reconstruction on large samples #165

Open psathyrella opened 8 years ago

psathyrella commented 8 years ago

Most notably, by looking into varying the naive hfrac and logprob thresholds with sample size.

matsen commented 8 years ago

This is the much-debated issue of "false positive" control under multiple testing.

matsen commented 8 years ago

Here's an idea, if we want to be quite conservative: calculate the probability of a near collision between unrelated sequences using the inferred parameters. Then somehow adjust our level of clustering to target a given false-positive rate.
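
To make the idea concrete, here is a rough sketch. It is not partis code. It assumes a toy model in which two unrelated naive sequences of length `seq_len` mismatch independently at each position with probability `mismatch_prob`; that number is a hypothetical stand-in for whatever the inferred parameters would actually give. All function names are made up for illustration. Under that assumption the Hamming distance between an unrelated pair is binomial, so we can compute the near-collision probability for a given threshold and then pick the largest Hamming fraction whose per-sequence false-positive rate stays under a target for a sample of `n_seqs` sequences.

```python
import math


def near_collision_prob(seq_len, mismatch_prob, max_mismatches):
    """P(two unrelated sequences differ at <= max_mismatches positions).

    Toy assumption: positions mismatch independently with the same
    probability, so the Hamming distance is Binomial(seq_len, mismatch_prob).
    """
    return sum(
        math.comb(seq_len, k)
        * mismatch_prob ** k
        * (1.0 - mismatch_prob) ** (seq_len - k)
        for k in range(max_mismatches + 1)
    )


def false_positive_rate(n_seqs, pair_prob):
    """P(a given sequence has >= 1 unrelated sequence within the threshold)."""
    if pair_prob >= 1.0:
        return 1.0
    # 1 - (1 - p)^(n - 1), written with log1p/expm1 so tiny p does not underflow
    return -math.expm1((n_seqs - 1) * math.log1p(-pair_prob))


def pick_hfrac(n_seqs, seq_len, mismatch_prob, target_fpr):
    """Largest Hamming fraction whose false-positive rate is <= target_fpr.

    Returns None if even requiring identical sequences exceeds the target.
    """
    best = None
    for max_mismatches in range(seq_len + 1):
        p = near_collision_prob(seq_len, mismatch_prob, max_mismatches)
        if false_positive_rate(n_seqs, p) > target_fpr:
            break  # the rate only grows as the threshold loosens
        best = max_mismatches / seq_len
    return best


if __name__ == "__main__":
    # hypothetical numbers, only to show how the threshold moves with sample size
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, pick_hfrac(n, seq_len=60, mismatch_prob=0.5, target_fpr=0.01))
```

The point of the sketch is only the qualitative behavior: since the number of unrelated neighbors grows with the sample, the threshold has to tighten as `n_seqs` goes up to hold the same false-positive rate. A real version would need a collision model built from the inferred germline and junction parameters rather than a single per-position mismatch probability, and the same logic would presumably apply to the logprob threshold.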