Open psathyrella opened 8 years ago
This is the much-debated issue of "false positive" control under multiple testing.
Here's an idea, if we want to be quite conservative: calculate the probability of a near collision using the inferred parameters, then adjust how aggressively we cluster to target a given false-positive rate.
Most notably, this means looking into varying the naive hfrac and logprob thresholds with sample size.
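A rough sketch of how the sample-size dependence could work, not anything that exists in the code. It assumes near collisions between unrelated pairs are independent, that we can get a list of naive hfrac values for pairs known to be unrelated (e.g. from simulation with the inferred parameters), and that pairs with hfrac below the threshold are the ones that get merged. The function names and the `unrelated_hfracs` input are hypothetical.

```python
import math


def n_pairs(n_seqs):
    """Number of distinct sequence pairs in a sample of n_seqs."""
    return n_seqs * (n_seqs - 1) // 2


def per_pair_rate(target_rate, n_seqs):
    """Per-pair near-collision probability such that the chance of at least
    one false merge in the whole sample is target_rate.
    Assumption: pairs collide independently, so we solve
    1 - (1 - p)^n_pairs = target_rate for p."""
    pairs = n_pairs(n_seqs)
    if pairs == 0:
        return 1.0
    # expm1/log1p keep this accurate when p is tiny
    return -math.expm1(math.log1p(-target_rate) / pairs)


def hfrac_threshold(unrelated_hfracs, target_rate, n_seqs):
    """Pick a naive hfrac threshold from an empirical null distribution.
    Assumption: pairs with hfrac strictly below the threshold get merged,
    so we allow at most per_pair_rate of the unrelated pairs below it."""
    rate = per_pair_rate(target_rate, n_seqs)
    ordered = sorted(unrelated_hfracs)
    n_allowed = int(math.floor(rate * len(ordered)))
    # at most n_allowed values are strictly below ordered[n_allowed]
    return ordered[min(n_allowed, len(ordered) - 1)]
```

The point is just that the allowed per-pair rate shrinks roughly like 1/N^2, so the threshold has to tighten as the sample grows. The same quantile trick would apply to a logprob ratio threshold, with the comparison direction flipped as needed.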