jbloomlab / polyclonal

Model mutational escape from polyclonal antibodies.

change default value of `collapse_identical_variants` #53

Closed jbloom closed 2 years ago

jbloom commented 2 years ago

The default value of `collapse_identical_variants` is `"mean"`. That sort of makes sense when fitting just a single model: it pre-averages all variants with the same mutations and weights each average by its number of observations, which reduces the data set size a bit and makes fitting faster (maybe by about 30% on a plausible simulated data set).
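As an illustration of what pre-averaging means here, the following is a minimal sketch in plain Python. It is not the `polyclonal` implementation; the variant names, the values, and the `collapsed` structure are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw measurements: (variant, measured value).
# Variant names and values are made up for illustration only.
measurements = [
    ("wildtype", 0.0),
    ("wildtype", 0.2),
    ("A1C", 0.5),
    ("A1C", 0.7),
    ("A1C", 0.9),
    ("G2T", 0.8),
]

# Group the values by variant.
grouped = defaultdict(list)
for variant, value in measurements:
    grouped[variant].append(value)

# Collapse each group to (mean value, number of observations);
# the count would serve as that row's weight when fitting.
collapsed = {
    variant: (sum(values) / len(values), len(values))
    for variant, values in grouped.items()
}

for variant, (mean_value, n_obs) in collapsed.items():
    print(variant, round(mean_value, 3), n_obs)
```

Six measurements become three weighted rows, which is where the reduction in data set size comes from.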

However, this sort of pre-averaging no longer makes sense in conjunction with bootstrapping, because we want to bootstrap the true experimental measurements, not the values obtained after averaging variants together. For instance, if we have 1000 measurements of wildtype, right now the bootstrap either keeps all 1000 or drops all 1000, whereas with `collapse_identical_variants=False` it resamples those measurements individually.
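The wildtype example can be sketched with a toy bootstrap in plain Python. This is only an illustration under stated assumptions, not the `polyclonal` bootstrapping code: the helper `wildtype_measurements`, the variant names, and the row layout `(variant, n_obs)` are hypothetical.

```python
import random

def wildtype_measurements(rows, rng):
    """Resample rows with replacement and return how many wildtype
    measurements the bootstrap sample represents.

    Each row is (variant, n_obs), where n_obs is the number of raw
    measurements that row stands for.
    """
    sample = rng.choices(rows, k=len(rows))
    return sum(n_obs for variant, n_obs in sample if variant == "wildtype")

rng = random.Random(0)

# Uncollapsed: one row per raw measurement (hypothetical data).
uncollapsed = [("wildtype", 1)] * 1000 + [("A1C", 1), ("G2T", 1), ("K3R", 1)]

# Collapsed: one row per unique variant, carrying its observation count.
collapsed = [("wildtype", 1000), ("A1C", 1), ("G2T", 1), ("K3R", 1)]

wt_uncollapsed = [wildtype_measurements(uncollapsed, rng) for _ in range(200)]
wt_collapsed = [wildtype_measurements(collapsed, rng) for _ in range(200)]

print("uncollapsed range:", min(wt_uncollapsed), max(wt_uncollapsed))
print("collapsed values: ", sorted(set(wt_collapsed)))
```

With collapsed rows, the wildtype measurements enter each bootstrap sample only in blocks of 1000 (including zero), while with uncollapsed rows the number of wildtype measurements varies smoothly around its original value.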

I am switching the default here. Note that this is a backward-compatibility-breaking change in terms of consistency of results, although I have verified that the differences are quite small. I am also re-running the notebooks with the new setup.