selkamand / sigstats

Mathematical Operations and Transformations for Sigverse Signatures and Catalogues
https://selkamand.github.io/sigstats/

Add function to help check bootstrap support #2

Closed selkamand closed 5 months ago

selkamand commented 5 months ago

If you have a vector of bootstrap contributions for a sample, there are a couple of different approaches often used to decide if your sample passes/fails bootstrap support.

One of the most common is to ask:

What is the minimum proportion of bootstraps (B) in which signature X's contribution must be greater than some threshold value (T)? If the signature falls short of that proportion, we decide it's too unstable to include in the final model.

Out in the wild I've seen defaults of T = 5% and B = 95%. I.e. to be included in a model, the signature must contribute >5% of the optimal model in >95% of the bootstrapped experiments.

We should add a function sig_compute_pval_from_bootstraps that takes

  1. A user-defined threshold, T
  2. A numeric vector describing bootstrap contributions of a single signature

and computes the 'experimental p value' as above (i.e. the % of bootstraps in which the signature's contribution is >= T).

We can leave it up to the user to filter on 'B'
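
To make the proposal concrete, here is a rough sketch of the logic. It's written in Python purely for illustration (sigstats itself is an R package), and the parameter names, argument order, and error handling are all assumptions rather than a settled interface. The bootstrap values are made up.

```python
from typing import Sequence


def sig_compute_pval_from_bootstraps(
    contributions: Sequence[float], threshold: float
) -> float:
    """Return the proportion of bootstraps whose contribution is >= threshold.

    `contributions` holds one signature's contribution in each bootstrap;
    `threshold` is T. Both names are hypothetical.
    """
    if len(contributions) == 0:
        raise ValueError("contributions must contain at least one bootstrap value")
    # Count the bootstraps in which the signature meets or exceeds T
    n_passing = sum(1 for c in contributions if c >= threshold)
    return n_passing / len(contributions)


# Made-up contributions of a single signature across 10 bootstraps
bootstraps = [0.12, 0.08, 0.03, 0.10, 0.07, 0.09, 0.11, 0.06, 0.05, 0.02]

# T = 5%: 8 of the 10 bootstraps are >= 0.05
support = sig_compute_pval_from_bootstraps(bootstraps, threshold=0.05)
print(support)  # 0.8

# Filtering on B is left to the caller, e.g. with B = 95%
passes = support > 0.95
print(passes)  # False
```

Returning the raw proportion and leaving the comparison against B to the caller keeps the function usable with whatever cutoff a given workflow prefers.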