Open shntnu opened 3 years ago
@niranjchandrasekaran - at check-in today I think I may have answered your question about "percent strong" incorrectly. We are not calculating medians in `percent_strong.py`; all we currently do is determine the percentage of "group_replicates" that are higher than a quantile (95% by default) of "not group_replicates".
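A minimal sketch of the calculation described above, not the actual code in `percent_strong.py`. It assumes the inputs are plain lists of pairwise similarity values; the function and parameter names are hypothetical.

```python
import statistics


def percent_above_quantile(replicate_sims, non_replicate_sims, quantile=0.95):
    """Fraction of replicate similarities above a quantile of non-replicate ones.

    Hypothetical helper for illustration only.
    """
    # With n=100, statistics.quantiles returns the 1st..99th percentile cut points.
    cut_points = statistics.quantiles(non_replicate_sims, n=100, method="inclusive")
    threshold = cut_points[round(quantile * 100) - 1]
    # Count replicate similarities strictly above the null threshold.
    n_strong = sum(sim > threshold for sim in replicate_sims)
    return n_strong / len(replicate_sims)
```

Note that no median is taken anywhere: each replicate similarity is compared against the threshold individually.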
@gwaygenomics, does that mean the current implementation of `percent_strong` computes the first type of `percent_strong` in https://github.com/cytomining/cytominer-eval/issues/21#issue-739130751 and not the second type?
I think that is correct.
The good news is that it should require relatively little extra code to implement the second type, given that this matrix is being computed.
Note: we've also renamed `percent_strong` to `percent_matching`.
Potentially useful resource: https://github.com/shntnu/grit-benchmark/blob/rtests/1.calculate-metrics/cell-health/taxonomy.md
I'll copy here text by @gwaygenomics from the paper https://github.com/broadinstitute/lincs-profiling-complementarity because it's the clearest description of the method I've come across!
Constructing an appropriate null distribution to calculate reproducibility metrics

In order to calculate percent replicating and percent matching metrics, we constructed matched null distributions. We designed the null distributions to control for different replicate counts between different compounds and MOAs, taking into account different replicate counts per assay. We also constructed different null distributions within each treatment dose independently to control for dose differences.
Specifically, for percent replicating, for a given perturbation x with n replicates of dose p, we randomly sampled n non-replicate profiles from all 1,327 common perturbations treated with dose p. We performed this sampling procedure 1,000 times per replicate cardinality class (e.g. compounds with 3 replicates, 4 replicates, 5 replicates, etc.) with two additional restrictions: (1) the random sample did not include replicates for perturbation x, and (2) no two compounds of the same non-x perturbation were included in the same null group. For example, in cases where a compound treatment at a specific dose had five replicates, we sampled 1,000 groups of five randomly sampled non-replicate profiles of the same dose. For percent replicating, we used level 4 profiles considering compound and dose information as replicates. We considered a replicating profile one in which the ground truth median pairwise replicate correlation was higher than 95% of the null distribution. We therefore calculate the percent replicating metric as the total number of replicating profiles over all common compounds.
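The percent replicating procedure above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used in the paper or in cytominer-eval: all profiles are assumed to share a single dose, profiles are plain lists of floats, every perturbation has at least two replicates, and all function names are hypothetical.

```python
import itertools
import random
import statistics


def pearson(a, b):
    """Pearson correlation between two equal-length, non-constant profiles."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def median_pairwise_correlation(profiles):
    """Median Pearson correlation over all unordered pairs of profiles."""
    pairs = itertools.combinations(profiles, 2)
    return statistics.median(pearson(a, b) for a, b in pairs)


def sample_null_medians(profiles_by_pert, target, n_samples, rng):
    """Median pairwise correlations of random non-replicate groups.

    Each group takes one profile from each of n distinct perturbations other
    than `target`, so it contains no replicate of `target` and no two
    profiles of the same perturbation.
    """
    n_replicates = len(profiles_by_pert[target])
    others = [name for name in profiles_by_pert if name != target]
    medians = []
    for _ in range(n_samples):
        chosen = rng.sample(others, n_replicates)
        group = [rng.choice(profiles_by_pert[name]) for name in chosen]
        medians.append(median_pairwise_correlation(group))
    return medians


def percent_replicating(profiles_by_pert, n_samples=1000, quantile=0.95, seed=0):
    """Fraction of perturbations whose replicate median beats the null."""
    rng = random.Random(seed)
    n_replicating = 0
    for target, profiles in profiles_by_pert.items():
        observed = median_pairwise_correlation(profiles)
        null = sample_null_medians(profiles_by_pert, target, n_samples, rng)
        # "Higher than 95% of the null distribution"
        fraction_below = sum(m < observed for m in null) / len(null)
        if fraction_below >= quantile:
            n_replicating += 1
    return n_replicating / len(profiles_by_pert)
```

With multiple doses, the same sketch would be run once per dose so that each null distribution only contains profiles of that dose.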
For percent matching, we performed a similar procedure. The only differences were that we (1) used level 5 consensus signatures and (2) considered MOA and dose information as replicates. We subsequently constructed dose and MOA replicate count-specific null distributions to compare against. We considered a matched MOA one in which the ground truth MOA median pairwise correlation was higher than 95% of the null distribution. We therefore calculate the percent matching metric as the total number of matched MOAs over all common MOAs.
We used these null distributions to calculate a non-parametric p value. First, for each compound, we calculated its median pairwise replicate correlation. We next calculated the median pairwise correlations of each randomly sampled group matched to the specific dose and replicate count. Lastly, we calculated a compound specific p value by dividing how many times the real median pairwise correlation of replicates was higher than all 1,000 randomly sampled groups of median pairwise correlations.
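A small sketch of one common way to compute such a non-parametric p value. The wording above is ambiguous about the direction of the comparison, so this sketch assumes the p value is the proportion of null groups whose median pairwise correlation is at least as high as the observed one. The function name is hypothetical.

```python
def empirical_p_value(observed_median, null_medians):
    """Proportion of null medians at least as high as the observed median.

    Assumption: a small p value means the replicates correlate more strongly
    than almost all randomly sampled non-replicate groups.
    """
    n_at_least = sum(m >= observed_median for m in null_medians)
    return n_at_least / len(null_medians)
```

Under this reading, a compound counted as replicating at the 95% threshold would have a p value of at most 0.05.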
Quick note because this came up when reviewing @jccaicedo's paper:
As of Oct 2021, the definitions in https://github.com/cytomining/cytominer-eval/issues/21#issuecomment-902934931 might be inconsistent with the terminology used in the package.
I am fairly sure they will be inconsistent, although I do think the differences will be very minor. We did not use this package in that paper, and I wrote the package implementation a couple of months before.
Makes sense 👍 (I added this note because we were citing this issue in the LUAD gdoc (in comments, not actually citing it) and I didn't want people to get confused)
(Stubs for now, so we can add this documentation to code later)
Percent strong is reported in two ways. We should distinguish between these ways of reporting (they are similar but not the same)
The second version can be a bit confusing so here is an example: