This is (or will be) a Dynamic Statistical Comparison (DSC) of methods for estimating (or testing) the "log-fold-change" in mean between two groups from count data.
Our intention is to initially focus on data from single-cell experiments.
The goal is to compare methods for estimating the log-fold-change between two groups.
Methods will input:
and optionally:
Methods will output:
We will create synthetic data (from real data) with known log-fold-change values and compare the estimates with the true values. We will also assess the calibration of p values (e.g., on null data, p values should be uniformly distributed) and power.
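As a rough illustration of the evaluation described above, the following sketch computes a naive log2 fold change, the error of a set of estimates against known true values, and the fraction of p values at or below a threshold. The function names, the pseudocount, and the direction of the comparison (group 2 relative to group 1) are assumptions made for this example and are not taken from the benchmark code.

```python
import math


def log2_fold_change(group1, group2, pseudocount=0.5):
    """Naive log2 fold change in mean of group2 relative to group1.

    The pseudocount (an assumption of this sketch) avoids log(0)
    when a group has mean zero.
    """
    mean1 = sum(group1) / len(group1)
    mean2 = sum(group2) / len(group2)
    return math.log2((mean2 + pseudocount) / (mean1 + pseudocount))


def rmse(estimates, truths):
    """Root mean squared error between estimated and true values."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))


def rejection_rate(p_values, alpha=0.05):
    """Fraction of p values at or below alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)
```

On null data the true log-fold-change is 0 for every gene, so a well-calibrated method should give a rejection rate close to alpha. On data with a known non-zero effect, the same rejection rate is an estimate of power.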
To create data we will take a file containing count data and select samples at random to create two groups. These will be "null" data.
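The random split described above might look like the following minimal sketch. It assumes the counts are held as a list of rows (one per gene) with one column per sample; the function name `make_null_groups` and its parameters are hypothetical and do not come from the benchmark code.

```python
import random


def make_null_groups(counts, n_per_group, seed=0):
    """Randomly assign samples (columns) to two groups of equal size.

    counts: list of rows (genes), each a list of per-sample counts.
    Returns the two sub-matrices and the column indices used for each.
    """
    n_samples = len(counts[0])
    if 2 * n_per_group > n_samples:
        raise ValueError("not enough samples for two groups of this size")
    rng = random.Random(seed)
    # Sample without replacement so the two groups never share a sample.
    chosen = rng.sample(range(n_samples), 2 * n_per_group)
    idx1, idx2 = chosen[:n_per_group], chosen[n_per_group:]
    group1 = [[row[j] for j in idx1] for row in counts]
    group2 = [[row[j] for j in idx2] for row in counts]
    return group1, group2, idx1, idx2
```

Because group membership is assigned at random, there is no systematic difference between the groups, which is what makes these data "null".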
Input:
Output:
List methods we might want to use...
We manage package installation using conda. After installing conda, you have two options for installing the dependencies:
1. conda-env:
conda env create --file environment.yaml
source activate dsc-log-fold-change
2. conda install:
conda config --add channels jdblischak
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda install --file requirements/conda-forge \
--file requirements/bioconda \
--file requirements/jdblischak
The advantage of the first option is convenience: it creates an environment and installs packages from specific channels, all in one step. The advantage of the second option is that it is faster and more robust.
If you need to add a new package for the benchmark, please add it to both environment.yaml and one of the files in requirements/.
The main DSC file is benchmark.dsc. To see what is available:
./benchmark.dsc -h
and to run the benchmark:
./benchmark.dsc
Or, to run a minimal test benchmark, e.g.:
./benchmark.dsc --target "get_data * wilcoxon_test" --truncate --replicate 1 # default is in fact --replicate 1