Closed by macovskym 1 year ago
Adds an option for "nested subsampling", in which the user can choose to create subsamples from cell types and/or organisms, specifying the desired percentage of cells and/or organisms to select for each sample. No errors or issues found in testing (AB, 8/31).
Nested subsampling creates a subsample by sampling both the organisms in each treatment and the cells in each cell type in the selected organisms.
For aggregate data, the only available subsampling mode is over organisms. For single-cell data, the mode is specified in the following way:
- ... `-n` argument, proportion of cells to include given to the `-p` argument.
- `-o`/`--organism`: Within each treatment, take samples of the organisms, but include all cells from each selected organism. Number of samples given to the `-n` argument, proportion of organisms in each treatment to include given to the `-p` argument.
- `-b`/`--both-organisms-and-cells` (NEW): For each subsample, take a sample of the organisms. Then, for each combination of selected organism and cell type, take a sample of cells. The final subsample is the collection of all cells selected in this manner. When using this option, the argument given to `-b` is the proportion of organisms to take, and the argument given to `-p` is the proportion of cells to take from each cell type within each organism.
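
The two-stage procedure described for the nested mode can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `nested_subsample`, the dictionary keys (`treatment`, `organism`, `cell_type`), and the rounding rule (round to nearest, keeping at least one item per group) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def nested_subsample(cells, organism_fraction, cell_fraction, rng):
    """Return one nested subsample (hypothetical helper, not the PR's code).

    cells: list of dicts with keys 'treatment', 'organism', 'cell_type'.
    organism_fraction: proportion of organisms to keep within each treatment.
    cell_fraction: proportion of cells to keep per (organism, cell type).
    rng: a random.Random instance, passed in for reproducibility.
    """
    # Collect the organisms that belong to each treatment.
    organisms_by_treatment = defaultdict(set)
    for cell in cells:
        organisms_by_treatment[cell["treatment"]].add(cell["organism"])

    # Stage 1: sample organisms within each treatment.
    selected = set()
    for treatment, organisms in organisms_by_treatment.items():
        # Assumed rounding rule: nearest integer, at least one organism.
        k = max(1, round(organism_fraction * len(organisms)))
        for organism in rng.sample(sorted(organisms), k):
            selected.add((treatment, organism))

    # Group the cells of selected organisms by (organism, cell type).
    cells_by_group = defaultdict(list)
    for cell in cells:
        org_key = (cell["treatment"], cell["organism"])
        if org_key in selected:
            cells_by_group[(org_key, cell["cell_type"])].append(cell)

    # Stage 2: sample cells within each (organism, cell type) group.
    subsample = []
    for group in cells_by_group.values():
        k = max(1, round(cell_fraction * len(group)))
        subsample.extend(rng.sample(group, k))
    return subsample
```

Under these assumptions, calling the function with `cell_fraction=1.0` keeps every cell of each selected organism, which corresponds to the organism-only mode.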