kevinblighe / scDataviz

scDataviz: single cell dataviz and downstream analyses

populations equilibrated in processFCS? #19

Closed (algarji closed this issue 3 years ago)

algarji commented 3 years ago

Hi, I was wondering whether it is possible to keep the number of cells balanced across samples when applying processFCS. For example, if I have 4 samples and want to downsample to 20000 cells in total, I would like to end up with 5000 cells from each sample. Thank you.

kevinblighe commented 3 years ago

Hey algarji, the downsampleVar functionality currently operates on a per-sample basis (see https://github.com/kevinblighe/scDataviz/blob/master/R/processFCS.R#L179-L190); however, the downsample functionality is applied only to the final merged dataset (see https://github.com/kevinblighe/scDataviz/blob/master/R/processFCS.R#L211-L227). Unfortunately, there is currently no way to downsample to a given number of cells on a per-sample basis.

I did it this way because, if you downsample on a per-sample basis, you can eliminate the information that is brought by sequencing depth (or its equivalent in a mass spectrometer). For example, if we have two samples of 3000 cells and 50000 cells, we can downsample both to 1000 cells; however, it may then appear that expression is lower in the 50000-cell sample (as it has more cells, each cell receives less reagent and therefore gives less 'per cell' signal).

I can aim to implement downsampling on a per-sample basis, if you wish, though. Given my time constraints, it may take a number of weeks.
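
In the meantime, balanced downsampling could be done outside of processFCS, before the samples are merged. The following is only a minimal sketch in base R, not part of scDataviz: downsample_per_sample is a hypothetical helper, and it assumes that each sample is already available as a numeric matrix with cells in rows and markers in columns (the toy matrices and marker names below are made up for illustration). The caveat above still applies: equalising cell numbers can mask differences in per-cell signal between samples.

```r
# Hypothetical helper (not part of scDataviz): draw the same number of
# cells (rows) from each sample's expression matrix.
downsample_per_sample <- function(samples, total, seed = 1) {
  set.seed(seed)
  # Equal share per sample, e.g. 20000 %/% 4 = 5000
  per_sample <- total %/% length(samples)
  lapply(samples, function(m) {
    # If a sample has fewer cells than its share, keep all of them
    n <- min(per_sample, nrow(m))
    m[sample.int(nrow(m), n), , drop = FALSE]
  })
}

# Toy data standing in for two samples of 3000 and 50000 cells
markers <- c("CD3", "CD4", "CD8")
samples <- list(
  s1 = matrix(rnorm(3000 * 3), ncol = 3, dimnames = list(NULL, markers)),
  s2 = matrix(rnorm(50000 * 3), ncol = 3, dimnames = list(NULL, markers))
)

# Downsample to 2000 cells in total, i.e. 1000 per sample
balanced <- downsample_per_sample(samples, total = 2000)

# Merge the balanced samples and keep track of each cell's origin
merged <- do.call(rbind, balanced)
sample_id <- rep(names(balanced), vapply(balanced, nrow, integer(1)))
table(sample_id)
```

Sampling is done without replacement, and the seed is fixed so that the same cells are drawn on each run.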

Kevin

kevinblighe commented 3 years ago

Please re-open if required