Closed javier-marchena-hurtado closed 1 year ago
Hi @javiermarchena, thank you for using SymSim. You are right that scale_s is the parameter to set in your case. However, scale_s is currently a scalar, and if you pass it a vector, most likely only the first value is used. We could change the code so that a vector scale_s is applied to the different populations, and we are adding this to our new simulator, scMultiSim, which simulates single-cell multi-omics data (you can certainly use it to simulate only scRNA-seq data). You can find scMultiSim here: https://github.com/ZhangLabGT/scMultiSim.
Great, thanks a lot!
Hello,
I would like to simulate scRNA-seq data with 5 populations, and I would like to specify the cell size of each of these populations. In other words, I would like to specify that certain populations should have more total RNA counts than other populations. How can I do that?
The only way I have found so far is to use the scale_s parameter, but I'm not sure whether I can pass it a vector giving the cell size of each population.
The code I am using now is:
ngenes = 10000
ncells = 500
phyla = Phyla5()
true_counts_rna = SimulateTrueCounts(
  ncells_total=ncells, ngenes=ngenes, evf_type="discrete",
  Sigma=0.5, randseed=0, phyla=phyla, min_popsize=50,
  nevf=10, n_de_evf=9, vary="s",
  scale_s = c(0.1, 0.3, 0.5, 0.7, 0.9)
)
This is what the tSNE looks like:
And this is the violin plot of total RNA counts grouped by population:
At least the populations do have different total counts. But the total counts are not gradually increasing from population 1 to population 5, the way I intended when I passed scale_s = c(0.1, 0.3, 0.5, 0.7, 0.9).
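The same comparison can be made numerically instead of by eye. Below is a minimal base-R sketch that sums counts per cell and compares the per-population medians. The matrix `counts` and the label vector `pop` are toy stand-ins invented for illustration, not SymSim output; with real data you would substitute your own genes x cells matrix and per-cell population labels.

```r
# Toy stand-in for a genes x cells true-count matrix (hypothetical data,
# not SymSim output): 3 populations of 4 cells each, 50 genes.
set.seed(1)
pop <- rep(1:3, each = 4)  # population label for each cell
counts <- sapply(pop, function(p) rpois(50, lambda = 2 * p))

# Total RNA counts per cell, then the median per population.
total_counts <- colSums(counts)
median_by_pop <- tapply(total_counts, pop, median)
print(median_by_pop)

# TRUE only if the medians strictly increase from the first to the last population.
is_increasing <- all(diff(as.numeric(median_by_pop)) > 0)
print(is_increasing)
```

If `is_increasing` is FALSE for the simulated data, the per-population totals are not ordered the way the scale_s vector was meant to order them.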
Is there any way to better specify the cell size of each population? Thanks in advance.
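One possible stopgap, sketched here under stated assumptions, is to rescale the simulated counts after the fact with one factor per population. This is not SymSim functionality and is not equivalent to changing scale_s inside the simulation; `rescale_by_population`, `size_factor`, and the toy data are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper (not part of SymSim): multiply each cell's counts by a
# factor chosen for its population, then round back to integer counts.
rescale_by_population <- function(counts, pop, size_factor) {
  # counts: genes x cells matrix; pop: population label per cell;
  # size_factor: named numeric vector with one factor per population.
  stopifnot(ncol(counts) == length(pop),
            all(as.character(pop) %in% names(size_factor)))
  cell_factor <- size_factor[as.character(pop)]
  # sweep() multiplies column j (cell j) by cell_factor[j].
  round(sweep(counts, 2, cell_factor, `*`))
}

# Toy data standing in for a simulated count matrix: 5 populations, 3 cells each.
set.seed(1)
pop <- rep(1:5, each = 3)
counts <- matrix(rpois(20 * length(pop), lambda = 10), nrow = 20)
size_factor <- c("1" = 0.1, "2" = 0.3, "3" = 0.5, "4" = 0.7, "5" = 0.9)

scaled <- rescale_by_population(counts, pop, size_factor)
mean_total_by_pop <- tapply(colSums(scaled), pop, mean)
print(mean_total_by_pop)
```

Rounding distorts low counts, so a sketch like this is only reasonable as a rough approximation of differing cell sizes.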