Closed SirKuikka closed 3 years ago
@SirKuikka
1) The input should be raw counts or TPM.
2) The output data are raw count values (or TPM converted to integers, if TPM were input).
3) To normalize the output, I would suggest DESeq2's normalization or CLR. There are other single-cell-specific normalizations you may prefer.
4) It can be used with droplet-based data, but in the past I have had occasional problems with the program when working with other kinds of data. I think this is related to the distribution of missing values, but that is not the only thing that causes problems. I am still hoping to develop this further and make the simulation more broadly applicable.
5) All of the genes you simulate will have the FC you specify in your simulation (see the "foldchange" option in the function). If you want to build a dataframe with multiple levels of FC, you can first simulate (say 1000) genes under the null, then iteratively increase the FC from 1 to 1.05, 1.1, 1.15, 1.2, etc., appending each new batch of genes to your already simulated set. Note that specifying a FC of 1 does not mean every gene will have a FC of exactly 1; rather, the central tendency of those genes will be a FC of 1.
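To make points 3 and 5 concrete, here is a rough Python sketch of the "simulate the null first, then append batches at increasing FC" loop, followed by a CLR transform of one sample. Everything in it is an assumption for illustration: `simulate_genes` is a hypothetical toy stand-in and not the simulator's real function or signature, the batch size per FC level (`n_per_level`) is arbitrary, and the pseudocount of 1 in `clr` is just one common way to handle zeros.

```python
import math
import random


def simulate_genes(n_genes, foldchange, n_per_group, rng):
    """Hypothetical stand-in for the real simulator call (NOT its actual API).

    Draws toy integer counts for two groups; the second group's mean is
    the first group's mean multiplied by `foldchange`.
    """
    rows = []
    for _ in range(n_genes):
        base = rng.uniform(5.0, 50.0)  # toy per-gene baseline mean
        control = [round(rng.expovariate(1.0 / base)) for _ in range(n_per_group)]
        treated = [
            round(rng.expovariate(1.0 / (base * foldchange)))
            for _ in range(n_per_group)
        ]
        rows.append({"foldchange": foldchange, "counts": control + treated})
    return rows


def simulate_fc_ladder(fold_changes, n_null=1000, n_per_level=100,
                       n_per_group=10, seed=0):
    """Simulate null genes (FC = 1), then append one batch per FC level."""
    rng = random.Random(seed)
    genes = simulate_genes(n_null, 1.0, n_per_group, rng)
    for fc in fold_changes:
        genes.extend(simulate_genes(n_per_level, fc, n_per_group, rng))
    return genes


def clr(sample_counts, pseudocount=1.0):
    """Centered log-ratio: log(count + pseudocount) minus the mean of those logs."""
    logs = [math.log(c + pseudocount) for c in sample_counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


genes = simulate_fc_ladder([1.05, 1.10, 1.15, 1.20])

# CLR is applied per sample, i.e. across all genes of one column.
first_sample = [g["counts"][0] for g in genes]
normalized = clr(first_sample)
```

Each row keeps the FC it was simulated under, so the combined set can later be split by FC level. In practice the body of `simulate_genes` would be replaced by a call to the actual simulator with its "foldchange" option.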
Hope this helps!
Hi,
I have several questions related to your simulator.
Thank you for taking the time to answer my questions.