Closed spriyansh closed 7 months ago
Hi @spriyansh. Thanks for the interesting analysis. I think you might be pushing the limits of what the simulation is designed to do, but I can't see any obvious problems with what you have done, and it should produce something like the effect you are looking for. Can you please try plotting the `BaseCellMeans` or `CellMeans` assay instead? That will remove some of the other steps which might be confusing things. Setting `lib.scale = 0` might also help.
Is there a reason you chose to do it this way rather than using the variation built into the paths? You should be able to get something pretty similar that way.
Hi @lazappi, thank you for the prompt response.
I want to keep track of the changes in the paths along the steps. That's why I put one path after another, so I can know precisely when the change happens in a path. Correct me if I'm wrong, but based on the documentation, I believe setting `path.nonlinearProb` and `path.sigmaFac` will surely introduce local variations in a path. However, the record of where the change is introduced is not kept; only the magnitude and probability of the change are. For my specific analysis, I want to know the locations of the change. The goal is to simulate and annotate trends like increases, decreases, convex and concave shapes, etc.
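To make the chaining idea concrete, here is a minimal sketch in plain Python (not Splatter code). It assumes expression changes linearly between the endpoints of each path, which is a simplification of what I understand happens when `path.nonlinearProb = 0`. The function name `chained_path_means` and the endpoint values are hypothetical and chosen only for illustration.

```python
def chained_path_means(endpoints, n_steps):
    """Piecewise-linear means over paths chained end to end.

    endpoints: expression at the start of the first path, followed by the
               expression at the end of each path (hypothetical values).
    n_steps:   number of steps in every path.
    Returns (means, change_points), where change_points holds the global
    step counts at which one path ends and the next one begins.
    """
    means, boundaries = [], []
    for start, end in zip(endpoints, endpoints[1:]):
        for step in range(1, n_steps + 1):
            frac = step / n_steps  # position along the current path, in (0, 1]
            means.append(start + frac * (end - start))
        boundaries.append(len(means))
    # The last boundary is the end of the whole trajectory, not a change point.
    return means, boundaries[:-1]


# Two chained paths: expression rises from 1.0 to 2.0, then falls to 0.5.
means, change_points = chained_path_means([1.0, 2.0, 0.5], n_steps=50)
print(change_points)
```

Because the paths are laid end to end, the change points are known by construction rather than having to be recovered from the simulated counts.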
Also, the parameters that I am using for the simulation are learned from the 10x V2 dataset published by Setty et al., 2019. This dataset is produced using a snapshot experiment of differentiating cells. Here is a quick summary of the non-default parameters learned from a real dataset:
Batches:
[BATCHES] [BATCH CELLS] [Location] [Scale] [REMOVE]
1 200 0.1 0.1 TRUE
Mean:
(RATE) (SHAPE)
2.85866981975563 3.33448187687623
Library size:
(LOCATION) (SCALE) (Norm)
9.03025915193446 0.364613373259578 FALSE
Exprs outliers:
(PROBABILITY) (LOCATION) (SCALE)
0 2.19848115217974 0.966580327002644
Groups:
[GROUPS] [Group Probs]
1 1
Diff expr:
[PROBABILITY] [Down Prob] [LOCATION] [SCALE]
1 0.5 1 1
BCV:
(COMMON DISP) (DOF)
0.153925836596722 39.7270706976494
Dropout:
[TYPE] (MIDPOINT) (SHAPE)
experiment -0.314888109220462 -1.29948251874868
Paths:
[From] [STEPS] [Skew] [NON-LINEAR] [SIGMA FACTOR]
0 50 0.5 0 0
Since the idea is to produce a simulation as close as possible to real data, is it a good idea to set `lib.scale = 0`?
Using the above parameters, I created plots for a gene expressed in around 5000 cells (script attached). As you suggested, I plotted the `CellMeans`; indeed, the patterns are more evident now.
> Gene Base Path1LocalExp Path2LocalExp Path3LocalExp Path4LocalExp
Gene50 Gene50 1.72121 0.7448674 0.1353784 0.2917269 1.667929
Do you think this approach is robust enough to be able to annotate patterns like this, or is there something else I can manage with `path.nonlinearProb` and `path.sigmaFac` to introduce and track changes in a path? As stated above, the goal is to simulate and annotate trends like increases, decreases, convex and concave shapes, etc.
example_script.txt
> I want to keep track of the changes in the paths along the steps. That's why I put one path after another, so I can know precisely when the change happens in a path. Correct me if I'm wrong, but based on the documentation, I believe setting `path.nonlinearProb` and `path.sigmaFac` will surely introduce local variations in a path. However, the record of where the change is introduced is not kept; only the magnitude and probability of the change are. For my specific analysis, I want to know the locations of the change. The goal is to simulate and annotate trends like increases, decreases, convex and concave shapes, etc.
Yes, I think you are right. The information about the base expression at each step along a path currently isn't stored in the final object, which makes it difficult to check exactly how individual genes are behaving.
> Since the idea is to produce a simulation as close as possible to real data, is it a good idea to set `lib.scale = 0`?
This was just a suggestion to make it easier to see what was happening with genes along the path. For a real simulation you would definitely want variation in the library size.
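As a rough illustration of why `lib.scale = 0` makes the path pattern easier to see: assuming library sizes are drawn from a log-normal distribution with the location and scale listed above (my reading of the parameters when normalisation is off), a scale of 0 gives every cell the same expected library size. The sketch below is plain Python rather than Splatter code, and `sample_library_sizes` is a hypothetical helper.

```python
import math
import random


def sample_library_sizes(n_cells, location, scale, seed=0):
    """Draw log-normal library sizes (assumed model, not Splatter's code)."""
    if scale == 0:
        # Degenerate case: no spread, every cell gets exp(location).
        return [math.exp(location)] * n_cells
    rng = random.Random(seed)
    return [rng.lognormvariate(location, scale) for _ in range(n_cells)]


lib_loc = 9.03025915193446     # library size location from the summary above
lib_scale = 0.364613373259578  # library size scale from the summary above

varied = sample_library_sizes(200, lib_loc, lib_scale)
fixed = sample_library_sizes(200, lib_loc, 0.0)

print(min(varied) < max(varied))  # library sizes differ between cells
print(len(set(fixed)) == 1)       # every cell shares one library size
```

With the spread removed, differences between cells come from the path and the other simulation steps rather than from sequencing depth, which is why it helps for inspection but not for a realistic simulation.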
> Do you think this approach is robust enough to be able to annotate patterns like this, or is there something else I can manage with `path.nonlinearProb` and `path.sigmaFac` to introduce and track changes in a path? As stated above, the goal is to simulate and annotate trends like increases, decreases, convex and concave shapes, etc. example_script.txt
If you are only simulating linear changes it should be enough to look at the recorded path DE factors as the ground truth of how a gene changes (like you did in the original post). However, these changes might not be visible in the final counts due to noise introduced by other parts of the simulation.
If it would be helpful, it would be a relatively minor change to add the factors for every step along the paths to the exported object. This only really provides more information for non-linear changes though.
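For the linear case, annotating a gene from its per-path values could look something like the sketch below. This is plain Python, not part of Splatter; `classify_trend` is a hypothetical helper, and the labels follow an assumed convention in which "convex" means the value falls and then rises and "concave" means it rises and then falls. The input is the Gene50 row from the table above.

```python
def classify_trend(values, tol=0.0):
    """Label a sequence of per-path expression values by its overall shape.

    Returns 'increasing', 'decreasing', 'convex' (falls then rises),
    'concave' (rises then falls), 'flat', or 'mixed'.
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    signs = [1 if d > tol else -1 if d < -tol else 0 for d in diffs]
    signs = [s for s in signs if s != 0]  # ignore changes within tolerance
    if not signs:
        return "flat"
    # Collapse runs of equal sign, e.g. [-1, -1, 1, 1] -> [-1, 1].
    runs = [signs[0]]
    for s in signs[1:]:
        if s != runs[-1]:
            runs.append(s)
    shapes = {
        (1,): "increasing",
        (-1,): "decreasing",
        (-1, 1): "convex",
        (1, -1): "concave",
    }
    return shapes.get(tuple(runs), "mixed")


# Gene50: Base, then Path1LocalExp to Path4LocalExp
gene50 = [1.72121, 0.7448674, 0.1353784, 0.2917269, 1.667929]
print(classify_trend(gene50))  # convex
```

The `tol` argument is there so that very small changes between paths can be treated as no change; what threshold is sensible would depend on the noise from the rest of the simulation.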
Thank you for the response; it answers all of my questions.
Hello Splatter team,
Thank you for creating an excellent tool and actively maintaining it. I want to simulate genes which change their expression as a function of "Step" when performing the simulation using the "Path" method. The goal is to generate sets of genes with specific trends with "Step". Considering the following setup:
Based on what is described in Issue #57, I expect a gene with an overall increasing trend to exhibit an increasing trend across all paths and to have an increasing fold-change in successive paths.
I calculated the change in expression for each path and their "localDEFac" as follows:
Now for the following genes
Based on their expression, I would expect different patterns. For example:
However, with such calculations, it's challenging to confidently identify the exact trend for all genes. I think this approach only holds for genes with a decreasing trend along the "Steps." The calculated spline does not accurately follow the expected trend.
Do you think it's possible to simulate such data in Splatter? I am interested in simulating/filtering genes that follow steps like increasing, decreasing, convex, and concave patterns. I would appreciate your guidance in the right direction. Thank you in advance.
Additional Info: