Open ChristophH opened 1 year ago
Great question. Yeah, so it’s admittedly been a while since writing this, but looking at the code & comments, I think the reasoning is at follows…
Thanks for the explanation. However, I was just trying to validate that behavior using the .sim function with some toy parameters, but could not reproduce the DP density curves. [Crowell et al., Nat Com, 2020]
For DE and DM the curves look as expected. But for DP, only group A shows the bimodal distribution. As a result, the observed logFC is always negative, even when the input logFC is positive.
(For details see this R notebook)
I understand that .sim is an internal function, but I'm going down this rabbit hole as I'm trying to understand why my muscat generated data for DP genes does not look as expected.
Edit: Now I've used muscat::simData
to generate data and illustrate the problem. The R notebook is here.
The figure below shows one high and one low logFC gene per differential category. DP shows opposite logFC sign compared to DE and DM. Also, DP looks like DM with just one bimodal distribution.
Am I doing something wrong, or missing something?
Short recap:
To me, this seems like unexpected behavior (a bug?), or am I doing it wrong? Please advice.
Hi,
Thanks for creating and maintaining muscat, and all your hard work that goes into method development and benchmarking!
I think there is a bug/inconsistency in the sign of the logFC of simulated data. When simulated with the 'dp' category, the logFC is applied in the opposite direction as, for example, with the 'de' or 'dm' category.
In .sim() you can see how either '-lfc', or 'lfc' is used with group 1
https://github.com/HelenaLC/muscat/blob/10cd3262bd5bc45930d62dc8f7d8b97629c3bbe4/R/utils-simData.R#L329-L352
But maybe I'm missing something.
Cheers, Christoph