Closed kpj closed 4 years ago
Problem: negative mu.
Solution: dynamic mean for the response (+ response-independent offset).
Problem: count distribution unrealistic
Solution: initialize source nodes with high dispersion
Problem: does not generalize to real data
Solution: 😕
Problem: can lead to extreme response
Problem: mean response depends on coupling to parents
Advantage: count distribution similar to real data
Problem: source nodes have unrealistic count distribution (investigate real data)
Solution: introduce artificial source node connected to original sources
How to compare simulated and real data:
https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa105/5739438
Current approach:
Simulate with identity and push minimum counts to 1.
Possible adjustment:
Only push the minimum to 1 if the minimum is less than zero (this would be in agreement with the current solver).
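To make the difference between the current approach and the possible adjustment concrete, here is a minimal sketch. It assumes that "pushing the minimum to 1" means shifting all values by a constant offset so that the smallest one equals 1; if clipping was meant instead, the functions would look different. The function names are hypothetical and not taken from the actual code.

```python
def push_min_always(values, floor=1):
    """Current approach (as interpreted here): shift so that min(values) == floor."""
    offset = floor - min(values)
    return [v + offset for v in values]


def push_min_if_negative(values, floor=1):
    """Possible adjustment: shift only if the minimum is below zero."""
    if min(values) < 0:
        return push_min_always(values, floor)
    return list(values)


already_positive = [3, 5, 10]
with_negative = [-2, 0, 4]

# The current approach shifts even data that is already valid.
shifted = push_min_always(already_positive)
# The adjustment leaves valid data untouched and only repairs negative minima.
untouched = push_min_if_negative(already_positive)
repaired = push_min_if_negative(with_negative)
```

Under this reading, the adjustment avoids distorting simulations whose values are already non-negative.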
Adjusted identity link function ...
Idea 1
beta > 0 describes the relative change. 0.5 corresponds to halving and 2 to doubling the expression levels. This is problematic because it requires a transformation of causal effects which is non-trivial (but possibly somehow doable?).
Idea 2
beta can be both positive and negative. Counts are propagated by multiplying beta with mean-standardized counts and adding noise. This is problematic because standardizing might introduce artefacts and can lead to mu < 0 (which yields NaN counts).
Idea 3
Use a mean function for mu of rnbinom. This requires an appropriate link function during the regression.
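A small sketch of why the choice of mean function matters. It only computes mu and does not sample counts. The linear predictor `offset + beta * z` (with `z` a standardized parent count) and all function names are assumptions for illustration; log and softplus are just two candidate mean functions, not necessarily the ones to use here.

```python
import math


def linear_predictor(offset, beta, z):
    """eta = offset + beta * z, where z is a (standardized) parent count."""
    return offset + beta * z


def identity_mean(eta):
    """Identity link: mu = eta, which can become negative."""
    return eta


def log_link_mean(eta):
    """Inverse of the log link: mu = exp(eta), always positive."""
    return math.exp(eta)


def softplus_mean(eta):
    """Softplus: mu = log(1 + exp(eta)), positive and roughly linear for large eta."""
    # Numerically stable form that avoids overflow for large |eta|.
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))


# A negative beta with a large parent value drives the identity mean below zero.
eta = linear_predictor(offset=1.0, beta=-2.0, z=1.5)  # 1.0 - 3.0 = -2.0

mu_identity = identity_mean(eta)  # negative, not a valid mean for count sampling
mu_log = log_link_mean(eta)       # small but positive
mu_softplus = softplus_mean(eta)  # small but positive

# With a log link, beta acts multiplicatively on mu:
# beta = log(2) doubles mu per unit increase of z.
doubling = log_link_mean(linear_predictor(0.0, math.log(2), 1.0)) / log_link_mean(0.0)
```

With a log-type mean function, a signed beta on the linear scale maps to a positive relative change `exp(beta)` on the count scale, which is one possible way to relate Idea 1 and Idea 3. The regression would then need the matching link function to recover beta.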