LinguaPhylo / linguaPhylo

LinguaPhylo: A language for describing, visualising and simulating probabilistic phylogenetic models
GNU Lesser General Public License v3.0
18 stars 7 forks source link

Log variables not being saved correctly #509

Open zjzxiaohei opened 2 months ago

zjzxiaohei commented 2 months ago

I am using LPhy with the phylonco package and the saved logged variables file has negative numbers where the result is not negative.

My LPhy script is:

n = 2;
l = 2;

Θ ~ LogNormal(meanlog=-2.0, sdlog=1.0);
ψ ~ Coalescent(n=n, theta=Θ);
π ~ Dirichlet(conc=[3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0]);
rates ~ Dirichlet(conc=[1.0, 2.0, 1.0, 1.0, 2.0, 1.0]);
Q = gt16(freq=π, rates=rates); // construct the GT16 instantaneous rate matrix
A ~ PhyloCTMC(L=l, Q=Q, tree=ψ, dataType=phasedGenotype());
epsilon = 0.06; // try 0.5
delta = 0.5;
meanT = 2.3; // average coverage = lognormal(meanT, sdT)
sdT = 0.1;
meanV = 0.1; // variance coverage = lognormal(meanV, sdV)
sdV = 0.05;
meanS = 0.04; // cell-specific scaling
sdS = 0.001;
t ~ LogNormal(meanlog= meanT, sdlog= sdT);
v ~ LogNormal(meanlog= meanV, sdlog= sdV);
s ~ LogNormal(meanlog= meanS, sdlog= sdS, replicates=n);
alpha ~ Ploidy(l= l, n= n, delta= delta);
cov ~ CoverageModel(alpha= alpha, t= t, v= v, s= s);

w1 = 10.0;

r1 ~ ReadCountModel(D=A, epsilon=epsilon, alpha=alpha, coverage=cov, w=w1);
[readCountVars.txt](https://github.com/user-attachments/files/16405540/readCountVars.txt)
[readCountVarsTrue.txt](https://github.com/user-attachments/files/16405541/readCountVarsTrue.txt)

Variables files saved from the LPhyStudio Menu is attached as "readCountVars.txt"

The true variables as shown in the Variable Summary panel in LPhyStudio is attached as "readCountVarsTrue.txt"

walterxie commented 2 months ago
  1. can you reload your file?
  2. look at any lphy examples that have data and model block, and tidy up your code. It is difficult to read.
zjzxiaohei commented 2 months ago

readCountVars.txt readCountVarsTrue.txt

zjzxiaohei commented 2 months ago
n = 2;
l = 2;
w1 = 10.0;
epsilon = 0.6; // try 0.5
delta = 0.5;
meanT = 2.3; // average coverage = lognormal(meanT, sdT)
sdT = 0.1;
meanV = 0.1; // variance coverage = lognormal(meanV, sdV)
sdV = 0.05;
meanS = 0.04; // cell-specific scaling
sdS = 0.001;

Θ ~ LogNormal(meanlog=-2.0, sdlog=1.0);
ψ ~ Coalescent(n=n, theta=Θ);
π ~ Dirichlet(conc=[3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0,3.0]);
rates ~ Dirichlet(conc=[1.0, 2.0, 1.0, 1.0, 2.0, 1.0]);
Q = gt16(freq=π, rates=rates); // construct the GT16 instantaneous rate matrix
A ~ PhyloCTMC(L=l, Q=Q, tree=ψ, dataType=phasedGenotype());
t ~ LogNormal(meanlog= meanT, sdlog= sdT);
v ~ LogNormal(meanlog= meanV, sdlog= sdV);
s ~ LogNormal(meanlog= meanS, sdlog= sdS, replicates=n);
alpha ~ Ploidy(l= l, n= n, delta= delta);
cov ~ CoverageModel(alpha= alpha, t= t, v= v, s= s);
r1 ~ ReadCountModel(D=A, epsilon=epsilon, alpha=alpha, coverage=cov, w=w1);
walterxie commented 2 months ago

I still do not understand your problem. After asking Kylie, I found few problems in your code:

  1. PloidyModel missing setter;
  2. why using Multinomial, not Binomial,
Value<Integer[]> alpha1 = multinomialAlpha.sample();
if (alpha1.value()[0] == 1) {
     alp[i][j] = 1;
} else alp[i][j] = 2;

The above code seems suspicious to put inside a nested loop.

In addition, few more minor issues:

i. too many empty lines above RandomVariable<Integer[][]> sample(); ii. no comment, difficult to read; iii. you still did not put the lphy script into data and model block.

I have explained all these details to Kylie.