Closed AlexWeitzel closed 3 months ago
Hey there,
Thanks for pointing this out and providing the line of code that filters the inf
s. I'm not sure why you'd be getting them in the output to begin with. Like you said, they are all -inf
for the M0 marginal likelihood column, and that causes BFs 1 and 3 to then be inf
.
One thing to check would be the original elem_lik.txt
for a given batch to see what the M0 marginal likelihood is there. For instance, the first line in your second screenshot has an id of 182-5
. If you went into the phyloacc-job-files/phyloacc-output/182-phyloacc-*-out/
directory and looked at the 182_elem_lik.txt
file, what is the M0 entry here for element id 5? Note that this file has a slightly different format than the final elem_lik file, so the M0 likelihood is the column labeled "loglik_Null_W
". If this is not also inf
, then something is going wrong with the phyloacc_post.py
script. Otherwise, we'd have to ask @xyz111131 or @HanY-H why this model is producing an inf
.
Hello,
Thanks for your response. The loglik_Null
is -inf
the corresponding logBF1
is inf
in the 182_elem_lik.txt
file.
Hope this helps, Alex
Ok, so its likely that those elements are just failing to converge. I'm not sure why this happens for a particular element, but I think it is expected to happen sometimes, resulting in the -inf
s in the model fits.
But the phyloacc_post.py
script should definitely handle the inf
s better by reporting how many there are and excluding them from the list when plotting, which seems to break matplotlib. I will keep this issue open as we address that.
Hello, just an update. I just encountered a scenario where the logbf2
was also inf
. So it appears logbf1
, logbf2
, and logbf3
can all give inf
values. I just wanted to amend my previous post where I had only seen logbf1
and logbf3
as inf. The same fix by filtering out inf
values applies.
I have also taken a look at some of the fasta files that have generated these errors, and by eye they seem to contain significant gapped regions if that gives any clue as to how this is occurring.
Thanks, Alex
Ok, thanks for the update. I've got a list of bug fixes that I'll be implementing soon, so I'll be sure to include this.
Hello,
When calling phyloacc_post.py I receive the following error.
This appears to be due to 22 out of ~8,000 loci having
-inf
values formarginal.likelihood.m0
, andinf
values forlogbf1
andlogbf3
I was able to get around this by adding
filtered_bf1s = [v for v in bf1s if not math.isinf(v)]
and then plottingfiltered_bf1s
instead ofbf1s
(and doing the same for bf3s) in phyloacc_lib/plot.py, but I was wondering if you knew why the inf values were occurring in the first place.Thank you, Alex