rambaut / beast-mcmc

Automatically exported from code.google.com/p/beast-mcmc
0 stars 0 forks source link

BirthDeathEpidemiologyModel bug #569

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
need to review the equation for res = ... in line 194 in 
BirthDeathSerialSamplingModel. It is getting INF by given the xml

==========================

I specify the epidemiological model in Beauty (for an alignment we used in our 
R0  paper), but then the speciation likelihood is always -Inf.

Tanja

          BEAST v1.7.0 Prerelease r4563, 2002-2012
       Bayesian Evolutionary Analysis Sampling Trees
                 Designed and developed by
   Alexei J. Drummond, Andrew Rambaut and Marc A. Suchard

               Department of Computer Science
                   University of Auckland
                  alexei@cs.auckland.ac.nz<mailto:alexei@cs.auckland.ac.nz>

             Institute of Evolutionary Biology
                  University of Edinburgh
                     a.rambaut@ed.ac.uk<mailto:a.rambaut@ed.ac.uk>

              David Geffen School of Medicine
           University of California, Los Angeles
                     msuchard@ucla.edu<mailto:msuchard@ucla.edu>

                Downloads, Help & Resources:
                 http://beast.bio.ed.ac.uk

Source code distributed under the GNU Lesser General Public License:
            http://code.google.com/p/beast-mcmc

                     BEAST developers:
Alex Alekseyenko, Trevor Bedford, Erik Bloomquist, Joseph Heled,
Sebastian Hoehna, Denise Kuehnert, Philippe Lemey, Wai Lok Sibon Li,
Gerton Lunter, Sidney Markowitz, Vladimir Minin, Michael Defoin Platel,
          Oliver Pybus, Chieh-Hsi Wu, Walter Xie

                         Thanks to:
    Roald Forsberg, Beth Shapiro and Korbinian Strimmer

Random number seed: 1324563295127

Loading additional development parsers from development_parsers.properties, 
which is additional set of parsers only available for development version ...
Parsing XML file: test.xml
  File encoding: MacRoman
Looking for plugins in /Users/tanja/Beast/beast-mcmc-read-only/plugins
Read alignment: alignment
  Sequences = 34
      Sites = 1551
   Datatype = nucleotide
Site patterns 'patterns' created from positions 1-1551 of alignment 'alignment'
  pattern count = 292
Using epidemiological parameterization of Stadler et al (2011) : Estimating the 
basic reproductive number from viral sequence data, Mol.Biol.Evol., doi: 
10.1093/molbev/msr217, 2011
Creating the tree model, 'treeModel'
  initial tree topology = ((((((((((((((((((((((((r10683886dt732522,r9669342dt732526),r9669638dt732529),(r10652463dt732624,r12284256dt732636)),(r10153761dt732594,r15280801dt732886)),r5506089dt732358),(((r4228404dt732107,r5073850dt732095),r5258831dt732053),r5062763dt732047)),r3021431dt731719),r1470425dt731446),(r12258435dt731422,r1464889dt731516)),r675412dt731176),r661689dt730677),r494013dt730481),((r14896566dt730573,r662413dt730565),r503711dt730536)),r662477dt730256),r20490234dt730124),r1628227dt730012),r21554982dt729885),r503423dt729884),r21553623dt729719),(r19793779dt729097,r5252289dt729163)),r20490834dt728911),r19677881dt728772),r17616815dt728685),r15912268dt728491)
  tree height = 4577.244801796485
Using discretized relaxed clock model.
  over sampling = 1
  parametric model = logNormalDistributionModel
   rate categories = 1
Creating state frequencies model 'frequencies': Initial frequencies = {0.25, 
0.25, 0.25, 0.25}
Creating HKY substitution model. Initial kappa = 2.0
Creating site model.
  4 category discrete gamma with initial shape = 0.5
TreeLikelihood(treeModel) using Java nucleotide likelihood core
  Ignoring ambiguities in tree likelihood.
  With 292 unique site patterns.
Branch rate model used: discretizedBranchRates
Failed to load native NucleotideLikelihoodCore in : 
/Users/tanja/Beast/beast-mcmc-read-only
Using Java nucleotide likelihood core because java.lang.UnsatisfiedLinkError: 
no NucleotideLikelihoodCore in java.library.path
Creating swap operator for parameter branchRates.categories (weight=10.0)
Creating the MCMC chain:
  chainLength=10000000
  autoOptimize=true
  autoOptimize delayed for 100000 steps
Error running file: test.xml
The initial model is invalid because state has a zero probability.

If the log likelihood of the tree is -Inf, his may be because the
initial, random tree is so large that it has an extremely bad
likelihood which is being rounded to zero.

Alternatively, it may be that the product of starting mutation rate
and tree height is extremely small or extremely large.

Finally, it may be that the initial state is incompatible with
one or more 'hard' constraints (on monophyly or bounds on parameter
values. This will result in Priors with zero probability.

The individual components of the posterior are as follows:
The initial posterior is zero:
  DistributionLikelihood=-1.8654,
  DistributionLikelihood=0.0,
  DistributionLikelihood=-0.3069,
  DistributionLikelihood=0.0986,
  DistributionLikelihood=-0.9189,
  DistributionLikelihood=-230.2585,
  DistributionLikelihood=-1.8654,
  DistributionLikelihood=-4.6052,
  DistributionLikelihood=0.0,
  SpeciationLikelihood(speciationLikelihood)=-Inf,
  TreeLikelihood(treeLikelihood)=-45287.8362

Original issue reported on code.google.com by dong.w.xie@gmail.com on 1 Feb 2012 at 10:31

Attachments:

GoogleCodeExporter commented 9 years ago
In the screenshot t is way too large (48000) so exp(c1 * t) overflows. The 
issue is that the initial t needs to be specified so that it is greater than 
the root time but only a little bit. Perhaps the initial value needs to be set 
as rootHeight + delta? 

Original comment by ramb...@gmail.com on 2 Feb 2012 at 7:40

GoogleCodeExporter commented 9 years ago
This is very tricky part:

origin (t here) = avgInitialRootHeight * 1.1, which is not big increase during 
auto-rescaling.  

But the problem is that avgInitialRootHeight = options.maximumTipHeight * 10.0; 
from BEAUti estimation in "case TIP_CALIBRATED" of 
ClockModelOptions.calculateInitialRootHeightAndRate(partitions). 

We spent a lot of time on this, but never getting any good estimation of init 
root height for all cases.

Would you have any simple solution?

Original comment by dong.w.xie@gmail.com on 3 Feb 2012 at 4:05

GoogleCodeExporter commented 9 years ago
The problem is the entire scale of the numbers. If the c1 * t > 700 then it 
will over flow. The solution may be to scale t down by using a larger time 
unit? 

Original comment by ramb...@gmail.com on 3 Feb 2012 at 7:07

GoogleCodeExporter commented 9 years ago
just checked the critical equation:
res = 2.0 * (1.0 - c2 * c2) + Math.exp(-c1 * t) * (1.0 - c2) * (1.0 - c2) + 
Math.exp(c1 * t) * (1.0 + c2) * (1.0 + c2);

So, it is correct. However we could rewrite it:
res = Math.exp(c1 * t)  * ( Math.exp(-c1 * t) * (1.0 - c2) +  (1.0 + c2) ) * ( 
Math.exp(-c1 * t) * (1.0 - c2) +  (1.0 + c2) );
meaning if we operate directly in logspace:

res = c1*t+2* Log ( Math.exp(-c1 * t) * (1.0 - c2) + * (1.0 + c2) );

so we avoid any Math.exp(c1*t) !

Original comment by tanja...@googlemail.com on 3 Feb 2012 at 3:40

GoogleCodeExporter commented 9 years ago
sorry last row is a typo, correct is:
res = c1*t+2* Log ( Math.exp(-c1 * t) * (1.0 - c2) + (1.0 + c2) );

Original comment by tanja...@googlemail.com on 3 Feb 2012 at 3:41

GoogleCodeExporter commented 9 years ago

Original comment by dong.w.xie@gmail.com on 6 Feb 2012 at 10:20