Closed josephwb closed 10 years ago
Implemented for speciation-extinction. Soon to be for trait model as well.
Hmm. running longer tests, runs diverge after a few thousand generations. Not sure what is going on...
BTW I pulled and compiled but had no issues - the random seed worked fine (tested it up to 10^6 gens). I assume you have implemented yet for traits but worked fine in speciationextinction.
On Nov 17, 2013, at 11:57 AM, Joseph W. Brown wrote:
Hmm. running longer tests, runs diverge after a few thousand generations. Not sure what is going on...
— Reply to this email directly or view it on GitHub.
Implemented for both now, but both diverge for me. Here are some results for the whale tree with seed = 12345 and printFreq = 1000.
Run 1: Generation lnLik N_shifts LogPrior acceptRate 1 -328.788 0 0.120855 Accp: 0 1001 -267.902 1 -0.226387 Accp: 0.537 2001 -267.601 1 0.820076 Accp: 0.502 3001 -265.772 2 1.52255 Accp: 0.478 4001 -268.396 2 0.780745 Accp: 0.559 5001 -265.694 2 -0.223779 Accp: 0.547 6001 -272.976 1 1.36231 Accp: 0.577 7001 -264.579 1-0.0996446 Accp: 0.472 8001 -264.161 2 -3.35758 Accp: 0.525 9001 -263.454 1-0.0102534 Accp: 0.522 10001 -264.118 1 0.666584 Accp: 0.527
Run 2: Generation lnLik N_shifts LogPrior acceptRate 1 -328.788 0 0.120855 Accp: 0 1001 -267.902 1 -0.226387 Accp: 0.537 2001 -267.601 1 0.820076 Accp: 0.502 3001 -265.772 2 1.52255 Accp: 0.478 4001 -259.861 2 -4.16825 Accp: 0.532 <-- Diverged 5001 -266.556 1 0.747653 Accp: 0.5 6001 -267.167 1 0.479961 Accp: 0.535 7001 -270.296 1 -0.506432 Accp: 0.455 8001 -262.076 2 -2.80532 Accp: 0.528 9001 -260.871 2 -0.257305 Accp: 0.527 10001 -266.33 1 0.733088 Accp: 0.529
Diverges at random generations (i.e. not consistent across replicate runs).
Maybe another OS-specific issue?
I just tried your seed & example, worked fine.
did you make clean before rebuilding (i assume you did, but weirdly, i got seg faults after pulling your last commit until I cleaned everything out again! make clean fixed it).
On Nov 17, 2013, at 12:59 PM, Joseph W. Brown wrote:
Implemented for both now, but both diverge for me. Here are some results for the whale tree with seed = 12345 and printFreq = 1000.
Run #1: Generation lnLik N_shifts LogPrior acceptRate 1 -328.788 0 0.120855 Accp: 0 1001 -267.902 1 -0.226387 Accp: 0.537 2001 -267.601 1 0.820076 Accp: 0.502 3001 -265.772 2 1.52255 Accp: 0.478 4001 -268.396 2 0.780745 Accp: 0.559 5001 -265.694 2 -0.223779 Accp: 0.547 6001 -272.976 1 1.36231 Accp: 0.577 7001 -264.579 1-0.0996446 Accp: 0.472 8001 -264.161 2 -3.35758 Accp: 0.525 9001 -263.454 1-0.0102534 Accp: 0.522 10001 -264.118 1 0.666584 Accp: 0.527
Run #2: Generation lnLik N_shifts LogPrior acceptRate 1 -328.788 0 0.120855 Accp: 0 1001 -267.902 1 -0.226387 Accp: 0.537 2001 -267.601 1 0.820076 Accp: 0.502 3001 -265.772 2 1.52255 Accp: 0.478 4001 -259.861 2 -4.16825 Accp: 0.532 <-- Diverged 5001 -266.556 1 0.747653 Accp: 0.5 6001 -267.167 1 0.479961 Accp: 0.535 7001 -270.296 1 -0.506432 Accp: 0.455 8001 -262.076 2 -2.80532 Accp: 0.528 9001 -260.871 2 -0.257305 Accp: 0.527 10001 -266.33 1 0.733088 Accp: 0.529
Diverges at random generations (i.e. not consistent across replicate runs).
Maybe another OS-specific issue?
— Reply to this email directly or view it on GitHub.
Yeah, I always clean and rebuild it all to avoid those weird behaviours. Problem persists...
I will try this on my wife's machine (different OS).
Can you test if has something to do with BAMM versus the RNG or MbRandom class itself, like have dummy function in main that just calls the RNG a few million times to see if still diverges with no calls to BAMM associated functions?
On Nov 17, 2013, at 1:09 PM, Joseph W. Brown wrote:
Yeah, I always clean and rebuild it all to avoid those weird behaviours. Problem persists...
— Reply to this email directly or view it on GitHub.
Yeah, I'll do that to localize the issue.
BTW, it works correctly on my wife's 10.8.5 machine.
So, uniformRv() runs consistently (tested up to 10MIL), and is the same across OSs.
The diverging of BAMM results seems to happen (for me) after a few tens of thousand generations; i.e. not right away. Are there proposals/operations that occur very infrequently?
ok, this is weird. I wonder how we might be able to profile this.
I have to wonder if this could be some weird numerical thing. For example, maybe I have an inconsistent calculation (or improperly initialized variable) or something that is fine on my compiler, but for you it leads to very slight changes in the likelihoods (some address overhang or something in dynamic memory) that ultimately cause things to diverge.
Maybe try setting sampleFromPriorOnly = 1 and run for a million gens. This may help localize things.
I could see this being a headache to diagnose, especially if I can't replicate on my machine.
On Nov 17, 2013, at 2:16 PM, Joseph W. Brown wrote:
So, uniformRv() runs consistently (tested up to 10MIL), and is the same across OSs.
The diverging of BAMM results seems to happen (for me) after a few tens of thousand generations; i.e. not right away. Are there proposals/operations that occur very infrequently?
— Reply to this email directly or view it on GitHub.
Well, this seems to have been 'solved' on non-10.9 macs. I will therefore close it now. I am tempted to open a new issue "why things break on 10.9", but will resist for the time being. This, of course, should not be taken lightly; I imagine a lot of people (the majority?) have upgraded, or will soon do so, to 10.9, so this problem will not go away.
Carlos - if you have 10.9 on one of your machines, maybe you can try some debugging on this if we can replicate it. I am definitely concerned with the RNG performance and it makes me wonder what could possibly be going on.
On Nov 17, 2013, at 9:46 PM, Joseph W. Brown wrote:
Well, this seems to have been 'solved' on non-10.9 macs. I will therefore close it now. I am tempted to open a new issue "why things break on 10.9", but will resist for the time being. This, of course, should not be taken lightly; I imagine a lot of people (the majority?) have upgraded, or will soon do so, to 10.9, so this problem will not go away.
— Reply to this email directly or view it on GitHub.
One worry I have is the reference of "two" random number seeds (from MbRandom.cpp), emphasized as 'TWO' below:
/ This function sets the TWO seeds for the random number generator, using the current time. \brief Initializes random number seeds. \return This function does not return anything. \throws Does not throw an error. / void MbRandom::setSeed(void) { seed = (long int)( time( 0 ) ); }
/ This function sets the TWO seeds for the random number generator. If only one starting value is given we set the two seeds by using the two least signficant bytes as one seed and the two most significant bytes shifted to the right as the second seed. \brief Initializes random number seeds. \return This function does not return anything. \throws Does not throw an error. / void MbRandom::setSeed(long int s) { seed = s; }
/ This function gets the TWO seeds from the random number generator. \brief Return the random number seeds. \param i1 [in/out] the first seed \param i2 [in/out] the second seed \return This function does not return anything. \throws Does not throw an error. / long int MbRandom::getSeed(void) { return seed; }
I am not sure where the '2nd' seed comes from, or how to manually supply it myself.
weird - but shouldn't this have come up in your tests of the RNG outside of the bamm code?
On Nov 17, 2013, at 10:06 PM, Joseph W. Brown wrote:
One worry I have is the reference of "two" random number seeds (from MbRandom.cpp), emphasized as 'TWO' below:
/ This function sets the TWO seeds for the random number generator, using the current time. \brief Initializes random number seeds. \return This function does not return anything. \throws Does not throw an error. / void MbRandom::setSeed(void) { seed = (long int)( time( 0 ) ); }
/ This function sets the TWO seeds for the random number generator. If only one starting value is given we set the two seeds by using the two least signficant bytes as one seed and the two most significant bytes shifted to the right as the second seed. \brief Initializes random number seeds. \return This function does not return anything. \throws Does not throw an error. / void MbRandom::setSeed(long int s) { seed = s; }
/ This function gets the TWO seeds from the random number generator. \brief Return the random number seeds. \param i1 [in/out] the first seed \param i2 [in/out] the second seed \return This function does not return anything. \throws Does not throw an error. / long int MbRandom::getSeed(void) { return seed; }
I am not sure where the '2nd' seed comes from, or how to manually supply it myself.
— Reply to this email directly or view it on GitHub.
Even though the documentation of the MbRandom code mentions two seeds, I have yet to find a way to supply more than a single seed.
It looks like you supply the two seeds in a single long int, using bitshifting to place them. Bitshifting allows you to move bits around a memory location, so they probably assign the seed the first number, then shift those bits to the left (filling the empty spaces with 0), and then adding the second number. Now the single long int contains "two" seeds.
Is this where things are diverging for me? Should we implement an alternative way to set the second seed?
I admit i'm confused by the documentation and am not sure its correct for that class. Here's the core RNG algorithm in the MbRandom class: (most of the random numbers are based on simple transformations of the uniform RNG below). I see nothing about bit shifting or 2 seeds! This looks like a pretty simple generator.
/*!
OK, that code post looks terrible. in any event go look at MbRandom::uniformRV(void)
Joseph, Carlos - one thing to try would be to make a dummy/experimental function for uniformRV in MbRandom that uses another uniformRV generator (maybe one from std). This would be a trivial change and would be good to know if it fixes the behavior joseph is observing. If it doesn't, that suggests that it is a problem with something else.
The fact that the runs diverge does not mean, necessarily, that it is the fault of the RNG. If there is some weird numerical behavior going on because of (say) something not properly initialized, then the runs could diverge even if the same sequence of random numbers is called.
On Nov 18, 2013, at 10:33 AM, Joseph W. Brown wrote:
Is this where things are diverging for me? Should we implement an alternative way to set the second seed?
— Reply to this email directly or view it on GitHub.
Yeah, I confirmed that uniformRV runs identically across OSs, so that is not it.
OK, this is a major issue (in my opinion). We may actually have to figure out the exact function call that causes these to diverge. This could be a pain.
On Nov 18, 2013, at 10:48 AM, Joseph W. Brown wrote:
Yeah, I confirmed that uniformRV runs identically across OSs, so that is not it.
— Reply to this email directly or view it on GitHub.
Agreed. I am starting a new issue (since the code itself is complete).
Dan -- can you get Pascal to try running some replicate analyses with a constant seed value? He is the only other I know that is using 10.9. Just want to make sure that this is not just my problem alone.
I ran bamm for 2,000,000 generations and didn't observe any divergence. I'm using OS X 10.7.5.
For replication purposes. Default to clock (if provided seed == -1), otherwise use user-provided seed.