Closed Jammy2211 closed 1 year ago
Can you give a self-contained example, and maybe provide the contents of those dictionaries?
I am not sure I can see the problem as the dynamic sampler is explicitly tested here https://github.com/joshspeagle/dynesty/blob/master/tests/test_resume.py
Also, the default time between checkpoints is 60 seconds. Is your code running for at least that long?
Assuming no updates, I'll be closing this issue, as I'm not sure there is a bug there.
The following example does not produce a checkpoint file for the dynamic sampler for me, but does for the static sampler:
import numpy as np
from dynesty.dynesty import NestedSampler
from dynesty.dynesty import DynamicNestedSampler

def fitness_function(model):
    return 100.0 * np.random.random(1)[0]

def prior_transform(cube):
    return cube

sampler = NestedSampler(
    loglikelihood=fitness_function,
    prior_transform=prior_transform,
    ndim=3,
)
sampler.run_nested(
    maxcall=10,
    print_progress=False,
    checkpoint_file="static.savestate",
)

sampler = DynamicNestedSampler(
    loglikelihood=fitness_function,
    prior_transform=prior_transform,
    ndim=3,
)
sampler.run_nested(
    maxcall=10,
    print_progress=False,
    checkpoint_file="dynamic.savestate",
)
Whilst the likelihood function is somewhat broken, the same behaviour is seen for all my science model-fits, so it is not related to how I defined fitness_function (and checkpointing works for the static sampler anyway).
As I suspected before, your test does not run long enough for the checkpointing to kick in. With this modification the save file is created:
import numpy as np
import time
from dynesty.dynesty import NestedSampler
from dynesty.dynesty import DynamicNestedSampler

def fitness_function(model):
    print('sleeping')
    time.sleep(.01)
    return 100.0 * np.random.random(1)[0]

def prior_transform(cube):
    return cube

sampler = DynamicNestedSampler(
    loglikelihood=fitness_function,
    prior_transform=prior_transform,
    ndim=3,
)
sampler.run_nested(maxcall=10,
                   print_progress=False,
                   checkpoint_file="dynamic.savestate",
                   checkpoint_every=1)
Is there any way to make checkpointing not depend on the clock time?
It's hard to generalize checkpointing settings across many different modeling problems!
EDIT: Also, why does DynestStatic not suffer from this issue if it's to do with clock time?
Regarding the dynamic sampler showing this issue and not the static one: I was just investigating that, and I've found that the static sampler always saves the checkpoint at the very end, no matter the timing. I have addressed that for the dynamic sampler in ef310152cfd920290261c94a38da62ce9e3ec0e4. Regarding checkpointing that is not timing dependent: in my opinion the time-based behaviour is the most useful when running on HPC etc., but I'm happy to hear other suggestions on this.
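To make the difference concrete, here is a toy sketch of a timer-based checkpoint loop with and without an unconditional save at the end. It is not dynesty's implementation: `run_with_checkpoints`, its arguments, and the file names are all made up for illustration, and the only thing taken from this thread is the idea of a wall-clock interval (60 seconds by default) plus a final save.

```python
import os
import pickle
import tempfile
import time


def run_with_checkpoints(n_steps, step_seconds, checkpoint_every, fname, save_at_end):
    """Toy loop (hypothetical, not dynesty code) that checkpoints on a wall-clock timer."""
    state = {"step": 0}
    n_saves = 0
    last_save = time.monotonic()
    for step in range(1, n_steps + 1):
        time.sleep(step_seconds)  # stand-in for one likelihood call
        state["step"] = step
        # Timer-based trigger: fires only once enough wall-clock time has passed.
        if time.monotonic() - last_save >= checkpoint_every:
            with open(fname, "wb") as f:
                pickle.dump(state, f)
            n_saves += 1
            last_save = time.monotonic()
    if save_at_end:
        # Unconditional final save, so short runs still leave a file behind.
        with open(fname, "wb") as f:
            pickle.dump(state, f)
        n_saves += 1
    return n_saves


tmpdir = tempfile.mkdtemp()
no_final = os.path.join(tmpdir, "no_final.savestate")
with_final = os.path.join(tmpdir, "with_final.savestate")

# A fast run (10 steps, no sleeping) never reaches the 60 s timer.
saves_a = run_with_checkpoints(10, 0.0, 60.0, no_final, save_at_end=False)
saves_b = run_with_checkpoints(10, 0.0, 60.0, with_final, save_at_end=True)

print(saves_a, os.path.exists(no_final))
print(saves_b, os.path.exists(with_final))
```

With the timer alone, the short run writes nothing; with the final save it writes exactly one file, which matches what the 10-call example above was running into.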
If the dynamic sampler now checkpoints at the end of a run, then everything is good!
I agree that any choice of checkpoint frequency (time, samples, accepted samples, etc) will have pros and cons for different use-cases.
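For comparison, a trigger based on the number of likelihood calls instead of the clock could look roughly like the sketch below. `CountingLikelihood` and its `on_checkpoint` callback are hypothetical names, not part of dynesty, and the sketch only decides when a checkpoint would fire; what the callback does, and whether saving a sampler at that point is safe, is left open.

```python
class CountingLikelihood:
    """Hypothetical wrapper: counts calls and fires a callback every `every` calls."""

    def __init__(self, loglikelihood, every, on_checkpoint):
        self.loglikelihood = loglikelihood
        self.every = every
        self.on_checkpoint = on_checkpoint
        self.ncall = 0

    def __call__(self, x):
        value = self.loglikelihood(x)
        self.ncall += 1
        # Call-count trigger: independent of how long each call takes.
        if self.ncall % self.every == 0:
            self.on_checkpoint(self.ncall)
        return value


def toy_loglike(x):
    # Simple stand-in likelihood for the demonstration.
    return -sum(v * v for v in x)


fired = []  # records the call counts at which a checkpoint would be written
wrapped = CountingLikelihood(toy_loglike, every=4, on_checkpoint=fired.append)
for i in range(10):
    wrapped([0.1 * i, 0.2, 0.3])

print(fired)
```

The trade-off is the one mentioned above: a call-count trigger behaves the same for fast and slow likelihoods, but for an expensive likelihood a fixed count can mean hours between checkpoints, which is where a timer is more convenient.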
Thanks!
I've just released the version (v2.0.2) that includes the fix to the dynamic sampler, which forces it to save at the end of the run. Keep in mind that checkpointing is not a substitute for persistence, i.e. there are no guarantees of being able to read checkpoint files using a different dynesty version from the one used to create the file. We won't break it on purpose, but the next major release, 2.1, likely won't be able to deal with 2.0 files.
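If you keep your own files alongside a run and want a version mismatch to fail loudly rather than mysteriously, one option is to store a version tag next to the payload. The sketch below is a generic illustration using pickle; `save_tagged` and `load_tagged` are hypothetical helpers, not dynesty functions, and the version strings are passed in by hand here (in practice one would presumably pass the installed package version).

```python
import os
import pickle
import tempfile


def save_tagged(state, fname, version):
    """Hypothetical helper: store the writer's version next to the state."""
    with open(fname, "wb") as f:
        pickle.dump({"version": version, "state": state}, f)


def load_tagged(fname, version):
    """Hypothetical helper: refuse to load a file written by another version."""
    with open(fname, "rb") as f:
        payload = pickle.load(f)
    if payload["version"] != version:
        raise RuntimeError(
            f"file written by version {payload['version']}, running {version}"
        )
    return payload["state"]


fname = os.path.join(tempfile.mkdtemp(), "results.pkl")
save_tagged({"logz": 1.23}, fname, version="2.0.2")

same_version = load_tagged(fname, version="2.0.2")  # loads fine

try:
    load_tagged(fname, version="2.1.0")
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True

print(same_version, mismatch_detected)
```

This does not make files portable across versions; it only turns a silent incompatibility into an explicit error.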
Brilliant, thank you!
Keep in mind that checkpointing is not a substitute for persistence, i.e. there are no guarantees of being able to read checkpoint files using a different dynesty version from the one used to create the file. We won't break it on purpose, but the next major release, 2.1, likely won't be able to deal with 2.0 files.
I wouldn't expect that it would! I'm pretty famous for breaking backwards compatibility for my userbase 🤣.
Dynesty version 2.0.1
Describe the bug
When I run a model-fit via StaticSampler, the inclusion of a checkpoint file via checkpoint_file=checkpoint_file creates the checkpoint file and I can resume the run via StaticSampler.restore(fname=self.checkpoint_file).
If I swap the StaticSampler out for the DynamicNestedSampler, the checkpoint file is not created and resuming does not work.
Everything between the two runs is identical except the sampler.
Setup
Dynesty output
148it [00:03, 46.43it/s, bound: 0 | nc: 9 | ncall: 920 | eff(%): 16.087 | loglstar: -inf < 11270.460 < inf | logz: 11267.556 +/- nan | dlogz: 557.749 > 0.059] ^C
Bug
N/A
Additional context
N/A