bilby-dev / bilby

A unified framework for stochastic sampling packages and gravitational-wave inference in Python. Note that we are currently transitioning from git.ligo.org/lscsoft/bilby, please bear with us!
https://bilby-dev.github.io/bilby/
MIT License

Numpy error in dynesty plotting #758

Open bilby-bot opened 1 year ago

bilby-bot commented 1 year ago

In GitLab by @git.ligo:alexandresebastien.goettel on Aug 28, 2023, 14:12

Some of my runs hit an error in the middle of sampling that crashes the entire job:

00:03 bilby INFO    : Run interrupted by alarm signal 14: checkpoint and exit on 77
00:03 bilby INFO    : Written checkpoint file working/GW200316_215756/Prod0/result/Prod0_data0_1268431094-157825_analysis_H1L1V1_par0_resume.pickle
00:03 bilby INFO    : Starting to close worker pool.
00:03 bilby INFO    : Finished closing worker pool.
00:04 bilby WARNING : Unexpected error <built-in method __deepcopy__ of numpy.ndarray object at 0x148521eb7a50> returned a result with an error set in dynesty plotting. Please report at git.ligo.org/lscsoft/bilby/-/issues

The full traceback is below:

Traceback (most recent call last):
  File "/home/ubuntu/Software/bilby_pipe/install/bin/bilby_pipe_analysis", line 33, in <module>
    sys.exit(load_entry_point('bilby-pipe==0.0.0', 'console_scripts', 'bilby_pipe_analysis')())
  File "/home/ubuntu/Software/bilby_pipe/bilby_pipe/data_analysis.py", line 379, in main
    analysis.run_sampler()
  File "/home/ubuntu/Software/bilby_pipe/bilby_pipe/data_analysis.py", line 264, in run_sampler
    self.result = bilby.run_sampler(
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/bilby/core/sampler/__init__.py", line 234, in run_sampler
    result = sampler.run_sampler()
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/bilby/core/sampler/base_sampler.py", line 96, in wrapped
    output = method(self, *args, **kwargs)
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/bilby/core/sampler/dynesty.py", line 517, in run_sampler
    out = self._run_external_sampler_with_checkpointing()
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/bilby/core/sampler/dynesty.py", line 635, in _run_external_sampler_with_checkpointing
    self.sampler.run_nested(**sampler_kwargs)
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/dynesty/sampler.py", line 941, in run_nested
    for it, results in enumerate(
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/dynesty/sampler.py", line 774, in sample
    u, v, logl, nc = self._new_point(loglstar_new)
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/dynesty/sampler.py", line 385, in _new_point
    u, v, logl, nc, blob = self._get_point_value(loglstar)
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/dynesty/sampler.py", line 368, in _get_point_value
    self._fill_queue(loglstar)
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/site-packages/dynesty/sampler.py", line 361, in _fill_queue
    self.queue = list(mapper(evolve_point, args))
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/multiprocessing/pool.py", line 473, in _map_async
    self._check_running()
  File "/home/ubuntu/.conda/envs/local-pe/lib/python3.9/multiprocessing/pool.py", line 350, in _check_running
    raise ValueError("Pool not running")
ValueError: Pool not running

bilby-bot commented 1 year ago

In GitLab by @git.ligo:colm.talbot on Aug 29, 2023, 19:50

Hi @git.ligo:alexandresebastien.goettel what version of Bilby/dynesty are you using?

bilby-bot commented 1 year ago

In GitLab by @git.ligo:alexandresebastien.goettel on Aug 30, 2023, 08:53

Hi Colm, thanks! I'm using the Bilby 2.1.1 release (26c84edf84fec6f3f3eb995d9f06257ad4eef180). "pip show dynesty" reports Version 2.0.1.

bilby-bot commented 1 year ago

In GitLab by @git.ligo:colm.talbot on Aug 30, 2023, 15:00

Thanks. Please can you paste more of the log? I don't think this part of the trace is informative: it looks like the job is trying to continue after being terminated.

Also, what sampler kwargs are you using? If there is anything else non-standard in your environment/job, please say.
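For reference, the "Pool not running" error at the bottom of the traceback is what Python raises when work is submitted to a pool that has already been closed, which fits the log order above (worker pool closed, then sampling continues). Here is a minimal sketch of that behaviour on its own. It is not bilby or dynesty code: `square` and `map_on_closed_pool` are made-up names, and it uses `ThreadPool` (a subclass of the process-based `Pool` with the same running-state check) just to keep the example small.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    # Stand-in for the per-point work; hypothetical, not dynesty's evolve_point.
    return x * x


def map_on_closed_pool():
    """Close a pool, then try to submit work to it; return the error text."""
    pool = ThreadPool(2)
    pool.close()  # the pool no longer accepts new work after this
    pool.join()
    try:
        pool.map(square, range(4))
    except ValueError as err:
        return str(err)
    return None


print(map_on_closed_pool())
```

This only shows where the message comes from. It does not say why the sampler kept running after the pool was closed.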

bilby-bot commented 1 year ago

In GitLab by @git.ligo:alexandresebastien.goettel on Aug 31, 2023, 13:30

I realised that the log contains many "checkpoint and exit on 77" messages followed by restarts (condor periodic restart), but otherwise it looks normal.
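In case it is useful for checking other logs, this is a rough sketch of how the checkpoint-and-exit messages could be counted. The marker string is copied from the log above; `count_checkpoint_exits` and the `sample_log` list are illustrative names only, not part of bilby or bilby_pipe.

```python
def count_checkpoint_exits(lines, marker="checkpoint and exit on 77"):
    """Count the log lines that contain the checkpoint-and-exit message."""
    return sum(1 for line in lines if marker in line)


# Two lines taken from the log excerpt in this issue.
sample_log = [
    "00:03 bilby INFO    : Run interrupted by alarm signal 14: checkpoint and exit on 77",
    "00:03 bilby INFO    : Starting to close worker pool.",
]

print(count_checkpoint_exits(sample_log))
```

For a real log file, the open file object can be passed directly, since it iterates over lines.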

It is worth mentioning that I am not running on an IGWN cluster but on a separate machine. I am loading conda from cvmfs, though, and installed bilby using "python setup.py install" (originally so I could modify some bilby code, but this install is a clean copy of master).

Here are my sampler settings: {'nlive': 2048, 'naccept': 60, 'check_point_plot': True, 'check_point_delta_t': 1800, 'print_method': 'interval-60', 'sample': 'act-walk', 'bound': 'live', 'maxmcmc': 5000, 'nact': 4, 'verbose': True, 'npool': 17}