Open shiningsurya opened 9 months ago
If you can reproduce this, please add a print statement just before the following line to show what `saved_logwt_bs` contains and how large it is:
recv_saved_logwt_bs = mpi_comm.gather(saved_logwt_bs, root=0)
in file "/u/sbethapudi/.local/lib/python3.8/site-packages/ultranest/netiter.py", line 897, in combine_results
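A minimal sketch of such a diagnostic, assuming only that `saved_logwt_bs` is picklable (mpi4py's lowercase `gather` pickles the objects it sends). The helper name `describe_payload` is made up for illustration and is not part of ultranest or mpi4py.

```python
import pickle
import sys

INT32_MAX = 2**31 - 1  # largest byte count a signed 32-bit integer can hold


def describe_payload(name, obj):
    """Return a one-line summary of an object about to be sent over MPI."""
    # Pickled length approximates what the lowercase gather() transmits.
    pickled_size = len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
    shape = getattr(obj, "shape", None)  # set for NumPy arrays, None otherwise
    dtype = getattr(obj, "dtype", None)
    return (
        f"{name}: type={type(obj).__name__} shape={shape} dtype={dtype} "
        f"pickled_bytes={pickled_size} over_2GiB={pickled_size > INT32_MAX}"
    )


if __name__ == "__main__":
    # Stand-in data; in netiter.py the argument would be saved_logwt_bs.
    print(describe_payload("example", [0.0] * 1000), file=sys.stderr, flush=True)
```

Placed just before the gather call, the line would read `print(describe_payload("saved_logwt_bs", saved_logwt_bs), file=sys.stderr, flush=True)`. Pickling a large object temporarily needs extra memory, so this is only meant for a one-off debugging run.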
This is an error I have not seen before.
https://github.com/mpi4py/mpi4py/issues/23 suggests you may have crossed a 2GB threshold that your MPI does not support. I guess this translates into a limit on the number of live points x the number of iterations; the latter increases with the improvement loops.
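A rough back-of-the-envelope check of that guess could look like the sketch below. It assumes the gathered object is a dense array of 8-byte floats and that the limit applies to the total number of bytes arriving at the root rank; both are assumptions, not something confirmed in this thread. The row and column counts are made-up numbers; only the 480 ranks come from the report.

```python
INT32_MAX = 2**31 - 1  # 2 GiB minus one byte


def estimate_gather_bytes(n_rows, n_cols, n_ranks, itemsize=8):
    """Return (bytes sent per rank, total bytes arriving at the root rank)."""
    per_rank = n_rows * n_cols * itemsize
    return per_rank, per_rank * n_ranks


# Hypothetical array shape, for illustration only; 480 ranks as in the report.
per_rank, total = estimate_gather_bytes(n_rows=20000, n_cols=30, n_ranks=480)
print(f"per rank: {per_rank} bytes")
print(f"at root:  {total} bytes, over 32-bit limit: {total > INT32_MAX}")
```

With these illustrative numbers each rank sends only about 4.8 MB, yet the root would receive about 2.3 GB, so a large task count alone could push a modest per-rank payload over the threshold.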
Description
(Let me preface by saying I love this piece of code.)
I am solving a 6D fitting problem using `ultranest`. Five of the parameters are angles and are therefore wrapped parameters; the sixth is a simple parameter. I am running it on an HPC cluster with 480 tasks using MPI. It crashes after it has converged to the ML point.
This has happened during multiple runs with the same error. Looking through the traceback, it is failing in the gather step.
This happened after the iteration had completed. In my code, after I run `sampler.run`, I run `store_tree`, `print_results`, and `plot_corner`.
What I Did
These are my parameters for `run`.