Thank you for reporting this.
In the log I see the number of live points varies between 1 and inf; most iterations (190/359) have 1.
The inf makes me suspicious. I think this line, https://github.com/JohannesBuchner/UltraNest/blob/master/ultranest/integrator.py#L1722, computes inf, perhaps because the log is evaluated at zero. The subsequent line tries to catch invalid computations, but +inf is not caught.
This would occur when widthratio = 0, which occurs in the line before when logweights[1:,0] - logweights[:-1,0] is zero.
This occurs when all posterior points have equal weight, probably because the likelihood always returned the same number during the run. That can happen when no data points are analysed, or when a very, very large portion of the prior is marked with a special invalid-likelihood number (e.g., -1e300 in BXA when Fit.statistic is not a finite number).
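For illustration, here is a minimal sketch of that failure mode in isolation. The final expression is a hypothetical stand-in; the actual formula at integrator.py#L1722 may differ:

```python
import numpy as np

# All posterior points carry equal weight, so consecutive log-weights are identical.
logweights = np.log(np.full((4, 1), 0.25))
widthratio = logweights[1:, 0] - logweights[:-1, 0]  # exactly zero everywhere

# Hypothetical stand-in for the nlive computation: a log evaluated at zero
# feeds a division, producing +inf instead of a sensible live-point count.
with np.errstate(divide='ignore'):
    nlive = 1.0 / np.log1p(widthratio)
print(nlive)  # [inf inf inf]
```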
Btw, you can circumvent this with the max_num_improvement_loops=0 argument to run().
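For example, a minimal sketch with a toy model (the likelihood, prior transform, and parameter names are placeholders; only the max_num_improvement_loops=0 argument is the point):

```python
import numpy as np
from ultranest import ReactiveNestedSampler

def loglike(params):
    # toy Gaussian likelihood standing in for the real model
    return -0.5 * np.sum(((params - 0.5) / 0.1) ** 2, axis=1)

def transform(cube):
    return cube  # identity prior transform on the unit cube

sampler = ReactiveNestedSampler(['a', 'b'], loglike, transform, vectorized=True)
# max_num_improvement_loops=0 skips the improvement loops in which the
# faulty live-point estimate is computed
result = sampler.run(max_num_improvement_loops=0)
```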
I think a possible solution could be to change
nlive[~(nlive > 1)] = 1
to
nlive[~np.logical_and(nlive > 1, np.isfinite(nlive))] = 1
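To see why the extra np.isfinite term matters, here is a quick check of both masks on a toy array (not the surrounding UltraNest code):

```python
import numpy as np

nlive = np.array([0.5, 3.0, np.inf])

# original mask: inf > 1 is True, so ~(nlive > 1) leaves the inf entry untouched
nlive_old = nlive.copy()
nlive_old[~(nlive_old > 1)] = 1
print(nlive_old)  # [ 1.  3. inf]

# proposed mask: additionally requires finiteness, so inf is clamped to 1
nlive_new = nlive.copy()
nlive_new[~np.logical_and(nlive_new > 1, np.isfinite(nlive_new))] = 1
print(nlive_new)  # [1. 3. 1.]
```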
I guess the reason you see this on ARM but not Intel is that np.array([np.inf]).astype(int) gives different values?
Thank you for your prompt reply.
Changing the line in integrator.py to nlive[~np.logical_and(nlive > 1, np.isfinite(nlive))] = 1, while still retaining max_num_improvement_loops=-1 in the call to run(), worked for me!
Now I will evaluate the robustness of the result, but in the meantime, it's excellent that the sampling procedure did not halt.
I've just noticed that on the Intel architecture, infinity converts to -9223372036854775808, while on the ARM architecture it converts to 9223372036854775807. These are the minimum and maximum values for a 64-bit signed integer (int64).
So I guess different CPU architectures have different default behaviors for converting floating-point numbers to integers, especially for special floating-point values like infinity.
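That matches the hardware semantics, as far as I know: out-of-range float-to-int conversion is undefined behavior in C, so each platform's conversion instruction leaks through. A small demonstration (the per-platform outputs are the values quoted above):

```python
import numpy as np

# NumPy may also emit a RuntimeWarning about an invalid cast here.
x = np.array([np.inf]).astype(np.int64)
print(x)
# x86-64:  [-9223372036854775808]  cvttsd2si returns the "integer indefinite"
#                                  pattern, which equals INT64_MIN
# aarch64: [ 9223372036854775807]  fcvtzs saturates +inf to INT64_MAX
```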
I released ultranest 4.1.7, please test it and let me know if it solves this issue for you.
It does, thank you. And even in the initially problematic cases, the analysis returns entirely reasonable corner plots for BXA and posterior distributions of the excess variance for bexvar.
Summary: I've encountered an issue while running Bayesian inference on an ARM architecture where the sampler attempts to allocate an unrealistically high number of live points (9223372036854775407, nearly the maximum value for a 64-bit integer), leading to an overflow error and halting the process. This issue does not occur on an Intel architecture under identical conditions, suggesting a potential bug in UltraNest's handling of live points specifically on ARM architectures.
Environment details:
UltraNest version: 4.1.6
Python version: 3.11
System: Darwin (release 23.3.0)
Version: Darwin Kernel Version 23.3.0: Wed Dec 20 21:31:10 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6031
Machine: arm64
Attachments: I've attached the debug logs obtained through BXASolver for both Intel ('debug_intel.log') and ARM ('debug_ARM.log') architectures.
I would appreciate any guidance on resolving this issue or any temporary workarounds that might exist. Thank you for your help and for maintaining UltraNest.