JohannesBuchner / UltraNest

Fit and compare complex models reliably and rapidly. Advanced nested sampling.
https://johannesbuchner.github.io/UltraNest/

Feature request/bug: returning float32 from log-likelihood fn with PopulationSliceSampler #106

Closed: lwelzel closed this issue 10 months ago

lwelzel commented 11 months ago

Description

I use the vectorized ReactiveNestedSampler with a PopulationSliceSampler(generate_direction=ultranest.popstepsampler.generate_region_oriented_direction). My log-likelihood function uses PyTorch functions with float32 precision (for speed and memory usage) and has a NumPy wrapper.

As long as I cast the output to float64 in my wrapper (see pseudocode below), everything works as expected. However, if I return float32 values, the Cython evolve and evolve_update functions in stepfuncs fail, since they expect float64 (np.float_t) buffers; see the traceback below. It would be convenient to be able to return values with arbitrary precision, especially for vectorized likelihood functions that use image data. When using other samplers (e.g. SliceSampler), no error is raised when returning float32.

Traceback (most recent call last):
  File "/home/.../ultranest_fitting.py", line 235, in <module>
    sampler.run(
  File "/home/.../site-packages/ultranest/integrator.py", line 2311, in run
    for result in self.run_iter(
  File "/home/.../site-packages/ultranest/integrator.py", line 2576, in run_iter
    u, p, L = self._create_point(Lmin=Lmin, ndraw=ndraw, active_u=active_u, active_values=active_values)
  File "/home/.../site-packages/ultranest/integrator.py", line 1834, in _create_point
    u, v, logl, nc = self.stepsampler.__next__(
  File "/home/.../site-packages/ultranest/popstepsampler.py", line 299, in __next__
    nc = self.advance(transform, loglike, Lmin)
  File "/home/.../site-packages/ultranest/popstepsampler.py", line 202, in advance
    ) = evolve(transform, loglike, Lmin, *args, log=self.log)
  File "ultranest/stepfuncs.pyx", line 263, in ultranest.stepfuncs.evolve
  File "ultranest/stepfuncs.pyx", line 87, in ultranest.stepfuncs.evolve_update
ValueError: Buffer dtype mismatch, expected 'float_t' but got 'float'

What I Did

# pseudo code:

import torch
import ultranest
from ultranest.popstepsampler import PopulationSliceSampler

torch.set_default_device("cuda:0")  # log-likelihood functions evaluated on GPU
torch.set_default_tensor_type(torch.FloatTensor)  # float32 set explicitly

# ...

class LogLikelihood(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, theta):  # vectorized as (N_batch, N_para)
        # stand-in for the actual image-based likelihood computation
        log_likelihood = torch.bunch_of_image_transforms(theta)
        return log_likelihood  # (N_batch,)

log_likelihood = LogLikelihood()  # calling log_likelihood(theta) calls the forward method of the LogLikelihood instance

def wrapper_log_likelihood(theta):
    theta = torch.tensor(theta.astype("float32"))  # cast to float32
    out = log_likelihood(theta)
    return out.cpu().numpy().astype("float64")  # cast back to float64

sampler = ultranest.ReactiveNestedSampler(
    param_names, wrapper_log_likelihood, prior_transform,
    vectorized=True,
    resume="overwrite",
    # ...
)

sampler.stepsampler = PopulationSliceSampler(
    popsize=128,
    nsteps=int(2 * num_parameters),
    generate_direction=ultranest.popstepsampler.generate_region_oriented_direction,
)

sampler.run(
    min_num_live_points=2048,
    dlogz=0.5,
    min_ess=1024,
    update_interval_volume_fraction=0.4,
    max_num_improvement_loops=16,
)

Thanks for the great package and many extra thanks for making vectorized likelihood function evaluations so easy!

JohannesBuchner commented 11 months ago

I think the Cython code needs to have specific types fixed at compile time (which happens at install). So we would need two versions of each function, one 32-bit and one 64-bit.

lwelzel commented 11 months ago

I think it might be possible to do this with a fused floating-point type (https://cython.readthedocs.io/en/latest/src/userguide/fusedtypes.html), with the specialization selected at runtime (potentially one added if statement for the specific type), but I am not familiar with Cython. Looking at the evolve_update function, I am not sure there are any operations that would differ between float32 and float64.
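For illustration, a minimal pure-Python-mode Cython sketch of the fused-type idea (the function name and body are made up, not code from stepfuncs.pyx):

import cython
import numpy as np

def sum_loglikes(values: cython.floating[:]) -> cython.double:
    # cython.floating is a built-in fused type covering C float and double;
    # Cython compiles one specialization per type and selects the right one
    # at call time from the memoryview's dtype, so both float32 and float64
    # arrays are accepted without any explicit if statement.
    total: cython.double = 0.0
    for i in range(values.shape[0]):
        total += values[i]
    return total

sum_loglikes(np.ones(8, dtype=np.float32))  # dispatches to the float specialization
sum_loglikes(np.ones(8, dtype=np.float64))  # dispatches to the double specialization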

JohannesBuchner commented 11 months ago

Another option would be to add an if somewhere here https://github.com/JohannesBuchner/UltraNest/blob/master/ultranest/integrator.py#L483 to detect that the dtype is not a (64-bit) float.

One could either add an assert to make the user aware they should convert to float, or try to add a wrapper call that converts.

I am not sure it makes sense for UltraNest to add more automagic for convenience in this case. A clear error and guidance on what the user should do may do the job.
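For the converting-wrapper option mentioned above, a hypothetical sketch (ensure_float64 is a made-up name, not an existing UltraNest API):

import numpy as np

def ensure_float64(loglike):
    # wrap a (vectorized) log-likelihood so that its output reaches the
    # compiled step samplers as float64, whatever dtype the user returns
    def wrapped(theta):
        return np.asarray(loglike(theta), dtype=np.float64)
    return wrapped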

lwelzel commented 11 months ago

In my case a specific error would have been sufficient. I also struggle to see where an automatic conversion would be an issue, since the log-likelihoods need to go to the CPU anyway, and I would assume the conversion is quite quick even with large population sizes (no noticeable difference in my case between popsize=128 and popsize=2048).

JohannesBuchner commented 11 months ago

OK, can you test whether

        assert p.dtype == float, ("prior transform function should return floats, not %s" % p.dtype)
        assert logl.dtype == float, ("loglikelihood function should return floats, not %s" % logl.dtype)

passes for correct functions and raises an error for wrong functions?

lwelzel commented 11 months ago

Yes, this works, and if I return float32 arrays it raises the error:

AssertionError: loglikelihood function should return floats, not float32

so the issue with respect to the dtypes should be clear. It might still be worth making the required float64 explicit in the message. I also added this here for the ReactiveNestedSampler, and I assume it needs to be added for any other sampler not inheriting from either.
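For example, the message could name the required dtype explicitly (a suggested variant of the assert above, assuming numpy is imported as np):

        assert logl.dtype == np.float64, ("loglikelihood function should return float64 values, not %s" % logl.dtype)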

Thanks for the quick response, and again for the great package (and PyMultiNest)!

JohannesBuchner commented 10 months ago

I am closing this for now, please reopen if there is more that should be done.