reflectometry / refl1d

1-D reflectometry fitting
https://refl1d.readthedocs.io/

Convolve fails with division by zero exception for widely spaced Q points #144

Open · hoogerheide opened this issue 2 years ago

hoogerheide commented 2 years ago

Maybe related to https://github.com/reflectometry/refl1d/issues/140:

When using the "interpolation" flag in probe.reflectivity(), the calculation fails when sample_broadening is set (and sometimes even when it is zero). Whether the calculation fails depends strongly on the spacing of the Q points.

I can't attach a Jupyter notebook so here's the code example showing the failure:

import numpy as np
from refl1d.names import silicon, air, FitProblem, Experiment, NeutronProbe, Slab, Parameter

dLoL = 0.0169
dToT = 0.0234

npoints = 101
Ts = np.linspace(0.18, 6, npoints)
Ls = 5*np.ones_like(Ts)

probe = NeutronProbe(Ts, Ts*dToT, Ls, Ls*dLoL)
probe.sample_broadening = Parameter(name='sb', value=0.0).range(-0.005, 0.020)

sample = Slab(silicon, interface=2) | Slab(air, interface=2)

model = Experiment(probe=probe, sample=sample, dz=0.5, step_interfaces=False)

problem = FitProblem(model)

# trigger error
problem.setp([0])
Qth, Rth = problem.fitness.reflectivity(interpolation=100)

# Note: get different results based on npoints. Widely spaced points have a serious problem with sample_broadening.
# npoints = 101 fails in the range shown here. npoints = 201 fails only for sb < 0. npoints = 1001 doesn't fail.
# npoints = 11 fails for all sample broadening values

for sb in np.arange(-0.005, 0.020, 0.001):
    problem.setp([sb])
    try:
        Qth, Rth = problem.fitness.reflectivity(interpolation=100)
        #plt.loglog(Qth, Rth)
    except Exception:
        # report the sample_broadening values for which the calculation fails
        print(sb)
bmaranville commented 2 years ago

Right now there is no detailed adjustment made to the Q-basis for the model calculation based on the "interpolation" parameter in problem.reflectivity. If it is necessary to plot additional (resolution-smeared) Q-values between the sample points, we would have to

  1. also guess at the resolution between the points (interpolate that too? What if it is a different shape?)
  2. determine what the spacing of support points should be between the additional display points (more support points, or oversampling, might be needed in order to properly calculate the additional displayed smeared points)

Currently it is interpolating the Q values and dQ values.
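
As a concrete illustration of what "interpolating the Q values and dQ values" means, here is a minimal sketch that inserts extra display points between each pair of data points and carries dQ along by linear interpolation. The function name and details are assumptions for illustration, not the actual refl1d _interpolate_Q code:

import numpy as np

def interpolate_q(Q, dQ, n):
    # Illustrative only: insert n extra display points between each pair of
    # data Q values and interpolate the resolution width dQ onto them.
    Q_fine = np.concatenate(
        [np.linspace(q0, q1, n + 2)[:-1] for q0, q1 in zip(Q[:-1], Q[1:])]
        + [[Q[-1]]]
    )
    dQ_fine = np.interp(Q_fine, Q, dQ)
    return Q_fine, dQ_fine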

hoogerheide commented 2 years ago

I'm not clear on the relationship between "interpolation" and "oversampling" in this context. For example, how does oversampling deal with the question of resolution between the points?

So why does interpolating the Q/dQ values cause this error? Is it trying to access Q = 0?

bmaranville commented 2 years ago

It is not trying to access Q=0. The reason the divide-by-zero happens (at least for your first test case):

# trigger error
problem.setp([0])
Qth, Rth = problem.fitness.reflectivity(interpolation=100)

is that _interpolate_Q is extending the Q-range on either end of the probe.calc_Q values, and at low Q the "interpolated" values don't have any corresponding calc_Q values nearby. For example the smallest "interpolated" value of Q is 0.006631848473064208, and Q + 3.5 sigma = 0.006934001370634783, Q - 3.5 sigma = 0.0063296955754936325, but the closest Q value in the calc_Q is 0.007895670532999092, which is essentially infinitely far away as far as roundoff in erf is concerned.

If there are no "support" values within +/- 3.5 sigma then the convolution fails (and it should!). You can't convolve something that isn't there.
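
To make the mechanism concrete, here is a small sketch using the numbers quoted above. It assumes the convolution weights each calc_Q bin by a difference of erf() values across the Gaussian resolution width (the exact refl1d kernel may differ), and shows how every such weight can round to exactly zero:

import numpy as np
from scipy.special import erf

# values quoted above for the first failing display point
Q0 = 0.006631848473064208                  # smallest "interpolated" Q
sigma = (0.006934001370634783 - Q0) / 3.5  # back out sigma from Q0 + 3.5*sigma
Q_near = 0.007895670532999092              # nearest calc_Q support point

print((Q_near - Q0) / sigma)               # ~14.6 sigma away from the display point

# illustrative bin edges around the nearest support point (assumed, for the sketch)
lo, hi = Q_near - 5e-4, Q_near + 5e-4
# beyond ~6 sigma, erf() saturates to exactly 1.0 in double precision, so the
# bin weight is 1.0 - 1.0 = 0.0; normalizing by a sum of such weights is the
# division by zero reported in this issue
w = erf((hi - Q0) / (np.sqrt(2) * sigma)) - erf((lo - Q0) / (np.sqrt(2) * sigma))
print(w)                                   # 0.0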

It might be that the issue for you is that the extension of Q-values in _interpolate_Q is based on the spacing of probe.Q (the data points) and does not take the resolution into account. I have to admit that I don't understand how _interpolate_Q is supposed to work - for instance, why the Q-range is set to be

# Extend the Q-range by 1/2 interval on either side
Q = np.hstack((0.5*(3.*Q[0]-Q[1]), Q, 0.5*(3.*Q[-1]-Q[-2])))

I think maybe we should base the interpolation on calc_Q (and dQ) instead of on Q and the step size in Q, since the convolution will be done over calc_Q (with width = resolution).

bmaranville commented 2 years ago

Overall we will probably be better off determining a sufficient basis (density of calc_Q values) to cover all the data Q values, and using that basis (instead of doing interpolation) to show calculation values between the data points. It would mean that you might have noticeably different spacings for your "smooth" curve in different Q-regions, depending on how granular the definition of the basis is (10 segments of different Q-spacings? 3 segments?), but the spacing you get will be (by design) sufficient to capture the resolution-smeared features throughout.
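
A rough sketch of that idea follows; the segmenting and the density (a few calculation points per resolution width) are arbitrary choices for illustration, not an existing refl1d function:

import numpy as np

def resolution_basis(Q, dQ, points_per_sigma=4):
    # Illustrative only: build a calc_Q basis whose local spacing is set by
    # the local resolution width, so it is dense enough to capture
    # resolution-smeared features in every Q-region.
    segments = []
    for q0, q1, w0, w1 in zip(Q[:-1], Q[1:], dQ[:-1], dQ[1:]):
        step = min(w0, w1) / points_per_sigma
        n = max(int(np.ceil((q1 - q0) / step)), 1)
        segments.append(np.linspace(q0, q1, n + 1)[:-1])
    segments.append([Q[-1]])
    return np.concatenate(segments)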

hoogerheide commented 2 years ago

It sounds like the issue is one of extrapolation. The easiest thing to do would be to extend the Q-range by no more than 3.5 * sigma, or, better yet, not to allow extrapolation at all!
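
For the "no more than 3.5 * sigma" version, a minimal sketch (illustrative names, not a refl1d API) would just clip the display grid to the range the calculation basis can actually support:

import numpy as np

def clamp_display_q(Q_display, calc_Q, calc_dQ, nsigma=3.5):
    # Illustrative only: forbid extrapolating the display points more than
    # nsigma resolution widths past the ends of the calc_Q basis.
    return np.clip(Q_display,
                   calc_Q[0] - nsigma * calc_dQ[0],
                   calc_Q[-1] + nsigma * calc_dQ[-1])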

Alternatively, why not allow probe.reflectivity() to optionally accept a custom vector of calc_Q values? This should get around the different spacings. If you need a resolution value, use an interpolated dQ value, and a fixed dQ if you go off the end.
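
For the resolution part of that suggestion, a one-line sketch: np.interp holds the end values constant outside the data range, which gives exactly the "fixed dQ if you go off the end" behavior (the helper name here is hypothetical, not a refl1d API):

import numpy as np

def dq_for_custom_basis(calc_Q_custom, Q_data, dQ_data):
    # Illustrative only: interpolate dQ onto a user-supplied calc_Q vector;
    # outside [Q_data[0], Q_data[-1]] the end values of dQ_data are held fixed.
    return np.interp(calc_Q_custom, Q_data, dQ_data)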

bmaranville commented 2 years ago

You will still have the issue that your supplied calc_Q values have to be sufficiently closely-spaced to support the interpolated Q values. It's complicated - you're creating a set of display values (based on interpolation of the data Q values) and then you have to create a basis set of calc_Q that can be convolved with the (also interpolated) resolution widths to provide a smooth curve.

hoogerheide commented 2 years ago

I think there are two issues:

  1. The one you just mentioned: this ambiguous relationship between the interpolated points and the calc_Q points.
  2. The edge effects, where it simply doesn't work if the first two Q points are spaced too far apart relative to the resolution.

I would suggest that we try to solve the second one at the very least, using one of the suggestions above, because it is a bug. Then we can think about the first one, which is more of a complication than a bug, and which can, at least for now, be "solved" using ridiculous oversampling.
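
For reference, the "ridiculous oversampling" workaround looks something like this with the reproduction script above; I believe the Probe class exposes oversample(n=...), but treat the exact call and the value 200 as assumptions rather than a tuned recipe:

# brute-force workaround: make the calc_Q basis dense everywhere before evaluating
probe.oversample(n=200)
problem.setp([0])
Qth, Rth = problem.fitness.reflectivity(interpolation=100)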