SasView / sasmodels

Package for calculation of small angle scattering models using OpenCL.
BSD 3-Clause "New" or "Revised" License
15 stars 27 forks source link

DirectModel returns wrong array size #542

Open lucas-wilkins opened 1 year ago

lucas-wilkins commented 1 year ago

DirectModel returns data with the wrong size.

Here's a minimal example that should reproduce the bug...

import numpy as np

from sasmodels.core import load_model_info, build_model
from sasmodels.data import empty_data2D
from sasmodels.direct_model import DirectModel

model_info = load_model_info("parallelepiped")
model = build_model(model_info)
q = np.linspace(-0.5, 0.5, 129)
input_data = empty_data2D(q, q)

calculator = DirectModel(input_data, model)

output_data = calculator(
    theta=0.0,
    theta_pd=0.0,
    theta_pd_type="gaussian",
    theta_pd_n=1,
    phi=0.0,
    phi_pd=0.0,
    phi_pd_type="gaussian",
    phi_pd_n=1,
    psi=0.0,
    psi_pd=0.0,
    psi_pd_type="gaussian",
    psi_pd_n=1,
    a=1.0,
    b=0.4,
    c=0.2,
    background=0.1)

print(input_data.data.shape) # 16641 = 129x129
print(output_data.shape)      # 16640 = 129x129 - 1
lucas-wilkins commented 1 year ago

Seems it is for odd numbers > 1

lucas-wilkins commented 1 year ago

OK, its that (0,0) gets excluded silently

pkienzle commented 1 year ago

More generally, the model is only computed on the non-masked inputs, with an implicit masking on 2-D data in a circle of radius $10^{-16}$ around q=(0,0).

I do not remember why we are masking (0,0). This is a place where many models diverge but the calculator should be assigning $\infty$ or NaN in those situations. It may be an artifact from an earlier time when many models failed for (0,0), however the test suite now does smoke tests for (0,0) on each model so it may no longer be necessary to mask it.

The direct model object carries an index attribute containing the combined mask. For calculating the negative log likelihood then we should use result to compare with (I(q), ΔI(q)) at index.

TODO: Modify bumps_models.Experiment to use self.index when calculating residuals and plotting.

We could change the interface so that results is the same shape as inputs with NaN for the masked values. Since this is a breaking change we ought to include a flag to select the new behaviour. Or we could pretend that no third parties are using the calculator with masked data and silently change it.

My inclination is that we need to deal with masked inputs anyway (VSANS overlap region needs to be excluded) so leave things as they are. We don't gain much usability by doing the change and we have a small performance penalty compared to returning the unstructured vector.

TODO: Make sure whatever behaviour we choose is documented.

pkienzle commented 11 months ago

Here is code to plot the returned value from the above example:

from copy import copy
from sasmodels.data import plot_theory

resid = (output_data - calculator.Iq)/calculator.dIq
data = copy(calculator._data)
data.mask = ~calculator.index
plot_theory(data, output_data, resid)

The current bumps_model.Experiment.plot doesn't work for this data without this fix, so we should either use the mask provided by the data and not check for bad values, or we should make the stored data object public with the updated mask.