scipp / essreflectometry

Reflectometry data reduction for the European Spallation Source
https://scipp.github.io/essreflectometry/
BSD 3-Clause "New" or "Revised" License
0 stars 1 forks source link

feat: add basic stitching procedure #75

Closed jokasimr closed 2 weeks ago

jokasimr commented 1 month ago

Fixes #73

jokasimr commented 1 month ago

Before stitching: reflectivity

After stitching: reflectivity-stich

After combining: combined-curves

jokasimr commented 1 month ago

Not sure why the docs build fails. It's something to do with the math-text in axis labels:

File ~/work/essreflectometry/essreflectometry/.tox/docs/lib/python3.10/site-packages/matplotlib/_mathtext.py:2173, in Parser.parse(self, s, fonts_object, fontsize, dpi)
   2170     result = self._expression.parseString(s)
   2171 except ParseBaseException as err:
   2172     # explain becomes a plain method on pyparsing 3 (err.explain(0)).
-> 2173     raise ValueError("\n" + ParseException.explain(err, 0)) from None
   2174 self._state_stack = []
   2175 self._in_subscript_or_superscript = False

ValueError: 
$\mathdefault{10^{-2}}$
^
ParseException: Expected end of text, found '$'  (at char 0), (line:1, col:1)

It almost seems non-deterministic because it failed earlier, then did not fail, then failed again, and I can't see a reason why.

Have you seen this before @nvaytet ?

jokasimr commented 1 month ago

Could you copy/paste in the PR description the procedure or algorithm that was implemented? It would make it easier in the future to remember what was done in the PR without having to look at the diff ;-)

At a high level the idea is to assume the reflectivity values in the curves we have are normal distributed around the true reflectivity scaled by some unknown factors that depend on what curve the reflectivity value belongs to. This lets us set up a minimization problem for the scale parameters maximizing the likelihood of the data. What complicates things is that the cost function of the minimization problem involves the true reflectivity. To get around that the true reflectivity is replaced by an estimate obtained by assuming the current scale parameter estimates are correct.

Pseudo code:

def likelihood(scale_parameters_estimate):
    R = estimate_R(scale_parameters_estimate)
    return likelihood_given_true_reflectivity(R, scale_parameters_estimate)

scale_parameters = minimize(likelihood, initial_guess=(1, 1, 1...))

Description

We have $N$ Q-points and $C$ reflectivity curves. For the $c$:th curve we have estimates $R_{nc}$ of the true reflectivity $R_n^{\ast}$ at the $n$:th point. Model the relationship between the estimates and the true values like

    s_c (R_{nc} + \sigma_{nc} \epsilon)  = R_n^{\ast}, \quad \epsilon \in \mathcal{N}

where $sc$ are unknown scale factors, and $\sigma{nc}$ is the standard deviation of the reflectivity estimate at that point, $\epsilon$ is the error and $\mathcal{N}$ is the normal distribution.

Estimating $s_{c}$

Assuming a flat prior probability of the scale factors, we obtain the maximum likelihood estimate

(\hat{s}_1,\ldots, \hat{s}_C) = \mathrm{Argmax}_{s_1,\ldots,s_C} \ P(R_{11}, \ldots, R_{NC}| s_{1},\ldots,s_{C} )
\implies
(\hat{s}_1,\ldots, \hat{s}_C) = \mathrm{Argmin}_{s_1,\ldots,s_C} \ 
\sum_{n,c} \frac{(R_n^{\ast} - s_c R_{nc})^2}{s_{c}^2 \sigma_{nc}^2}.

In practice we do not know the values of $R_n^{\ast}$, so we can not compute the cost function in the minimization problem. We could include it among the parameters to minimize for, but that would increase the number of parameters by a lot (because $N>>C$).

Another approach is to estimate $R_n^{\ast}$ while assuming some fixed values for the scale parameters. Then using the estimates $\hat{R}_n^{\ast}$ instead of $R_n^{\ast}$ in the cost function above we can evaluate the cost and minimize for $s_c$.

Estimating $R_n^{\ast}$ for known $s_c$

$R_n^{\ast}$ is estimated the same way the scaling parameters were estimated:

\hat{R}_i^{\ast} = \mathrm{Argmin}_{R_i^{\ast}} \ 
\sum_{n,c} \frac{(R_n^{\ast} - s_c R_{nc})^2}{s_{c}^2 \sigma_{nc}^2}
\implies
0 = \sum_{c} \frac{(\hat{R}_i^{\ast} - s_c R_{nc})}{s_{c}^2 \sigma_{nc}^2}
\iff
 \hat{R}^*_i= \frac{1}{\sum_c \frac{1}{s_{c}^2 \sigma_{nc}^2}} \sum_{c} \frac{s_{c} R_{nc}}{s_{c}^2 \sigma_{nc}^2}

This last expression is just an average of $sc R{nc}$ weighted by the variances.

jokasimr commented 2 weeks ago

Have to check if it appears in the docs