PyEllips / pyElli

An open source ellipsometry analysis tool for reproducible and comprehensible building of optical models.
https://pyelli.readthedocs.io
GNU General Public License v3.0

Error propagation #150

Open rugfri opened 1 year ago

rugfri commented 1 year ago

Somehow related to #94, I was wondering if you think it could be possible to implement the error propagation (on fitting parameters, experimental uncertainties if present/enabled) to the calculated quantities like dielectric function, refractive index, and such. I recently started using https://uncertainties-python-package.readthedocs.io/en/latest/, https://github.com/lebigot/uncertainties/ which may be useful, though I am not sure how far complex numbers/arrays are supported.

Thanks

domna commented 1 year ago

Hey @rugfri ,

it certainly is possible. @MarJMue and I have discussed it several times already, and we wanted to bring this feature in when we refactor the internal data model. Generally, it is not straightforward to obtain uncertainties from a fit. For lmfit the approach would be to scale the residuals by σ, but I'm not sure whether lmfit takes this into account for its covariance matrix / stderr values or only uses it to adapt the actual fit. lmfit also supports F-testing: https://lmfit.github.io/lmfit-py/confidence.html which may be a bit more robust here. If it's important/urgent for you, we could try to build an example together: do a manual fit and try to calculate confidence intervals from it.
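To make the σ-scaling concrete, here is a minimal NumPy sketch (not pyElli's or lmfit's API; the linear model and data are made up for illustration). Dividing the residuals, or equivalently the design matrix, by the per-point error σ is what makes chi-squared and the parameter covariance matrix come out in meaningful units:

```python
import numpy as np

# Hypothetical data: y = 2x + 1 with per-point error sigma = 0.05
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
sigma = np.full_like(x, 0.05)
y = 2.0 * x + 1.0 + rng.normal(0.0, sigma)

# Scale the linear least-squares problem by sigma: this is equivalent to
# returning (data - model) / sigma from an lmfit residual function.
A = np.column_stack([x, np.ones_like(x)]) / sigma[:, None]
b = y / sigma
params, *_ = np.linalg.lstsq(A, b, rcond=None)

# With correctly scaled residuals, the parameter covariance matrix is
# simply (A^T A)^-1 and its diagonal gives the stderr of each parameter.
cov = np.linalg.inv(A.T @ A)
stderr = np.sqrt(np.diag(cov))
```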

rugfri commented 1 year ago

Thanks @domna. By σ do you mean the experimental errors on Psi and Delta? It seems to me that a direct call to lmfit's minimize lets you keep things a bit more under control, since you provide the residuals yourself. Anyway, yes, I would be interested in pushing this forward. Thanks, let me know how best to proceed. I had one example script loaded in Nomad; maybe I can resume that and we can build on it?

domna commented 1 year ago

Thanks @domna. By σ do you mean the experimental errors on Psi and Delta?

yes

It seems to me that a direct call to lmfit's minimize lets you keep things a bit more under control, since you provide the residuals yourself. Anyway, yes, I would be interested in pushing this forward. Thanks, let me know how best to proceed. I had one example script loaded in Nomad; maybe I can resume that and we can build on it?

Sounds good. I see you already added a notebook there. I'll have a more detailed look at it soon

domna commented 1 year ago

Hey @rugfri

I added scaling of the residuals in the notebook. However, if I use these errors as weights, the fit does not properly converge to the data. If you want, have a look; maybe it's just a small mistake on my part.

I also removed the 680 nm point, as the error there was around 1000 (probably some device malfunction?), and tried to use normalized weights.

rugfri commented 1 year ago

Thanks @domna,

I think in the derivatives there is a "-" missing, since the derivative of exp(-jx) is -j*exp(-jx), which squared is -exp(-2jx). This would mean that the partial derivative of $\rho$ with respect to $\Delta$ has a "-" in front. This, however, causes even more problems with the imaginary part.
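A quick numerical check of that sign (a standalone sketch, assuming the common convention ρ = tan(ψ)·exp(-jΔ); not code from the notebook):

```python
import numpy as np

def rho(psi, delta):
    # Ellipsometric ratio in the exp(-j*Delta) convention (assumed here)
    return np.tan(psi) * np.exp(-1j * delta)

psi, delta, h = 0.5, 0.3, 1e-6
# Central finite difference of rho w.r.t. Delta ...
numeric = (rho(psi, delta + h) - rho(psi, delta - h)) / (2 * h)
# ... compared against the analytic derivative with the leading "-"
analytic = -1j * rho(psi, delta)
print(np.isclose(numeric, analytic))  # -> True
```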

Otherwise looks everything correct. I will check again tomorrow.

I tried to derive an analytical expression for the error with real and imaginary parts separated, but I couldn't get anything usable so far.

Yes, I also noticed the large error on some points. The instrument seems to have some problems, since it is systematic.

domna commented 1 year ago

Thanks @domna,

I think in the derivatives there is a "-" missing, since the derivative of exp(-jx) is -j*exp(-jx), which squared is -exp(-2jx). This would mean that the partial derivative of $\rho$ with respect to $\Delta$ has a "-" in front. This, however, causes even more problems with the imaginary part.

You are completely right. The problem now is that we are using NumPy's "normal" sqrt, which does not return complex numbers for negative inputs. I changed it to numpy.lib.scimath.sqrt.
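For reference, the difference between the two square roots (plain NumPy behavior, independent of the notebook):

```python
import numpy as np

with np.errstate(invalid="ignore"):
    print(np.sqrt(-1.0))          # -> nan (plain sqrt stays real-valued)
print(np.lib.scimath.sqrt(-1.0))  # -> 1j (promotes to complex for negative input)
```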

However, the fit is still not converging to the measured data 🤔

Otherwise looks everything correct. I will check again tomorrow.

I tried to derive an analytical expression for the error with real and imaginary parts separated, but I couldn't get anything usable so far.

Yes, I also noticed the large error on some points. The instrument seems to have some problems, since it is systematic.

Edit: OK, I found the problem. We need to scale the weights by the variance and not the stderr. Hence, we need to square the errors before we scale the residuals. I added this in the notebook and the fit converges now. I think we also need to use least-squares solvers exclusively for this, as L-BFGS-B uses a gradient-based approach which cannot really deal with weights (it also does not converge when I use the weights).
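In notebook terms the change amounts to something like this (variable names are hypothetical, mirroring the discussion above):

```python
import numpy as np

err_rho = np.array([0.01, 0.02, 0.05])  # hypothetical per-point stderr of rho
variance = err_rho ** 2                 # square the errors first ...
weights = 1.0 / variance                # ... then weight the residuals by 1/variance
```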

rugfri commented 1 year ago

Thanks!! I am not sure I see the change; maybe this: `err_rho = data_aoi['err_rho'].to_numpy() ** 2`?

Anyway, the fit message and other attributes confirm convergence, and so does the plot :-). Though chi2 and redchi are quite a bit higher than I expected.

Now, my original question was also about the best way to propagate errors to the dielectric function and the refractive index. I assume you would go along the same lines as for rho, correct? Thanks, Ruggero

domna commented 1 year ago

Thanks!! I am not sure I see the change; maybe this: `err_rho = data_aoi['err_rho'].to_numpy() ** 2`?

Exactly

Anyway, the fit message and other attributes confirm convergence, and so does the plot :-). Though chi2 and redchi are quite a bit higher than I expected.

Yeah, I don't know how reliable they are, i.e., whether they scale with the absolute magnitude of the errors. If they do, they would be strongly influenced by the normalization we currently do (and I think the fit did not work without it). I think for really reliable results one has to calculate the confidence intervals according to https://lmfit.github.io/lmfit-py/confidence.html

Now, my original question was also about the best way to propagate errors to the dielectric function and the refractive index. I assume you would go along the same lines as for rho, correct? Thanks, Ruggero

I'm not sure how to propagate these errors for multiple parameters. For one parameter I would just evaluate at param + error and param - error and use the difference of the resulting values w.r.t. the central values. For N parameters we would need to calculate 2^N additional psi/delta evaluations and extract the largest, smallest, or average error from them (I'm not sure what the best choice would be). This should be feasible for your data (4 varied parameters, thus 16 evaluations), but to be usable in production we would probably need some shortcuts here; I need to read up on the theory.
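The brute-force 2^N corner evaluation could be sketched like this (a minimal illustration; the model, function names, and parameter values are made up, not pyElli code):

```python
import itertools
import numpy as np

def corner_envelope(model, params, errors, x):
    """Evaluate model(x, p) at all 2**N corners params +/- errors and
    return the largest deviation from the central curve at each x."""
    center = model(x, params)
    deviations = [
        np.abs(model(x, params + np.asarray(signs) * errors) - center)
        for signs in itertools.product((-1.0, 1.0), repeat=len(params))
    ]
    return np.max(deviations, axis=0)

# Toy model y = a*x + b with errors on a and b (4 corner evaluations)
model = lambda x, p: p[0] * x + p[1]
x = np.linspace(0.0, 1.0, 5)
env = corner_envelope(model, np.array([2.0, 1.0]), np.array([0.1, 0.05]), x)
# For this linear model the envelope grows from 0.05 at x=0 to 0.15 at x=1
```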

domna commented 1 year ago

I found a paper which deals with this issue: https://doi.org/10.1016/j.optcom.2018.07.025 (I haven't read it yet, but I think it's exactly what we want).

rugfri commented 1 year ago

Thanks! Yes, I haven't finished reading it yet, but it seems to be the one. Let me know how you want to proceed and if I can do anything.

domna commented 1 year ago

Hey,

I think I extracted the main parts and the changes we need (from Is and Ic to rho). I think from this I could do an implementation. However, I contacted the author of the paper and asked if he is willing to share his implementation and reference data, which would be extremely helpful. So I'd wait until next week to see whether I hear something back (I think France also has a strong holiday season, so it might take a while...).

domna commented 1 year ago

Hey,

I did not get any answer, so we might just have to do an implementation ourselves. I'll compile my notes here or in our testing notebook soonish and then we can start implementing something. (I'll be going on holiday from the middle of next week, however, and will probably not be able to do an implementation before then. But I'll at least try to write everything down first.)

domna commented 1 year ago

Hey @rugfri,

this is what I extracted from the paper:

$\Delta n_k = \sum_{k'=0}^{K} \left| - J_n(\lambda_k) \cdot \text{Cov} \cdot C(\lambda_{k'})\right| \Delta \Re(\rho(\lambda_{k'}))$

where $J_n(\lambda_k)$ is the Jacobian of the optical model function w.r.t. n, i.e., the Jacobian of the function for the layer we want to extract the dispersion for. $\text{Cov}$ is the covariance matrix from the fit, and $k$ and $k'$ are wavelength indices. $C$ is the perturbation vector of chi-squared, with elements

$\frac{\partial \chi^2}{\partial p_i \, \partial \Re(\rho(\lambda_{k'}))} = - \frac{2}{K} \frac{\partial \Re(\rho)}{\partial p_i} (\lambda_{k'})$

where $K$ is the total number of (wavelength) points. The r.h.s. is the partial derivative of the total model function with respect to parameter $p_i$. So it is basically the Jacobian of the total model function, but only for the parameters relevant in the Jacobian of the optical model function (i.e., the Jacobian above).

We should be able to calculate the Jacobians with numdifftools, the covariance matrix we get from lmfit, and calculating C should be straightforward too (i.e., calculating the Jacobian and selecting the right elements).

I think this should be pretty straightforward to implement like this. Let me know what you think and if you can spot any mistakes here.

Edit: One thing we also need to do is reduce the covariance matrix from lmfit to the appropriate dimension, i.e., only the parameters used by the dispersion model. Alternatively, we just add zeros to the Jacobian $J_n$ for all additional parameters; then we can keep Cov and C as they are.
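If I read the formula correctly, the whole propagation is just a few matrix products. A hedged sketch (shapes are my assumptions: K wavelength points, P fit parameters; in practice the Jacobians would come from numdifftools and Cov from lmfit):

```python
import numpy as np

def propagate_delta_n(jac_n, cov, jac_model, delta_re_rho):
    """Sketch of the error-propagation sum above (assumed shapes):

    jac_n:        (K, P) Jacobian of n w.r.t. the parameters, at each lambda_k
    cov:          (P, P) covariance matrix from the fit
    jac_model:    (K, P) Jacobian of Re(rho) w.r.t. the parameters, at lambda_k'
    delta_re_rho: (K,)   experimental error of Re(rho) at each lambda_k'
    """
    K = jac_model.shape[0]
    C = -2.0 / K * jac_model         # perturbation vector C(lambda_k')
    M = -jac_n @ cov @ C.T           # (K, K): -J_n(lambda_k) Cov C(lambda_k')
    return np.abs(M) @ delta_re_rho  # sum over k' of |...| * Delta Re(rho)

# Tiny check with K=2, P=1 and an identity covariance matrix
dn = propagate_delta_n(
    jac_n=np.array([[1.0], [2.0]]),
    cov=np.eye(1),
    jac_model=np.array([[1.0], [1.0]]),
    delta_re_rho=np.array([1.0, 1.0]),
)
```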

rugfri commented 1 year ago

Thanks @domna! I think it is correct. Yes, I agree that the derivatives should come through numdifftools; after all, as far as I know, lmfit also uses numdifftools. Reducing $\mathrm{Cov}$ and $C$ seems more efficient, especially if the non-relevant parameters are moved to the end.

I am just back from vacation and will double-check again from tomorrow. Enjoy your vacation!

domna commented 11 months ago

Hey @rugfri,

I'm back and would like to give an implementation a shot. Did you already implement something I can build on? I also heard back from the author and he sent me the data from the paper, so we can use it to check our results (but we somehow have to convert it to Is and Ic, or we do the fitting directly in Is and Ic, which might be the better choice for the start).

rugfri commented 11 months ago

Hi @domna, hope you had a good time! No, sorry, I didn't. Good news about Gilliot. Yes, I agree, using Is and Ic is better even though it can be more work. Let me know if I can do anything. Maybe I can draft the error propagation for Is and Ic along the same lines as for rho? Then we can see if we end up with the same results :-)

domna commented 11 months ago

Hi @domna, hope you had a good time!

Indeed it was very nice

No, sorry, I didn't. Good news about Gilliot. Yes, I agree, using Is and Ic is better even though it can be more work.

I already started to add Is/Ic fitting and could fit the data from the paper. I shared a new upload in Nomad with you, which contains the data and my fitting, so we can work together there to deduce the errors.

Let me know if I can do anything. Maybe I can draft the error propagation for Is and Ic along the same lines as for rho? Then we can see if we end up with the same results :-)

This should be straightforward; we just need to calculate the Jacobian of Is and Ic in C. As soon as we are happy, we can simply replace Is and Ic with rho if we want to do rho-based fitting (we should actually get the same errors in n, so it's a good check).

rugfri commented 11 months ago

Hi @domna, thanks. It looks correct to me. As I understand it, you also calculated the Jacobian of the model; one should now reduce the covariance matrix, calculate C, and propagate (first equation above). Correct?

domna commented 11 months ago

Hi @domna, thanks. It looks correct to me. As I understand it, you also calculated the Jacobian of the model; one should now reduce the covariance matrix, calculate C, and propagate (first equation above). Correct?

Yes, I already do this in the notebook under "Error propagation" (the `calc_sum` function multiplies the Jacobians and the covariance matrix). However, the errors I'm getting are very small (~10^-10), so there still seems to be something wrong, but I have not found the problem yet.

rugfri commented 11 months ago

Oh, yes, sorry. Nice, it all seems good to me. Do we get the variance or sigma out of `calc_sum`? (Likely not fixing the issue, though, since 10^-10 would still be very small.) Also, I am not sure what the 0.005 factor in the I*_err expressions is?

domna commented 11 months ago

Oh, yes, sorry.

No worries :)

Nice, it all seems good to me. Do we get the variance or sigma out of `calc_sum`? (Likely not fixing the issue, though, since 10^-10 would still be very small.)

From the paper it should be $\Delta n$, so the standard error (sigma) of the quantity.

Also, I am not sure what the 0.005 factor in the I*_err expressions is?

It is the error of Is/Ic in the experimental data. It's constant for every data point.

I actually think my formula is not completely right. All uncertainties have to be taken into account, not only the ones which are parameters of n (otherwise a $\Delta n$ contribution from the substrate, $\Delta n_s$, would make no sense), so I think I only have to reduce the dimension along one axis of the covariance matrix. Additionally, I currently don't use any error for the substrate refractive index in the calculation.

domna commented 9 months ago

Sorry @rugfri, I did not have much time to work on this in the last month(s) and probably won't have it in the next weeks either (I still plan to sit down and find my error in the formula, though). Is this still of interest to you?

rugfri commented 9 months ago

Hi @domna, no problem! Yes, it is still of interest to me. I have also been pulled away from this topic. I only had time to find an old code of mine where I was doing something similar, but I didn't get much beyond that. In case I find something useful I will post it here.
In any case, take your time :-) Thanks!