NNPDF / nnusf

An open source machine learning framework that provides predictions for all-energy neutrino structure functions.
https://nnpdf.github.io/nnusf/
GNU General Public License v3.0

Treat yadism matching data as level-2 closure test data #56

Closed RoyStegeman closed 2 years ago

RoyStegeman commented 2 years ago

This accounts for the fact that unlike experimental data, matching data doesn't contain sampling fluctuations. With this implemented the spread of both matching and experimental data is of similar scale/order.
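For readers less familiar with closure-test terminology, here is a minimal sketch of the level-0/1/2 distinction. It assumes uncorrelated Gaussian uncertainties (the actual code may well use a full covariance matrix) and made-up numbers; `add_gaussian_noise` is a hypothetical helper, not a function from nnusf.

```python
import random


def add_gaussian_noise(central, sigma, rng):
    """Shift each point by a Gaussian draw with the given standard deviation."""
    return [c + rng.gauss(0.0, s) for c, s in zip(central, sigma)]


rng = random.Random(0)

# Level-0: exact theory values (hypothetical matching predictions) and
# their assumed, uncorrelated uncertainties.
level0 = [1.00, 0.80, 0.65, 0.50]
sigma = [0.05, 0.04, 0.04, 0.03]

# Level-1: one fixed fluctuation, shared by all replicas. It plays the role
# of the sampling fluctuation present in real experimental central values.
level1 = add_gaussian_noise(level0, sigma, rng)

# Level-2: each Monte Carlo replica fluctuates again, around the level-1 data.
n_replicas = 3
level2 = [add_gaussian_noise(level1, sigma, rng) for _ in range(n_replicas)]
```

With only the level-2 step applied directly to `level0`, the replicas would scatter around the exact theory values, which is the situation this PR moves away from.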

Resolves #55

Radonirinaunimi commented 2 years ago

@RoyStegeman Thanks for this! I will run a fit a bit later to check how much this would change the fit.

Radonirinaunimi commented 2 years ago

The fit results are available here https://data.nnpdf.science/NNUSF/reports/addnoise-matching-1c526f9-221010/output/

PS: There are now new entries in the summary table: expr, the $\chi^2_{\rm exp}$ for the real data, and expt, the $\chi^2_{\rm exp}$ for the combined real and matching data.

RoyStegeman commented 2 years ago

Thanks @Radonirinaunimi. I get the impression that we're underfitting a bit (at least in some regions): 1) the <chi2>/N of the matching data is still very small, suggesting that the fit is too smooth and therefore the data uncertainties are not properly propagated to the fit, and 2) the expr chi2 is considerably larger than 1, suggesting the data are not properly reproduced.

In particular, point 2 is not a very strong argument since I don't know what we can reasonably obtain, but it may still be interesting to see if a more aggressive optimization could improve things. Do you maybe have plots of the SF predictions? Those might give us some intuition as to whether we are indeed underfitting.

Radonirinaunimi commented 2 years ago

My guess on what might be happening is that the fit is more biased towards the Yadism datasets; that is, if one were to add weights to the real data, the $\chi^2_{\rm real}$ would improve.

We do indeed have plots of the SF predictions, I will post them in the slack, but the plots look very similar to before.
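The weighting idea mentioned in this comment could look roughly like the sketch below. This is purely illustrative: the dataset names and chi2 values are made up, and `weighted_loss` is a hypothetical helper, not how nnusf builds its loss.

```python
def weighted_loss(chi2_by_dataset, weights):
    """Sum per-dataset chi2 values, scaling each by its weight (default 1)."""
    return sum(
        weights.get(name, 1.0) * chi2 for name, chi2 in chi2_by_dataset.items()
    )


# Made-up per-dataset chi2 values, for illustration only.
chi2_by_dataset = {"real_A": 120.0, "real_B": 95.0, "matching_yadism": 30.0}

# Default: every dataset enters the loss with weight 1.
unweighted = weighted_loss(chi2_by_dataset, {})

# Up-weighting the real datasets makes their mismatch cost more, so the
# optimizer is pushed to describe them better relative to the matching data.
upweighted = weighted_loss(chi2_by_dataset, {"real_A": 2.0, "real_B": 2.0})
```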

RoyStegeman commented 2 years ago

> My guess on what might be happening is that the fit is more biased towards the Yadism datasets; that is, if one were to add weights to the real data, the $\chi^2_{\rm real}$ would improve.

This could also be the case. However, while the exp chi2 of the matching data is much better than that of the real data, this is not the case (at least not to the same extent) for the training chi2. I think that to answer the question of whether the fit is biased towards yadism data, the training chi2 is more relevant than the exp chi2.

> We do indeed have plots of the SF predictions, I will post them in the slack, but the plots look very similar to before.

Okay perfect. Curious to see what they look like. P.S. do you think we should start including SF replica plots as well?

Radonirinaunimi commented 2 years ago

> This could also be the case. However, while the exp chi2 of the matching data is much better than that of the real data, this is not the case (at least not to the same extent) for the training chi2. I think that to answer the question of whether the fit is biased towards yadism data, the training chi2 is more relevant than the exp chi2.

True! This then indicates that the fluctuation is somehow the main problem here.

RoyStegeman commented 2 years ago

> True! This then indicates that the fluctuation is somehow the main problem here.

Exactly, that's why I would be interested to know whether we are underfitting or not. As said before, in principle underfitting could explain both the poor chi2 of the experimental data (for obvious reasons) and, on the yadism side, the low chi2 of the central data, through a lack of fluctuations related to the fitting of the pseudodata replicas.

Just a hypothesis at this stage of course, but I think it's plausible.

Radonirinaunimi commented 2 years ago

> > True! This then indicates that the fluctuation is somehow the main problem here.
>
> Exactly, that's why I would be interested to know whether we are underfitting or not. As said before, in principle underfitting could explain both the poor chi2 of the experimental data (for obvious reasons) and, on the yadism side, the low chi2 of the central data, through a lack of fluctuations related to the fitting of the pseudodata replicas.
>
> Just a hypothesis at this stage of course, but I think it's plausible.

I will explore this by running various fits; so far we don't have any other plausible causes or solutions.

Radonirinaunimi commented 2 years ago

https://github.com/NNPDF/nnusf/blob/5db6400803e237b6a4f0a439537d378a1af2d32a/src/nnusf/sffit/load_data.py#L59-L66

Thanks! This should now treat the pseudodata on exactly the same footing as the real datasets. I will run a fit and check (unless our cluster is again polluted by the ATLAS guy...).

Radonirinaunimi commented 2 years ago

In 5b3bd29, I just moved the addition of the L1 noise to the data.loader module to make it easier to compute $\chi^{2, \rm match}_{\rm exp}$ later, since we now need to compute the $\chi^2$ with respect to the L1-fluctuated data.

RoyStegeman commented 2 years ago

If we're confident this is what we want to do, I guess this can be merged?

Radonirinaunimi commented 2 years ago

> If we're confident this is what we want to do, I guess this can be merged?

With the couple of replicas already done (the fit is not fully complete yet), the results are as we'd expect: $\chi^{2, \rm match}_{\rm exp} \sim 1$. So, yes, this is exactly what we wanted to do.
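As a toy illustration of why a value near 1 is expected once the $\chi^2$ is computed with respect to L1-fluctuated data, consider a prediction that reproduces the underlying truth exactly. The sketch assumes uncorrelated Gaussian uncertainties and made-up numbers; `chi2_per_point` is a hypothetical helper, not the nnusf implementation.

```python
import random


def chi2_per_point(prediction, data, sigma):
    """Chi2 divided by the number of points, for uncorrelated uncertainties."""
    total = sum(((p - d) / s) ** 2 for p, d, s in zip(prediction, data, sigma))
    return total / len(data)


rng = random.Random(1)
n_points = 2000

# Made-up "truth" and uncertainties; the prediction is taken to equal the truth.
truth = [1.0] * n_points
sigma = [0.1] * n_points

# Level-1 data: the truth shifted once by Gaussian noise of width sigma.
level1 = [t + rng.gauss(0.0, s) for t, s in zip(truth, sigma)]

# Against unfluctuated data a perfect prediction gives exactly zero.
chi2_vs_level0 = chi2_per_point(truth, truth, sigma)

# Against level-1 data the same prediction gives a value close to 1,
# up to statistical fluctuations of order sqrt(2 / n_points).
chi2_vs_level1 = chi2_per_point(truth, level1, sigma)
```

This is also why a very small $\chi^2$ on unfluctuated matching data is not directly comparable to the $\chi^2$ of real data, whose central values always carry a sampling fluctuation.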

RoyStegeman commented 2 years ago

Okay good. Well, my reason for merging is that, from a purely methodological point of view, this is what we agreed to do, whether we get chi2~1 or not. If it turns out that there are other problems and this does not produce the desired results, we can address those problems in a separate PR (and keep the discussions organized).

Radonirinaunimi commented 2 years ago

Finally, here is the result of the fit: https://data.nnpdf.science/NNUSF/reports/addl1noise-matching-e774de7-221018/output/. The results are now as we'd expect. On the data comparison plots, the Yadism datapoints are now the fluctuated ones.

Yes, I agree that we should merge this now; any minor changes can be made separately.