NNPDF / nnusf

An open source machine learning framework that provides predictions for all-energy neutrino structure functions.
https://nnpdf.github.io/nnusf/
GNU General Public License v3.0

Potential issue with (quoted) Validation chi2 for CDHSW_F2, CHORUS_F2, NUTEV_F2 #44

Closed Radonirinaunimi closed 1 year ago

Radonirinaunimi commented 2 years ago
There seem to be some issues with the (quoted) validation $\chi^2$ values for the CDHSW_F2, CHORUS_F2, and NUTEV_F2 datasets. As an example, a fit including only NUTEV_F2 (with a single replica) yields the following results:

| Dataset | Epoch | REP ID | $\rm{N}_{\rm tr}$ | $\chi^2_{\rm tr}$ | $\rm{N}_{\rm vl}$ | $\chi^2_{\rm vl}$ | $\rm{N}_{\rm tot}$ | $\chi^2_{\rm exp}$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NUTEV_F2 | 2316 | 35 | 58 | 1.847 | 20 | 922.098 | 78 | 8.721 |

There are $\rm{N}_{\rm tot} = 78$ data points because no maximum $Q^{2}$ cut is imposed. The 20 data points forming the validation set are listed in the table below (with $Q^2_M$ the mapped values), followed by the data vs. prediction comparisons for all the slices in $x$.

| $x$ | $Q^2_M~([0, 1])$ |
| --- | --- |
| 0.015 | 0 |
| 0.015 | 0.051 |
| 0.015 | 0.206 |
| 0.045 | 0.051 |
| 0.125 | 0.323 |
| 0.125 | 0.466 |
| 0.175 | 0.115 |
| 0.175 | 0.323 |
| 0.175 | 0.466 |
| 0.175 | 0.609 |
| 0.275 | 0.206 |
| 0.275 | 0.856 |
| 0.35 | 0.466 |
| 0.45 | 0.609 |
| 0.45 | 0.739 |
| 0.45 | 0.856 |
| 0.55 | 0.609 |
| 0.55 | 0.739 |
| 0.65 | 0.466 |
| 0.65 | 0.739 |

[Figures: prediction_data_comparison_0 to prediction_data_comparison_11, data vs. prediction comparisons for each slice in $x$]

As shown above, there is no reason for the validation $\chi^2$ to be this large, given that no prediction is (significantly) far away from the true validation points.

juanrojochacon commented 2 years ago

This seems to be a bug - else how come the chi2_val is so huge?

It may also be useful to plot the theory as a ratio to the data, to better identify this kind of discrepancy. But I don't see how one can get such a poor chi2, since the data in the validation subset agree rather well with the theory.
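
The ratio check suggested above can be sketched in a few lines. This is only an illustration: the function name `ratio_to_data` and all numbers are made up, not taken from nnusf or from the NUTEV_F2 data, and the uncertainties are treated as uncorrelated. Alongside the ratio it also computes the pull, (theory - data) / sigma, which is the quantity that actually enters the $\chi^2$.

```python
import numpy as np

def ratio_to_data(data, sigma, theory):
    """Return theory/data, the relative data uncertainty, and the pull per point.

    Hypothetical helper for illustration; not part of the nnusf code base.
    """
    data = np.asarray(data, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    theory = np.asarray(theory, dtype=float)
    ratio = theory / data                # 1.0 means perfect agreement
    rel_unc = sigma / np.abs(data)       # half-width of the data band around 1.0
    pull = (theory - data) / sigma       # deviation in units of the uncertainty
    return ratio, rel_unc, pull

# Made-up numbers for illustration only (not NUTEV_F2 values).
data = np.array([1.20, 1.10, 0.90, 0.50])
sigma = np.array([0.06, 0.05, 0.05, 0.04])
theory = np.array([1.23, 1.08, 0.93, 0.30])

ratio, rel_unc, pull = ratio_to_data(data, sigma, theory)
for r, u, p in zip(ratio, rel_unc, pull):
    print(f"theory/data = {r:.3f}  (data band: +/- {u:.3f})  pull = {p:+.2f}")
```

A point whose ratio sits well outside its data band (here the last one) stands out immediately, even when it looks unremarkable on an absolute scale.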

Radonirinaunimi commented 2 years ago

This is admittedly an odd issue. For a different replica, the results are given below. Notice how $\chi^2_{\rm exp}$ has improved significantly while $\chi^2_{\rm vl}$ is still (somewhat) poor.

| Dataset | Epoch | REP ID | $\rm{N}_{\rm tr}$ | $\chi^2_{\rm tr}$ | $\rm{N}_{\rm vl}$ | $\chi^2_{\rm vl}$ | $\rm{N}_{\rm tot}$ | $\chi^2_{\rm exp}$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NUTEV_F2 | 5559 | 1 | 58 | 0.659 | 20 | 11.721 | 78 | 1.893 |

In the same way as before, the 20 data points forming the validation set are shown in the table below:

| $x$ | $Q^2_M~([0, 1])$ |
| --- | --- |
| 0.015 | 0 |
| 0.015 | 0.115 |
| 0.015 | 0.323 |
| 0.080 | 0. |
| 0.125 | 0.115 |
| 0.125 | 0.466 |
| 0.125 | 0.609 |
| 0.125 | 0.739 |
| 0.175 | 0.206 |
| 0.175 | 0.856 |
| 0.225 | 0.466 |
| 0.275 | 0.609 |
| 0.275 | 0.856 |
| 0.35 | 0.466 |
| 0.45 | 0.323 |
| 0.45 | 0.974 |
| 0.55 | 0.609 |
| 0.55 | 0.856 |
| 0.55 | 0.934 |
| 0.65 | 0.974 |

[Figures: prediction_data_comparison_0 to prediction_data_comparison_11, data vs. prediction comparisons for each slice in $x$]

  1. The issue does not appear to be in the computation of $\chi^2_{\rm vl}$ (nor in how the results are presented in the report) since: (a) this artifact does not affect the other datasets, and (b) some replicas do better, with somewhat reasonable values of $\chi^2_{\rm vl}$. The problem is that most (over 90%) of the replicas for these datasets are bad.
  2. Based on the two results above, it indeed seems that even a small discrepancy between data & predictions can lead to a very large value of $\chi^2_{\rm vl}$. The question is: how is this possible?
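
One way to investigate the question above is to break the $\chi^2$ down into per-point contributions. The sketch below assumes the usual definition $\chi^2 = r^T C^{-1} r / N$, with $r$ the residual vector and $C$ the covariance matrix; the function name `chi2_breakdown` and all numbers are hypothetical and are not taken from the nnusf code or the NUTEV_F2 data. It shows that a single badly displaced point is enough to inflate the per-point $\chi^2$ of a 20-point set by orders of magnitude while the other 19 points agree well.

```python
import numpy as np

def chi2_breakdown(data, theory, covmat):
    """Return chi2 per data point and the contribution of each point to the total.

    Hypothetical helper for illustration; assumes chi2 = r^T C^{-1} r / N.
    """
    residual = np.asarray(theory, dtype=float) - np.asarray(data, dtype=float)
    weighted = np.linalg.solve(covmat, residual)   # C^{-1} r without inverting C
    contributions = residual * weighted            # these sum to r^T C^{-1} r
    return contributions.sum() / residual.size, contributions

# 20 made-up validation points with uncorrelated uncertainties.
n = 20
sigma = np.full(n, 0.05)
covmat = np.diag(sigma**2)
theory = np.ones(n)

# Data scattered within one standard deviation of the theory.
data = theory + sigma * np.linspace(-1.0, 1.0, n)
chi2_ok, _ = chi2_breakdown(data, theory, covmat)

# Same data, but one point displaced by 100 standard deviations.
data_bad = data.copy()
data_bad[7] = theory[7] + 100.0 * sigma[7]
chi2_bad, contrib = chi2_breakdown(data_bad, theory, covmat)

worst = int(np.argmax(contrib))
print(f"chi2/N without outlier: {chi2_ok:.3f}")
print(f"chi2/N with outlier:    {chi2_bad:.3f}")
print(f"largest contribution comes from point {worst}")
```

For scale, and again assuming uncorrelated uncertainties, the quoted $\chi^2_{\rm vl} = 922.098$ over 20 points corresponds to a total of about 18442, which a single point roughly 136 standard deviations away from the prediction would produce on its own.
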
Radonirinaunimi commented 2 years ago
So the problem was that some data points (in the above example, only a single point) were artificially large because their shifts were too large, owing to a bug that is fixed in #45. Now the results are reasonable (and the fit converges faster):

| Dataset | Epoch | REP ID | $\rm{N}_{\rm tr}$ | $\chi^2_{\rm tr}$ | $\rm{N}_{\rm vl}$ | $\chi^2_{\rm vl}$ | $\rm{N}_{\rm tot}$ | $\chi^2_{\rm exp}$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NUTEV_F2 | 567 | 35 | 58 | 1.171 | 20 | 2.7866 | 78 | 2.501 |