Closed Radonirinaunimi closed 1 year ago
Why not just to [-1,1]?
I momentarily lost sight of the fact that this should be merged.
> Why not just to [-1,1]?

Since we now use linear scaling for $Q^2$ and $A$, we can indeed try to re-scale everything to $[-1, 1]$. I can quickly run a fit to try this out.
Ok thanks, I'll have a look.
This is the report for a scaling to [-1, 1]: https://data.nnpdf.science/NNUSF/reports/221027-001/output/, to be compared with the [0, 1] one: https://data.nnpdf.science/NNUSF/reports/221025-scaling-001/output/. The overall $\chi^2$ values are slightly worse.
Hmm, I didn't really expect it to change anything. Rather, the reason I prefer [-1,1] is that it makes the input symmetric with respect to the activation function, so it simply seems like it should be the default way to do things.
Anyway, if I had to guess at a possible explanation: with the simple linear scaling, the Q points are denser towards small values of the input to the NN, so if we scale to [0,1], most points sit around the center of the activation function, where the gradient is largest.
After a simple "by eye" comparison of the data-prediction plots, it seems to me that indeed the small-Q points are more significantly impacted than the large-Q points, so that would support the above hypothesis.
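To make the hypothesis above concrete, here is a minimal sketch of how the gradient differs at the two places where the small-Q points would land. It assumes a `tanh` activation (the thread does not say which activation is used) and ignores the weights and biases that the first layer applies before the activation, so it is only an illustration of the argument, not of the actual network.

```python
import math


def tanh_grad(x):
    """Derivative of tanh(x), i.e. 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


# Assumption: small-Q points map close to the lower edge of the input range.
# With [0, 1] scaling that edge is 0, the center of tanh.
grad_center = tanh_grad(0.0)

# With [-1, 1] scaling the same points sit near -1 instead.
grad_edge = tanh_grad(-1.0)

print(f"gradient at  0: {grad_center:.3f}")
print(f"gradient at -1: {grad_edge:.3f}")
```

Under these assumptions the gradient at the lower edge of [-1, 1] is less than half of the gradient at the center, which is the effect the hypothesis relies on.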
> Anyway, if I had to guess at a possible explanation: with the simple linear scaling, the Q points are denser towards small values of the input to the NN, so if we scale to [0,1], most points sit around the center of the activation function, where the gradient is largest.

Yes, this is exactly the case. If you recall the scaling plot, this is exactly what we saw in the $Q^2$ distributions.
Yes, I know what the distribution of Q points looks like. Whether that is indeed the cause of the deterioration is a different story. It also seems as if NUTEV F2 might drive the deterioration.
> Yes, I know what the distribution of Q points looks like. Whether that is indeed the cause of the deterioration is a different story.

True! For the time being, I'd propose resetting this branch to 965724f and merging it; we can then investigate this in a new PR (?). The reason is that main is now really behind in terms of the reports and deliveries we produce.
So I guess this should just be merged?
> So I guess this should just be merged?

Yes, we can now merge this.
Simplify the scaling of the $Q^2$ and $A$ inputs by linearly re-mapping them to the interval $[0, 1]$.
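For reference, a linear re-mapping of this kind can be sketched as below. This is not the code from this PR: the `rescale` helper and the sample $Q^2$ values are made up for illustration, and the same function covers the $[-1, 1]$ variant discussed in the comments.

```python
def rescale(values, lower=0.0, upper=1.0):
    """Linearly map values so that min(values) -> lower and max(values) -> upper."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        raise ValueError("cannot rescale a constant input")
    span = vmax - vmin
    return [lower + (upper - lower) * (v - vmin) / span for v in values]


# Hypothetical Q^2 values, denser at the low end (not real data).
q2 = [1.0, 2.0, 4.0, 10.0, 101.0]

unit = rescale(q2)                   # mapped to [0, 1]
symmetric = rescale(q2, -1.0, 1.0)   # mapped to [-1, 1]

print(unit)
print(symmetric)
```

Because the map is linear, the relative spacing of the points is preserved: values that cluster at small $Q^2$ still cluster at the lower end of whichever interval is chosen.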