Open smrenna opened 1 year ago
If, in `prediction2YODA`, I make the following change, the histograms all line up:
```python
with open(fvals) as f:
    import json
    rd = json.load(f)
xmin = np.array(rd["__xmin"])
xmax = np.array(rd["__xmax"])
keys = list(rd.keys())
hids = np.array([b.split("#")[0] for b in keys])
```
However, the approximation file still does not give a good representation of the data, which makes me wonder whether there is also some mismatch in `app-build`. That said, it is not clear to me how the polynomial fit depends on the "x" values.
The problem is that `vals = app.AppSet(fvals)` sorts the approximations with `app.tools.sorted_nicely` after reading them from the fvals JSON, but the `__xmin`/`__xmax` arrays from the same JSON are loaded without applying that sorting.
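To illustrate the mismatch: a natural-sort helper in the spirit of `sorted_nicely` (this is a common recipe and only a sketch; the exact implementation in `app.tools` may differ) orders `#2` before `#10`, unlike the key order stored in the JSON file:

```python
import re

def sorted_nicely(seq):
    """Natural sort: embedded numbers compare numerically, not lexically."""
    convert = lambda s: int(s) if s.isdigit() else s.lower()
    key = lambda item: [convert(c) for c in re.split(r'(\d+)', item)]
    return sorted(seq, key=key)

# hypothetical bin ids in file order
keys = ["/ANA/hist#10", "/ANA/hist#2", "/ANA/hist#1"]
print(sorted_nicely(keys))  # ['/ANA/hist#1', '/ANA/hist#2', '/ANA/hist#10']
```

So the approximations end up in `#1, #2, #10` order while the bin edges stay in file order, and the two no longer line up index by index.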
The fix that works for version 1.0.7 from pip is to apply the same permutation to the bin edges:

```python
with open(fvals) as f:
    import json
    rd = json.load(f)
xmin = np.array(rd["__xmin"])
xmax = np.array(rd["__xmax"])
# permute the bin edges from file order into the sorted_nicely order used by vals
ids_nicesort = vals._binids
ids_likefile = [x for x in rd.keys() if not x.startswith("__")]
likefile2nicesort = [ids_likefile.index(x) for x in ids_nicesort]
xmin = xmin[likefile2nicesort]
xmax = xmax[likefile2nicesort]
```
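A toy version of that permutation (with hypothetical bin ids and edge values, not taken from the actual files) shows how the reindexing brings the edges into the sorted order:

```python
import numpy as np

ids_likefile = ["h#10", "h#2", "h#1"]   # order as read from the JSON
ids_nicesort = ["h#1", "h#2", "h#10"]   # order after natural sorting
xmin = np.array([10.0, 2.0, 1.0])       # edges stored in file order

# for each id in sorted order, find its position in the file order
likefile2nicesort = [ids_likefile.index(x) for x in ids_nicesort]
xmin_sorted = xmin[likefile2nicesort]
print(xmin_sorted)  # [ 1.  2. 10.]
```

Note that `list.index` is O(n) per lookup; for large bin counts a dict mapping id to position would be faster, but for typical histogram counts this is negligible.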
Hi @ojinoo, are you saying there is a version 1.0.7 (not in this repository, because I don't see that tag) that already has the fix in it? I had found a similar solution but had not committed anything yet. However, if there is another version out there, it would be good to synchronize it with the gitlab. Thanks.
@ojinoo This was my solution:

```diff
@@ -440,14 +441,21 @@ def prediction2YODA(fvals, Peval, fout="predictions.yoda", ferrs=None, wfile=Non
     hids=np.array([b.split("#")[0] for b in vals._binids])
-    hnames = sorted(set(hids))
+    # the following will remove duplicates but preserve the order
+    hnames=list(dict.fromkeys(hids))
     observables = sorted([x for x in set(app.io.readObs(wfile)) if x in hnames]) if wfile is not None else hnames
     with open(fvals) as f:
         rd = json.load(f)
     xmin = np.array(rd["__xmin"])
     xmax = np.array(rd["__xmax"])
+    # The order of the keys in the JSON read is not set
+    analysisIds=np.array([b.split("#")[0] for b in list(rd.keys())])
     DX = (xmax-xmin)*0.5
     X = xmin + DX
     Y2D = []
+    # X and Y are not guaranteed to be in the same order
     import yoda
+    start = 0
     for obs in observables:
-        idx = np.where(hids==obs)
-        P2D = [yoda.Point2D(x,y,dx,dy) for x,y,dx,dy in zip(X[idx], Y[idx], DX[idx], dY[idx])]
+        idx = np.where(analysisIds==obs)
+        strand = np.size(idx)
+        jdx = np.arange(start,start+strand)
+        start = start + strand
+        P2D = [yoda.Point2D(x,y,dx,dy) for x,y,dx,dy in zip(X[idx], Y[jdx], DX[idx], dY[jdx])]
         Y2D.append(yoda.Scatter2D(P2D, obs, obs))
     yoda.write(Y2D, fout)
```
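The key change in the first hunk is swapping `sorted(set(...))` for `dict.fromkeys(...)`: both deduplicate, but only the latter preserves the original encounter order (guaranteed for plain dicts since Python 3.7). A toy comparison with hypothetical histogram ids:

```python
hids = ["h2", "h2", "h1", "h1"]           # duplicates, in file order
print(sorted(set(hids)))                  # ['h1', 'h2']  - dedups but reorders
print(list(dict.fromkeys(hids)))          # ['h2', 'h1']  - dedups, keeps order
```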
But yours would be preferable if it does the same thing in less code.
Hi @smrenna, the fix is not in 1.0.7 (https://pypi.org/project/pyapprentice/1.0.7/; the newest version is https://pypi.org/project/pyapprentice/1.1.0/, but that is also buggy for me). It is my own solution.
I have now forked this github version and I'm also using it for my work, so that I can add bugfixes when I get inconsistencies. My fork is at https://github.com/ojinoo/apprentice.
Tuning/minimization appears to work, but the prediction YODA file has different properties than the reference data.
I am reproducing results from Julia Yarba, who is using root-extracted YODA files. I thought the issue was with the structure of these YODA files (the naming), but changing that did not make a difference.
I can share the data and MC runs (they are not large); just say where.
Here is an example of the comparisons.