acampove closed this issue 1 month ago
Hi @acampove,
hep_ml has not changed any fields for a while; we simply rely on sklearn's default serialization (which is pickle).
The differences come from a change in the sklearn version, in particular between 1.2 and 1.3; see this issue: https://github.com/scikit-learn/scikit-learn/issues/26798
It is a bit surprising that sklearn changed the format (they never promise they won't, but they do try to keep this compatibility).
Unfortunately there is no simple way to fix this on the hep_ml side, since one would need a persistent format for sklearn trees anyway, and sklearn does not provide one.
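Since the pickle is tied to the library versions it was written with, one workaround is to record those versions next to the file so the matching environment can be rebuilt later. The following is only a minimal sketch using the standard library: the helper names (`installed_versions`, `save_with_versions`, `check_versions`) and the `.versions.json` sidecar convention are assumptions made for illustration, not part of hep_ml or sklearn.

```python
import json
import pickle
from importlib import metadata
from pathlib import Path


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


def save_with_versions(obj, path, packages=("scikit-learn", "hep-ml", "numpy")):
    """Pickle obj to path and write a JSON sidecar listing package versions.

    The sidecar file name (<path>.versions.json) is a hypothetical convention.
    """
    path = Path(path)
    with path.open("wb") as handle:
        pickle.dump(obj, handle)
    sidecar = path.with_name(path.name + ".versions.json")
    sidecar.write_text(json.dumps(installed_versions(packages), indent=2))
    return sidecar


def check_versions(path):
    """Return {name: (saved, current)} for packages whose version changed."""
    path = Path(path)
    sidecar = path.with_name(path.name + ".versions.json")
    saved = json.loads(sidecar.read_text())
    current = installed_versions(saved)
    return {
        name: (saved[name], current[name])
        for name in saved
        if saved[name] != current[name]
    }
```

Calling `check_versions("weights.pkl")` before unpickling would then report, for example, that scikit-learn moved from one version to another, which is the situation described in this thread. This does not make the pickle portable; it only tells you which versions to reinstall.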
Hello @arogozhnikov
Thanks for your reply, I confirm that the problem was with the version of scikit-learn. I just created a virtual environment and tried:
Package         Version
--------------- -------
dill            0.3.9
hep-ml          0.7.2
joblib          1.4.2
numpy           1.26.4
pandas          2.2.3
pip             24.2
python-dateutil 2.9.0
pytz            2024.1
scikit-learn    1.2.2
scipy           1.14.1
setuptools      75.1.0
six             1.16.0
threadpoolctl   3.5.0
tzdata          2024.2
wheel           0.44.0
and it seems to unpickle it. By the way, dill seems to be also needed, but it's not installed as a requirement of hep_ml.
> dill seems to be also needed
You shouldn't need dill; something in your env overrides pickle with dill (e.g. see in your traceback above that load_pickle somehow calls dill; pickle is the system default and wouldn't fall back to dill).
Anyway, glad you got it working.
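One way such an override can happen is an aliased import such as `import dill as pickle` in the module that defines the loader. Here is a small sketch of how one might check for that; `pickle_backend` is a hypothetical helper, and the stand-in modules below only simulate the real module that defines `load_pickle`, which is not shown in this thread.

```python
import json
import pickle
import types


def pickle_backend(module):
    """Return the real name of whatever `module` calls `pickle`.

    Returns None if the module has no attribute named `pickle`.
    """
    backend = getattr(module, "pickle", None)
    return getattr(backend, "__name__", None)


# Stand-in for a module that does a plain `import pickle`.
plain = types.ModuleType("plain_user")
plain.pickle = pickle
print(pickle_backend(plain))  # prints "pickle"

# Stand-in for a module that aliases another library under the name `pickle`
# (json is used here only so the sketch runs without dill installed).
aliased = types.ModuleType("aliased_user")
aliased.pickle = json
print(pickle_backend(aliased))  # prints "json"
```

Passing the actual module that contains `load_pickle` would return "dill" if that module aliases dill under the name `pickle`.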
Hi,
When using this tool I tend to save the weights as a pickle file. However, when loading them back I see things like:
which most likely means that the version used to train the GBReweighter and pickle it is different from the version I am using now. In practice this pickle file is now useless unless I can find the version I used. This is very tedious and dangerous. Is there a way to save the actual information, rather than the object, to text?