PennyWieser / Thermobar

Python thermobarometry tool

Struggles with P_Jorgenson2022_Cpx_only #41

Closed dyspence closed 10 months ago

dyspence commented 11 months ago

Hello,

After following the instructions at [https://thermobar.readthedocs.io/en/latest/Examples/Cpx_Cpx_Liq_Thermobarometry/MachineLearning_Cpx_Liq_Thermobarometry.html] I am still struggling to get P_Jorgenson2022_Cpx_only to work. I used #!pip install "https://github.com/PennyWieser/Thermobar_onnx/archive/refs/tags/v.0.0.4.zip" and import joblib as j, and still had no luck.

I end up with a long error that looks like this:

KeyError                                  Traceback (most recent call last)
Input In [22], in <cell line: 1>()
----> 1 P_Jorgenson2022_Cpx_only=pt.calculate_cpx_only_press(cpx_comps=Cpxs, equationP="P_Jorgenson2022_Cpx_only")
      2 P_Jorgenson2022_Cpx_only

File ~/opt/anaconda3/lib/python3.9/site-packages/Thermobar/clinopyroxene_thermobarometry.py:3309, in calculate_cpx_only_press(cpx_comps, equationP, T, H2O_Liq, eq_tests, return_input)
   3301 # if
   3302 #     df_stats=P_Petrelli2020_Cpx_only_withH2O(cpx_comps=cpx_comps_c)
   3303 #     P_kbar=df_stats['P_kbar_calc']
   (...)
   3306 #     df_stats=P_Petrelli2020_Cpx_only_noCr(cpx_comps=cpx_comps_c)
   3307 #     P_kbar=df_stats['P_kbar_calc']
   3308 if ('Petrelli' in equationP or "Jorgenson" in equationP) and "onnx" not in equationP:
-> 3309     df_stats=func(cpx_comps=cpx_comps_c)
   3311 elif ('Petrelli' in equationP or "Jorgenson" in equationP) and "onnx" in equationP:
   3312     P_kbar=func(cpx_comps=cpx_comps_c)

File ~/opt/anaconda3/lib/python3.9/site-packages/Thermobar/clinopyroxene_thermobarometry.py:1341, in P_Jorgenson2022_Cpx_only(T, cpx_comps)
   1338 Thermobar_dir=Path(Thermobar_onnx.__file__).parent
   1340 with open(Thermobar_dir/'ETR_Press_Jorg21_Cpx_only_NotNorm_sklearn_1_3.pkl', 'rb') as f:
-> 1341     ETR_Press_J21_Cpx_only=joblib.load(f)
   1345 Pred_P_kbar=ETR_Press_J21_Cpx_only.predict(x_test)
   1347 df_stats, df_voting=get_voting_stats_ExtraTreesRegressor(x_test, ETR_Press_J21_Cpx_only)

File ~/opt/anaconda3/lib/python3.9/site-packages/joblib/numpy_pickle.py:577, in load(filename, mmap_mode)
    575     filename = getattr(fobj, 'name', '')
    576     with _read_fileobject(fobj, filename, mmap_mode) as fobj:
--> 577         obj = _unpickle(fobj)
    578 else:
    579     with open(filename, 'rb') as f:

File ~/opt/anaconda3/lib/python3.9/site-packages/joblib/numpy_pickle.py:506, in _unpickle(fobj, filename, mmap_mode)
    504 obj = None
    505 try:
--> 506     obj = unpickler.load()
    507     if unpickler.compat_mode:
    508         warnings.warn("The file '%s' has been generated with a "
    509                       "joblib version less than 0.10. "
    510                       "Please regenerate this pickle file."
    511                       % filename,
    512                       DeprecationWarning, stacklevel=3)

File ~/opt/anaconda3/lib/python3.9/pickle.py:1212, in _Unpickler.load(self)
   1210             raise EOFError
   1211         assert isinstance(key, bytes_types)
-> 1212         dispatch[key[0]](self)
   1213 except _Stop as stopinst:
   1214     return stopinst.value

KeyError: 0

Any advice would be greatly appreciated.

Cheers, Dylan

PennyWieser commented 11 months ago

Hi Dylan, what version of sklearn are you on? Compatibility between pickles and sklearn is an ongoing issue if you have an older version (if you are already up to date, I probably need to remove the joblib dependency instead). You can also try the onnx version (see the Volcanica paper for more explanation of the problem here).

dyspence commented 11 months ago

Hi Penny,

Thank you for the quick response! I'm pretty new to Python, so I have been essentially just following the steps without much knowledge of what's happening in the background. How can I check what version of sklearn I am on? And how can I update this if it is a problem?

The onnx version does seem to work, but it would be nice to have the voting option.
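One way to answer the version question from inside a notebook is sketched below. It is a minimal example using only the standard library; the helper name `installed_version` is made up for illustration and is not part of Thermobar. Note that the package is distributed as `scikit-learn` even though it is imported as `sklearn`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent.

    `installed_version` is a hypothetical helper, not a Thermobar function.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# The distribution name is "scikit-learn"; the import name is "sklearn".
print("scikit-learn:", installed_version("scikit-learn"))
print("joblib:", installed_version("joblib"))
```

If scikit-learn is installed, `import sklearn; print(sklearn.__version__)` should report the same value.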

dyspence commented 11 months ago

I just looked through the installation text and it seems like I have scikit-learn version 1.0.2.

PennyWieser commented 11 months ago

Hi Dylan, that's a pretty old version of sklearn (2021). Would you be able to try upgrading? Not sure if you use ChatGPT, but it's super helpful for these sorts of things. For example, if you have Python installed through conda, you can ask it 'I have python through anaconda, how can I upgrade my sklearn'. Or else you can upgrade with pip from your command line. Another option (that you might want to do anyway) is a conda upgrade all (again, ChatGPT can help), as chances are all your packages are 2 years old if sklearn is.
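The pip route mentioned above can be sketched from Python as follows. This is only an illustration: `pip_upgrade_command` and its `minimum` parameter are hypothetical names, and the block prints the command rather than running it. Building the command from `sys.executable` targets the same interpreter the notebook is using, which matters when several Python installations coexist. For an Anaconda installation, `conda update scikit-learn` or `conda update --all` in a terminal may be the better-behaved equivalent.

```python
import sys


def pip_upgrade_command(package, minimum=None):
    """Build a pip upgrade command for the interpreter running this code.

    Hypothetical helper for illustration; `minimum` optionally pins a lower bound.
    """
    requirement = f"{package}>={minimum}" if minimum else package
    return [sys.executable, "-m", "pip", "install", "--upgrade", requirement]


cmd = pip_upgrade_command("scikit-learn")
print(" ".join(cmd))

# To actually perform the upgrade, uncomment the two lines below,
# then restart the kernel so the new version is imported.
# import subprocess
# subprocess.run(cmd, check=True)
```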

PennyWieser commented 10 months ago

Hi Dylan any luck with this on a newer version of sklearn?

dyspence commented 10 months ago

Hi Penny,

Thanks for the response. I'm currently working through issues with updating sklearn (and anaconda itself). I will let you know what happens when I get that all sorted out.

dyspence commented 10 months ago

Hi Penny,

I did a conda upgrade all, and found a bunch of inconsistencies I had to work through, so I'm glad I did this. I am still getting an error executing this code:

P_Jorgenson2022_Cpx_only=pt.calculate_cpx_only_press(cpx_comps=Cpxs, equationP="P_Jorgenson2022_Cpx_only")
P_Jorgenson2022_Cpx_only

Here is the error:

/Users/dylanspence/opt/anaconda3/lib/python3.9/site-packages/sklearn/base.py:318: UserWarning: Trying to unpickle estimator ExtraTreeRegressor from version 1.3.0 when using version 1.2.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
  warnings.warn(

ValueError                                Traceback (most recent call last)
Cell In[10], line 1
----> 1 P_Jorgenson2022_Cpx_only=pt.calculate_cpx_only_press(cpx_comps=Cpxs, equationP="P_Jorgenson2022_Cpx_only")
      2 P_Jorgenson2022_Cpx_only

File ~/opt/anaconda3/lib/python3.9/site-packages/Thermobar/clinopyroxene_thermobarometry.py:3309, in calculate_cpx_only_press(cpx_comps, equationP, T, H2O_Liq, eq_tests, return_input)
   3301 # if
   3302 #     df_stats=P_Petrelli2020_Cpx_only_withH2O(cpx_comps=cpx_comps_c)
   3303 #     P_kbar=df_stats['P_kbar_calc']
   (...)
   3306 #     df_stats=P_Petrelli2020_Cpx_only_noCr(cpx_comps=cpx_comps_c)
   3307 #     P_kbar=df_stats['P_kbar_calc']
   3308 if ('Petrelli' in equationP or "Jorgenson" in equationP) and "onnx" not in equationP:
-> 3309     df_stats=func(cpx_comps=cpx_comps_c)
   3311 elif ('Petrelli' in equationP or "Jorgenson" in equationP) and "onnx" in equationP:
   3312     P_kbar=func(cpx_comps=cpx_comps_c)

File ~/opt/anaconda3/lib/python3.9/site-packages/Thermobar/clinopyroxene_thermobarometry.py:1341, in P_Jorgenson2022_Cpx_only(T, cpx_comps)
   1338 Thermobar_dir=Path(Thermobar_onnx.__file__).parent
   1340 with open(Thermobar_dir/'ETR_Press_Jorg21_Cpx_only_NotNorm_sklearn_1_3.pkl', 'rb') as f:
-> 1341     ETR_Press_J21_Cpx_only=joblib.load(f)
   1345 Pred_P_kbar=ETR_Press_J21_Cpx_only.predict(x_test)
   1347 df_stats, df_voting=get_voting_stats_ExtraTreesRegressor(x_test, ETR_Press_J21_Cpx_only)

File ~/opt/anaconda3/lib/python3.9/site-packages/joblib/numpy_pickle.py:648, in load(filename, mmap_mode)
    646     filename = getattr(fobj, 'name', '')
    647     with _read_fileobject(fobj, filename, mmap_mode) as fobj:
--> 648         obj = _unpickle(fobj)
    649 else:
    650     with open(filename, 'rb') as f:

File ~/opt/anaconda3/lib/python3.9/site-packages/joblib/numpy_pickle.py:577, in _unpickle(fobj, filename, mmap_mode)
    575 obj = None
    576 try:
--> 577     obj = unpickler.load()
    578     if unpickler.compat_mode:
    579         warnings.warn("The file '%s' has been generated with a "
    580                       "joblib version less than 0.10. "
    581                       "Please regenerate this pickle file."
    582                       % filename,
    583                       DeprecationWarning, stacklevel=3)

File ~/opt/anaconda3/lib/python3.9/pickle.py:1212, in _Unpickler.load(self)
   1210             raise EOFError
   1211         assert isinstance(key, bytes_types)
-> 1212         dispatch[key[0]](self)
   1213 except _Stop as stopinst:
   1214     return stopinst.value

File ~/opt/anaconda3/lib/python3.9/site-packages/joblib/numpy_pickle.py:402, in NumpyUnpickler.load_build(self)
    394 def load_build(self):
    395     """Called to set the state of a newly created object.
    396
    397     We capture it to replace our place-holder objects, NDArrayWrapper or
    (...)
    400     NDArrayWrapper is used for backward compatibility with joblib <= 0.9.
    401     """
--> 402     Unpickler.load_build(self)
    404     # For backward compatibility, we support NDArrayWrapper objects.
    405     if isinstance(self.stack[-1], (NDArrayWrapper, NumpyArrayWrapper)):

File ~/opt/anaconda3/lib/python3.9/pickle.py:1717, in _Unpickler.load_build(self)
   1715 setstate = getattr(inst, "__setstate__", None)
   1716 if setstate is not None:
-> 1717     setstate(state)
   1718     return
   1719 slotstate = None

File sklearn/tree/_tree.pyx:676, in sklearn.tree._tree.Tree.__setstate__()

File sklearn/tree/_tree.pyx:1364, in sklearn.tree._tree._check_node_ndarray()

ValueError: node array from the pickle has an incompatible dtype:

I notice that the latest scikit-learn release is version 1.3, but my upgrade only moved me from version 1.0.2 to 1.2.2. Should I manually try to upgrade to 1.3?

Thanks again for all the help!

PennyWieser commented 10 months ago

Yes, if you can manually try upgrading to 1.3, that would be great. If the error persists on that version, let me know. I'm trying to find a version that works so I can make it a requirement moving forward! This is why non-onnx machine learning models are super fun....
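The mismatch in this thread (a pickle written with 1.3.0, loaded under 1.2.2) is the kind of thing a small guard could flag before `joblib.load` is called. The sketch below is an assumption-laden illustration, not Thermobar's code: the helper names are invented, it compares only the major and minor numbers, and it expects plain numeric release strings such as "1.3.0". scikit-learn does not promise that pickles load across versions, so treating a matching major.minor as sufficient is a simplification.

```python
def major_minor(version_string):
    """Return (major, minor) as integers from a version string such as '1.2.2'."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def matches_pickle(installed, pickled):
    """True if the installed version shares major.minor with the pickling version.

    Hypothetical check for illustration; same major.minor is an assumed rule.
    """
    return major_minor(installed) == major_minor(pickled)


# Versions reported in the UserWarning earlier in this thread.
print(matches_pickle("1.2.2", "1.3.0"))  # False
# Same version as the pickle.
print(matches_pickle("1.3.0", "1.3.0"))  # True
```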

dyspence commented 10 months ago

Hi Penny,

After updating to scikit-learn version 1.3.0 I finally had success!

Thank you so much for the help.

PennyWieser commented 10 months ago

Awesome! Glad to hear it. Python is super fun sometimes isn't it with package management....