uber / causalml

Uplift modeling and causal inference with machine learning algorithms

Issues with Serializing UpliftTreeClassifier using pickle in Python #745

Closed. zerolxf closed this issue 3 months ago

zerolxf commented 3 months ago

Hi everyone,

I've encountered a serialization issue when trying to use pickle to serialize an UpliftTreeClassifier object from the causalml library. The specific error message I receive is:

Can't pickle <cyfunction UpliftTreeClassifier.evaluate_KL at 0x7f735ff5f1f0>: attribute lookup evaluate_KL on causalml.inference.tree.uplift failed

This suggests that pickle cannot resolve a function defined in a compiled Cython extension by name, which seems to be the case for the evaluate_KL method of UpliftTreeClassifier.
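For context, pickle stores functions by reference (module name plus attribute name) rather than by value, which is why a failed "attribute lookup" stops serialization. A minimal sketch using only the standard library, with `math.sqrt` as an arbitrary stand-in and no causalml code involved:

```python
import math
import pickle

# Pickling a function records where to find it, not its code.
payload = pickle.dumps(math.sqrt)

# The payload carries the module and function names as plain bytes.
has_module_name = b"math" in payload
has_function_name = b"sqrt" in payload

# Loading looks the name up again and returns the very same object.
restored = pickle.loads(payload)
print(has_module_name, has_function_name, restored is math.sqrt)
```

If that lookup cannot find the function under the recorded module and name, pickling fails with an error of the kind quoted above.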

Here's the code snippet where the issue occurs:

import pickle

# uplift_model is an already fitted UpliftTreeClassifier instance
with open('uplift_model.pkl', 'wb') as file:
    pickle.dump(uplift_model, file)

What I've tried so far:

My questions are:

  1. Has anyone else encountered this issue, and how did you resolve it?
  2. Are there alternative serialization methods or libraries that are known to work well with causalml models?
  3. Would manually extracting model parameters and reconstructing the model be the only workaround? If so, could anyone share a general approach or example for doing this with UpliftTreeClassifier?
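To make question 3 concrete, here is a rough sketch of the "extract parameters and reconstruct" pattern. `ToyModel`, `make_scorer`, `to_state`, and `from_state` are hypothetical names invented for illustration. This is not causalml code, and a real UpliftTreeClassifier has much more state (the fitted tree in particular) than this toy shows.

```python
import pickle


def make_scorer():
    # A nested function has no module-level name that pickle can look up.
    def evaluate(x):
        return 2 * x
    return evaluate


class ToyModel:
    """Hypothetical stand-in for a model; not causalml code."""

    def __init__(self, weights):
        self.weights = weights
        self.scorer = make_scorer()  # function kept as instance state

    def to_state(self):
        # Keep only plain data that pickle can store by value.
        return {"weights": self.weights}

    @classmethod
    def from_state(cls, state):
        # Rebuilding the object recreates the function attribute.
        return cls(state["weights"])


model = ToyModel([0.1, 0.2])

# Direct pickling is expected to fail because of the function attribute.
try:
    pickle.dumps(model)
    naive_ok = True
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    naive_ok = False
    print(f"Direct pickling failed: {exc}")

# Serialize the plain-data state instead, then rebuild the model from it.
payload = pickle.dumps(model.to_state())
restored = ToyModel.from_state(pickle.loads(payload))
print(restored.weights, restored.scorer(3))
```

The trade-off is that every attribute needed for prediction has to be listed by hand in `to_state`, so this approach is only practical when the model's internal state is small and well understood.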

Any advice, code snippets, or resources would be greatly appreciated. Thank you in advance for your help!

jeongyoonlee commented 3 months ago

Hi @zerolxf, I just tried your code snippet on Google Colab using the "uplift trees with synthetic data" example notebook, and it worked fine. You can run the Colab notebook here. Cells 21 onward contain the serialization and deserialization of the uplift tree model.

A few things to check: are you using the latest version of CausalML, which is 0.15.0? Have you tried it in a clean environment? Also, try reinstalling Cython, numpy, and causalml. Hope it helps.
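One quick way to run these checks is to print the installed versions from the same environment that raises the error. A small sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is made up for this example:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(["causalml", "Cython", "numpy"])
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

Comparing this output between the failing environment and a fresh one should show whether an outdated or mismatched package is involved.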