SylphAI-Inc / AdalFlow

AdalFlow: The library to build & auto-optimize LLM applications.
http://adalflow.sylph.ai/
MIT License

[Need fix] after generator went through optimization update, the pickle state does not work any more #192

Open liyin2015 opened 2 months ago

liyin2015 commented 2 months ago
```python
# pickle file

# save the serialized states to a file
from adalflow.utils.file_io import save_pickle

states = doc.to_dict()
# save_json(states, "doc.json")
save_pickle(states, "doc.pkl")

# load the serialized states from a file
from adalflow.utils.file_io import load_pickle

states_loaded = load_pickle("doc.pkl")
print(states_loaded == states)

doc_pickle = DocQA.from_dict(states_loaded)
```

```
AttributeError                            Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/adalflow/utils/file_io.py in save_pickle(obj, f)
     64     with open(f, "wb") as file:
---> 65         pickle.dump(obj, file)
     66     except Exception as e:

AttributeError: Can't pickle local object 'Generator.set_data_map_func.<locals>.default_map_func'

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/adalflow/utils/file_io.py in save_pickle(obj, f)
     65         pickle.dump(obj, file)
     66     except Exception as e:
---> 67         raise Exception(f"Error saving object to pickle file {f}: {e}")

Exception: Error saving object to pickle file doc.pkl: Can't pickle local object 'Generator.set_data_map_func.<locals>.default_map_func'
```
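For context, Python's `pickle` serializes plain functions by their qualified name, so a function defined inside a method gets a `<locals>` name that pickle cannot resolve on load. The standalone sketch below (a stand-in `Generator`, not AdalFlow's actual class) reproduces the same failure mode:

```python
import pickle

class Generator:
    def set_data_map_func(self, map_func=None):
        # Defining the fallback inside the method gives it a
        # '<locals>' qualified name, which pickle cannot serialize.
        def default_map_func(data):
            return data
        self.data_map_func = map_func or default_map_func

gen = Generator()
gen.set_data_map_func()

try:
    pickle.dumps(gen)
except AttributeError as e:
    # Same failure as the traceback above:
    # "Can't pickle local object 'Generator.set_data_map_func.<locals>.default_map_func'"
    print(e)
```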

We will need customized `to_dict` and `from_dict` methods so that we can exclude the unpicklable object when serializing and restore it when loading via `from_dict`.
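One way that fix could look: exclude the callable in `to_dict` and restore a module-level default in `from_dict`. The sketch below uses hypothetical names (`DocQA`, `default_map_func`) for illustration and is not AdalFlow's actual implementation:

```python
import pickle

def default_map_func(data):
    # Module-level, so pickle can serialize it by qualified name.
    return data

class DocQA:
    def __init__(self):
        self.answer = None
        self.data_map_func = default_map_func

    def to_dict(self):
        # Exclude the unpicklable callable from the serialized state.
        return {k: v for k, v in self.__dict__.items() if k != "data_map_func"}

    @classmethod
    def from_dict(cls, states):
        # Rebuild the instance and restore the default callable.
        obj = cls.__new__(cls)
        obj.__dict__.update(states)
        obj.data_map_func = default_map_func
        return obj

doc = DocQA()
states = doc.to_dict()  # picklable: contains no local functions
restored = DocQA.from_dict(pickle.loads(pickle.dumps(states)))
```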

sameeramin commented 1 month ago

@liyin2015 Does this issue persist? I'm unable to reproduce it. Could you please share more detailed steps to reproduce this bug?

jaggiK commented 2 weeks ago

I get this error by simply running the component.ipynb notebook on Google Colab.

I attempted several methods for pickling (joblib, cloudpickle, pickle5, and AdalFlow's save_pickle) but none were successful. Serialization with PySpark and saving as JSON also failed.

The issue seems related to the structuring of class members with nested functions; addressing this led to a new issue with thread locks.
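Thread locks hit the same limitation: `threading.Lock` objects cannot be pickled. A common workaround (a sketch under assumed names, not AdalFlow's code) is to drop the lock in `__getstate__` and recreate a fresh one in `__setstate__`:

```python
import pickle
import threading

class Component:
    def __init__(self):
        self._lock = threading.Lock()  # unpicklable attribute
        self.name = "doc"

    def __getstate__(self):
        # Drop the lock before pickling.
        state = self.__dict__.copy()
        state.pop("_lock", None)
        return state

    def __setstate__(self, state):
        # Recreate a fresh lock on load.
        self.__dict__.update(state)
        self._lock = threading.Lock()

loaded = pickle.loads(pickle.dumps(Component()))
print(loaded.name)  # doc
```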

Is there currently support for saving models or components?