showlab / VLog

Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.
MIT License

Fix TypeError: cannot pickle '_thread.RLock' object by using dill #12

Open M117n opened 3 weeks ago

M117n commented 3 weeks ago

Description

Problem Overview

When attempting to serialize the FAISS vectorstore object using pickle, a TypeError occurs:

TypeError: cannot pickle '_thread.RLock' object

This error arises because pickle cannot serialize objects that contain threading locks, and the FAISS vectorstore object holds such a lock. As a result, the vectorstore cannot be saved to disk and loaded back, which hurts the performance and usability of the application.
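For reference, the failure can be reproduced with the standard library alone. The sketch below is only an illustration: `store_like` is a hypothetical stand-in for an object that holds a lock somewhere inside it (it is not the actual FAISS vectorstore), and `try_pickle` is a helper name made up for this example.

```python
import pickle
import threading


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Hypothetical stand-in: plain data plus a threading lock, mimicking an
# object that keeps a lock somewhere in its attributes.
store_like = {"vectors": [0.1, 0.2, 0.3], "lock": threading.RLock()}

plain_result = try_pickle({"vectors": [0.1, 0.2, 0.3]})  # no lock inside
lock_result = try_pickle(store_like)                     # lock inside

print("plain data:", plain_result)
print("with lock: ", lock_result)
```

The plain dictionary pickles without complaint, while the one holding the lock is rejected, which matches the error reported above.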

Solution

To resolve this issue, I replaced Python's built-in pickle module with the dill library, which can serialize a wider range of Python objects, including those that contain threading locks.

Changes Made:

Imported dill and aliased it as pickle to minimize code changes:

import dill as pickle

Updated all instances where pickle was used for serialization and deserialization:

Serialization

with open(pkl_path, "wb") as f:
    pickle.dump(vectorstore, f)

Deserialization

with open(pkl_path, "rb") as file:
    self.vectorstore = pickle.load(file)

Impact: With these changes, the FAISS vectorstore can be serialized and deserialized without raising the TypeError, which improves the stability of the LLM reasoning process when handling large datasets or complex documents.

Please review and let me know if any further adjustments are needed. Thank you!