opea-project / GenAIComps

GenAI components at micro-service level; GenAI service composer to create mega-service
Apache License 2.0

Neo4J Retriever Microservice: Can't start with Python due to TypeError: cannot pickle '_thread.RLock' object #841

Open ajaykallepalli opened 2 weeks ago

ajaykallepalli commented 2 weeks ago

Similar to GitHub issue: https://github.com/opea-project/GenAIComps/issues/842

Full error message:

(venv) ➜  langchain git:(main) ✗ python retriever_neo4j.py
/Users/ajaykallepalli/Documents/GitHub/GenAIComps/venv/lib/python3.12/site-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name_or_path" in Audio2TextDoc has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
 warnings.warn(
[2024-10-31 10:15:28,230] [    INFO] - Base service - CORS is enabled.
[2024-10-31 10:15:28,230] [    INFO] - Base service - Setting up HTTP server
[2024-10-31 10:15:28,231] [    INFO] - Base service - Uvicorn server setup on port 7002
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:7002 (Press CTRL+C to quit)
[2024-10-31 10:15:28,234] [    INFO] - Base service - HTTP server setup successful
Traceback (most recent call last):
 File "/Users/ajaykallepalli/Documents/GitHub/GenAIComps/comps/retrievers/neo4j/langchain/retriever_neo4j.py", line 118, in <module>
   opea_microservices["opea_service@retriever_neo4j"].start()
 File "/Users/ajaykallepalli/Documents/GitHub/GenAIComps/comps/cores/mega/micro_service.py", line 122, in start
   self.process.start()
 File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/process.py", line 121, in start
   self._popen = self._Popen(self)
                 ^^^^^^^^^^^^^^^^^
 File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/context.py", line 224, in _Popen
   return _default_context.get_context().Process._Popen(process_obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/context.py", line 289, in _Popen
   return Popen(process_obj)
          ^^^^^^^^^^^^^^^^^^
 File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/popen_spawn_posix.py", line 32, in __init__
   super().__init__(process_obj)
 File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/popen_fork.py", line 19, in __init__
   self._launch(process_obj)
 File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/popen_spawn_posix.py", line 47, in _launch
   reduction.dump(process_obj, fp)
 File "/opt/homebrew/anaconda3/lib/python3.12/multiprocessing/reduction.py", line 60, in dump
   ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_thread.RLock' object
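
The traceback passes through popen_spawn_posix.py and reduction.dump, which means multiprocessing is using the "spawn" start method and is pickling the process object before launching the child. The final TypeError can be reproduced in isolation with a minimal sketch. `ServiceWithLock` and `try_pickle` are hypothetical names used only for illustration; which attribute of the real microservice holds the lock is not shown in the traceback.

```python
import pickle
import threading


class ServiceWithLock:
    """Hypothetical stand-in for any object that owns a re-entrant lock."""

    def __init__(self):
        self.lock = threading.RLock()


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    # A plain dict pickles without trouble.
    print(try_pickle({"port": 7002}))
    # An object holding an RLock does not; the message names _thread.RLock.
    print(try_pickle(ServiceWithLock()))
```

Any object reachable from the process object (its target, arguments, or attributes) has to survive this pickling step when the start method is "spawn".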

Steps Taken:

  1. Clone the repo
  2. Create a venv and run pip install -e .
  3. Move to comps/dataprep/neo4j/langchain/
  4. Follow the README
    • Install the requirements with pip
    • Run the Docker setup for Python (login and access to Neo4j through localhost are working)
    • Set the Neo4j environment variables and PYTHONPATH
    • Run the Python script
docker run \
    -p 7474:7474 -p 7687:7687 \
    -v $PWD/data:/data -v $PWD/plugins:/plugins \
    --name neo4j-apoc \
    -d \
    -e NEO4J_AUTH=neo4j/password \
    -e NEO4J_PLUGINS=\[\"apoc\"\]  \
    neo4j:latest

export NEO4J_URI="bolt://localhost:7687"
export NEO4J_USERNAME="neo4j"
export NEO4J_PASSWORD="password"

python retriever_neo4j.py

Final Error Message

The error caused by running python retriever_neo4j.py is shown at the top of this issue. It occurs after the server has been running for 1 or 2 seconds.

Please let me know if any other information is required.

Environment Information

macOS: Sequoia 15.0.1
Hardware: MacBook Pro, M1 chip, 32 GB RAM
Python version: Python 3.12.4

yinghu5 commented 1 week ago

Hi Ajaykallepalli, thank you for reporting the problem. Have you tried building the same component on a Linux machine? We don't test on macOS, so we are not sure whether it works there.

Here are some references:

OPEA: recommended hardware and basic software setup.

Hardware requirements: if you need hardware access, visit the Intel Tiber Developer Cloud to select from options such as Xeon or Gaudi processors that meet the necessary specifications.

If you are deploying in the cloud, for example on AWS, select a VM instance from the R7iz or M7i family with Ubuntu 22.04 as the base OS (AWS AMI ID: ami-05134c8ef96964280).

thanks
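
On the Linux versus macOS question: one likely reason for the difference is the multiprocessing start method. Since Python 3.8 the default on macOS is "spawn", which pickles the process object, while Linux has traditionally defaulted to "fork", which copies the parent's memory and skips pickling. The sketch below shows the distinction. `Worker` and `needs_pickling` are hypothetical names, and whether switching the start method is a safe fix for this microservice is an assumption that has not been verified here; "fork" is known to be fragile on macOS when threads or certain system frameworks are involved.

```python
import multiprocessing as mp
import threading


class Worker:
    """Hypothetical service object; the RLock makes it unpicklable."""

    def __init__(self):
        self.lock = threading.RLock()

    def run(self):
        with self.lock:
            print("worker running")


def needs_pickling(start_method):
    """'spawn' and 'forkserver' pickle the Process object; 'fork' does not."""
    return start_method in ("spawn", "forkserver")


if __name__ == "__main__":
    default = mp.get_start_method()
    print("default start method:", default)
    print("process object is pickled:", needs_pickling(default))

    # With an explicit "fork" context (POSIX only) the child inherits the
    # Worker instance directly, so the RLock never goes through pickle.
    if "fork" in mp.get_all_start_methods():
        ctx = mp.get_context("fork")
        proc = ctx.Process(target=Worker().run)
        proc.start()
        proc.join()
        print("exit code:", proc.exitcode)
```

Running the first two print statements on both machines is a quick way to confirm whether the start method differs between the failing macOS setup and a working Linux one.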