SciPhi-AI / R2R

R2R is a prod-ready RAG (Retrieval-Augmented Generation) engine with a RESTful API. R2R includes hybrid search, knowledge graphs, and more.
https://r2r-docs.sciphi.ai/
MIT License
2.26k stars 151 forks

2024-06-26 21:20:54,815 - WARNING - r2r.vecs.client - Database connection error: name 'sqlalchemy' is not defined. Retrying in 1 seconds... #545

Open thistleknot opened 4 days ago

thistleknot commented 4 days ago

Describe the bug A clear and concise description of what the bug is.

pip install sqlalchemy

python r2r/examples/quickstart.py

emrgnt-cmplxty commented 4 days ago

Strange. We have not encountered this bug before.

sqlalchemy is a listed dependency in the pyproject.toml.

Could you provide a few more steps so that we might attempt to reproduce?

EDIT - I thought about this a bit more and I think I've seen this issue once or twice when the target database is not configured as expected. Can you confirm that you have installed pgvector alongside your Postgres database and that it is online?
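One way to answer this question from Python is to ask Postgres which extensions it can see. The sketch below is not part of R2R: `fetch_extension_rows` and `classify_extensions` are hypothetical helper names, the connection string is whatever you already use, and it assumes psycopg2 is installed. It reads the `pg_available_extensions` catalog view and reports whether `vector` (the extension name that pgvector registers) is missing from the server, present but not created in this database, or installed.

```python
def classify_extensions(rows, required=("vector",)):
    """Map each required extension to a status string.

    rows: iterable of (name, installed_version) tuples, as returned by
    the query in fetch_extension_rows().
    """
    by_name = dict(rows)
    report = {}
    for ext in required:
        if ext not in by_name:
            # No control file on the server: the OS package is missing.
            report[ext] = "not available on server"
        elif by_name[ext] is None:
            # Package is present, but CREATE EXTENSION has not been run here.
            report[ext] = "available, not created"
        else:
            report[ext] = f"installed ({by_name[ext]})"
    return report


def fetch_extension_rows(dsn):
    """Hypothetical helper: read the extension catalog from a live server."""
    import psycopg2  # imported lazily so classify_extensions works without it

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT name, installed_version FROM pg_available_extensions"
            )
            return cur.fetchall()
    finally:
        conn.close()
```

Calling `classify_extensions(fetch_extension_rows(dsn))` against the target database should show quickly whether the problem is a missing package or an extension that was never created; other extension names can be passed through `required`.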

JeongYunLee commented 4 days ago

Hi, I have the same problem. I installed pgvector with Docker and I think it runs well. Are there any other solutions?


emrgnt-cmplxty commented 3 days ago

> Hi, I have the same problem. I installed pgvector with Docker and I think it runs well. Are there any other solutions?

The framework requires pgvector to be installed alongside Postgres.

Are you saying that you found this problem and installing pgvector fixed the issue for you?

JeongYunLee commented 3 days ago

No, I installed Postgres and pgvector, but I got the same error when I ran python r2r/examples/quickstart.py. I checked the .env file and filled in every field.

JeongYunLee commented 3 days ago

Oh, I solved the problem. You were right, it was a pgvector installation problem. I reinstalled both Postgres and pgvector and it works well. Thank you so much!

thistleknot commented 2 days ago

Okay, just got pgvector and postgres-16 installed on Oracle Linux 8 and confirmed I was able to create a vector column,

but I'm still getting an sqlalchemy error

(textgen) [root@pve0 R2R]# python -m r2r.examples.quickstart ingest_files
2024-06-28 20:41:43,267 - INFO - r2r.core.providers.vector_db_provider - Initializing VectorDBProvider with config extra_fields={} provider='pgvector' collection_name='demo_vecs'.
2024-06-28 20:41:43,347 - WARNING - r2r.vecs.client - Failed to create extension: (psycopg2.errors.FeatureNotSupported) extension "pg_trgm" is not available
DETAIL:  Could not open extension control file "/usr/pgsql-16/share/extension/pg_trgm.control": No such file or directory.
HINT:  The extension must first be installed on the system where PostgreSQL is running.

[SQL: CREATE EXTENSION IF NOT EXISTS pg_trgm;]
(Background on this error at: https://sqlalche.me/e/20/tw8g)
2024-06-28 20:41:43,349 - WARNING - r2r.vecs.client - Database connection error: name 'sqlalchemy' is not defined. Retrying in 1 seconds...
2024-06-28 20:41:44,354 - WARNING - r2r.vecs.client - Failed to create extension: (psycopg2.errors.FeatureNotSupported) extension "pg_trgm" is not available
DETAIL:  Could not open extension control file "/usr/pgsql-16/share/extension/pg_trgm.control": No such file or directory.
HINT:  The extension must first be installed on the system where PostgreSQL is running.

[SQL: CREATE EXTENSION IF NOT EXISTS pg_trgm;]
(Background on this error at: https://sqlalche.me/e/20/tw8g)
2024-06-28 20:41:44,355 - WARNING - r2r.vecs.client - Database connection error: name 'sqlalchemy' is not defined. Retrying in 1 seconds...
2024-06-28 20:41:45,359 - WARNING - r2r.vecs.client - Failed to create extension: (psycopg2.errors.FeatureNotSupported) extension "pg_trgm" is not available
DETAIL:  Could not open extension control file "/usr/pgsql-16/share/extension/pg_trgm.control": No such file or directory.
HINT:  The extension must first be installed on the system where PostgreSQL is running.

[SQL: CREATE EXTENSION IF NOT EXISTS pg_trgm;]
(Background on this error at: https://sqlalche.me/e/20/tw8g)
2024-06-28 20:41:45,361 - WARNING - r2r.vecs.client - Database connection error: name 'sqlalchemy' is not defined. Retrying in 1 seconds...
2024-06-28 20:41:46,362 - ERROR - r2r.vecs.client - Failed to initialize database after 3 retries with error: name 'sqlalchemy' is not defined
Traceback (most recent call last):
  File "/data/R2R/r2r/providers/vector_dbs/pgvector/pgvector_db.py", line 45, in __init__
    self.vx: Client = r2r.vecs.create_client(DB_CONNECTION)
  File "/data/R2R/r2r/vecs/__init__.py", line 28, in create_client
    return Client(connection_string, *args, **kwargs)
  File "/data/R2R/r2r/vecs/client.py", line 73, in __init__
    self._initialize_database()
  File "/data/R2R/r2r/vecs/client.py", line 96, in _initialize_database
    raise RuntimeError(error_message)
RuntimeError: Failed to initialize database after 3 retries with error: name 'sqlalchemy' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/miniconda3/envs/textgen/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/textgen/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/data/R2R/r2r/examples/quickstart.py", line 593, in <module>
    fire.Fire(R2RQuickstart)
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/data/R2R/r2r/examples/quickstart.py", line 71, in __init__
    self.r2r_app = get_r2r_app(app_builder=R2RAppBuilder(config))
  File "/data/R2R/r2r/main/dependencies.py", line 13, in get_r2r_app
    r2r_app_instance = builder.build()
  File "/data/R2R/r2r/main/assembly/builder.py", line 167, in build
    providers = provider_factory(self.config).create_providers(
  File "/data/R2R/r2r/main/assembly/factory.py", line 179, in create_providers
    or self.create_vector_db_provider(
  File "/data/R2R/r2r/main/assembly/factory.py", line 40, in create_vector_db_provider
    vector_db_provider = PGVectorDB(vector_db_config)
  File "/data/R2R/r2r/providers/vector_dbs/pgvector/pgvector_db.py", line 47, in __init__
    raise ValueError(
ValueError: Error Failed to initialize database after 3 retries with error: name 'sqlalchemy' is not defined occurred while attempting to connect to the pgvector provider with postgresql://postgres:@192.168.3.212:5432/vectordb.
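
The underlying failure in this log appears to be the missing pg_trgm extension; the `name 'sqlalchemy' is not defined` message looks like a secondary error raised while that failure was being handled. One plausible mechanism, which is an assumption here and has not been checked against the R2R source, is an `except` clause that names a module that was never bound because only `from ... import ...` was used. The sketch below reproduces the pattern with the standard library's `decimal` module standing in for sqlalchemy; `parse_amount` and `demo` are illustrative names.

```python
from decimal import Decimal  # binds Decimal only; the name `decimal` is NOT defined


def parse_amount(text):
    try:
        return Decimal(text)
    except decimal.InvalidOperation:  # looked up only when Decimal() raises
        return None


def demo():
    """Return (message the caller sees, type name of the hidden original error)."""
    try:
        parse_amount("not a number")
    except NameError as err:
        # Python keeps the original exception as the implicit context.
        return str(err), type(err.__context__).__name__
```

Valid input never reaches the faulty `except` line, so the bug stays invisible until the first real error occurs, at which point the NameError replaces the useful message in the log.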

thistleknot commented 2 days ago

hmm...

did a pip install -e . from source and got a whole different error

(textgen) [root@pve0 R2R]# python -m r2r.examples.quickstart ingest_files
2024-06-28 20:48:44,846 - INFO - r2r.base.providers.vector_db_provider - Initializing VectorDBProvider with config extra_fields={} provider='pgvector'.
Traceback (most recent call last):
  File "/root/miniconda3/envs/textgen/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/textgen/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/data/R2R/r2r/examples/quickstart.py", line 600, in <module>
    fire.Fire(R2RQuickstart)
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/data/R2R/r2r/examples/quickstart.py", line 71, in __init__
    self.app = R2R(config=config)
  File "/data/R2R/r2r/main/r2r.py", line 37, in __init__
    built = builder.build()
  File "/data/R2R/r2r/main/assembly/builder.py", line 170, in build
    providers = provider_factory(self.config).create_providers(
  File "/data/R2R/r2r/main/assembly/factory.py", line 177, in create_providers
    or self.create_vector_db_provider(
  File "/data/R2R/r2r/main/assembly/factory.py", line 42, in create_vector_db_provider
    vector_db_provider = PGVectorDB(vector_db_config)
  File "/data/R2R/r2r/providers/vector_dbs/pgvector/pgvector_db.py", line 75, in __init__
    raise ValueError(
ValueError: Error, please set a valid POSTGRES_VECS_COLLECTION environment variable or set a 'collection' in the 'vector_database' settings of your `config.json`.
(textgen) [root@pve0 R2R]#

but a different error is a better error

thistleknot commented 2 days ago

after adding

export POSTGRES_VECS_COLLECTION=demo_vecs

I'm back at the same error.
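
The ValueError above names two places the collection can come from: the POSTGRES_VECS_COLLECTION environment variable or a 'collection' entry under 'vector_database' in config.json. A minimal sketch of that lookup follows, useful for checking what a process would actually see. `resolve_collection` is a hypothetical helper, and the precedence (environment first) is an assumption, not R2R's confirmed behavior.

```python
ENV_VAR = "POSTGRES_VECS_COLLECTION"  # name taken from the error message


def resolve_collection(environ, config):
    """Return the collection name from the environment or config, else raise.

    environ: a mapping such as os.environ
    config:  the parsed config.json as a dict
    """
    name = environ.get(ENV_VAR) or config.get("vector_database", {}).get("collection")
    if not name:
        raise ValueError(
            f"set {ENV_VAR} or 'collection' under 'vector_database' in config.json"
        )
    return name
```

Running `resolve_collection(os.environ, {})` in the same shell that launches the quickstart shows whether the `export` actually reached the Python process.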

thistleknot commented 2 days ago

Alright, installed the contrib package on my Postgres server:

sudo apt-get install postgresql-contrib

and now

2024-06-28 21:24:46,585 - INFO - r2r.base.providers.vector_db_provider - Initializing VectorDBProvider with config extra_fields={} provider='pgvector'.
Traceback (most recent call last):
  File "/root/miniconda3/envs/textgen/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/textgen/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/data/R2R/r2r/examples/quickstart.py", line 600, in <module>
    fire.Fire(R2RQuickstart)
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/data/R2R/r2r/examples/quickstart.py", line 71, in __init__
    self.app = R2R(config=config)
  File "/data/R2R/r2r/main/r2r.py", line 37, in __init__
    built = builder.build()
  File "/data/R2R/r2r/main/assembly/builder.py", line 170, in build
    providers = provider_factory(self.config).create_providers(
  File "/data/R2R/r2r/main/assembly/factory.py", line 181, in create_providers
    or self.create_embedding_provider(
  File "/data/R2R/r2r/main/assembly/factory.py", line 68, in create_embedding_provider
    from r2r.providers.embeddings import OpenAIEmbeddingProvider
  File "/data/R2R/r2r/providers/embeddings/__init__.py", line 1, in <module>
    from .ollama.ollama_base import OllamaEmbeddingProvider
  File "/data/R2R/r2r/providers/embeddings/ollama/ollama_base.py", line 13, in <module>
    class OllamaEmbeddingProvider(EmbeddingProvider):
  File "/data/R2R/r2r/providers/embeddings/ollama/ollama_base.py", line 46, in OllamaEmbeddingProvider
    async def execute_task_with_backoff(self, task: dict[str, Any]):
NameError: name 'Any' is not defined. Did you mean: 'any'?

thistleknot commented 2 days ago

after modifying

/data/R2R/r2r/providers/embeddings/ollama/ollama_base.py
    from typing import Any

I got further. Looks like it's running now.
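
For anyone wondering why a missing `typing` import breaks the whole package at import time: on the Python 3.10 interpreter shown in the traceback, annotations are evaluated when the `def` statement runs, so an undefined name in a signature fails before any method is called. The sketch below demonstrates this in isolation. `BROKEN`, `FIXED`, and `defines_cleanly` are illustrative names, and the version check reflects that Python 3.14 and later defer annotation evaluation.

```python
import sys

# The same signature shape as in the traceback, without and with the import.
BROKEN = "def execute_task(task: dict[str, Any]):\n    return task\n"
FIXED = "from typing import Any\n" + BROKEN

# Before Python 3.14, annotations are evaluated when the `def` statement runs.
EAGER_ANNOTATIONS = sys.version_info < (3, 14)


def defines_cleanly(source):
    """Execute source in a fresh namespace; report whether it avoided a NameError."""
    try:
        exec(source, {})
        return True
    except NameError:
        return False
```

On 3.10, `defines_cleanly(BROKEN)` is False and `defines_cleanly(FIXED)` is True, which matches the one-line fix described above.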

Now time to figure out how to properly patch litellm to point to text-generation-webui's proxy. I think I got the right edit in base_litellm.py, in the args under _get_base_args:

    ../R2R/r2r/providers/llms/litellm/base_litellm.py

    def _get_base_args(self, generation_config: GenerationConfig, prompt=None) -> dict:
        args = {
            # ... (other arguments)
            "api_base": "http://192.168.3.17:5000/v1",  # Your provider's API base
        }
        return args
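
An alternative to hardcoding the address in the library source is to read it from the environment and fall back to a default. This is only a sketch under assumptions: `build_base_args` is a hypothetical standalone function, not R2R's `_get_base_args`, and `LLM_API_BASE` is a made-up variable name that neither R2R nor litellm is known to read.

```python
import os

DEFAULT_API_BASE = "http://192.168.3.17:5000/v1"  # address from the snippet above


def build_base_args(model, temperature=0.1, environ=None):
    """Assemble keyword arguments for a litellm-style completion call."""
    environ = os.environ if environ is None else environ
    return {
        "model": model,
        "temperature": temperature,
        # LLM_API_BASE is a hypothetical variable name used for illustration.
        "api_base": environ.get("LLM_API_BASE", DEFAULT_API_BASE),
    }
```

The resulting dict could then be unpacked into a call such as `litellm.completion(messages=[...], **build_base_args("openai/local-model"))`, where the model string is a placeholder for whatever the OpenAI-compatible endpoint expects.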