Dataherald / dataherald

Interact with your SQL database, Natural Language to SQL using LLMs
https://dataherald.readthedocs.io/en/latest/
Apache License 2.0

POST /api/v1/question: Internal Server Error #182

Closed khaianis closed 11 months ago

khaianis commented 11 months ago

Hi, I hope you are well. I am facing this error when calling POST /api/v1/question with the parameters: { "db_connection_id": "XXXXXXXX", "question": "list all YYYY" }

Code | Details
500 Undocumented | Error: Internal Server Error

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 241, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 169, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2106, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 833, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/server/fastapi/__init__.py", line 151, in answer_question
    return self._api.answer_question(question_request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/api/fastapi.py", line 91, in answer_question
    context_store = self.system.instance(ContextStore)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/config.py", line 105, in instance
    impl = type(self)
           ^^^^^^^^^^
  File "/app/dataherald/context_store/default.py", line 17, in __init__
    super().__init__(system)
  File "/app/dataherald/context_store/__init__.py", line 23, in __init__
    self.vector_store = self.system.instance(VectorStore)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/config.py", line 102, in instance
    type = get_class(fqn, type)
           ^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/config.py", line 119, in get_class
    module_name, class_name = fqn.rsplit(".", 1)
    ^^^^^^^^^^^^^^^^^^^^^^^
ValueError: not enough values to unpack (expected 2, got 1)

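For anyone trying to reproduce the report above, the failing call can be sketched in Python as follows. This only builds the request described in the post. The base URL is an assumption (the thread never states the host or port), and `build_question_request` is a hypothetical helper, not part of Dataherald.

```python
import json
import urllib.request

# Assumption: the API is reachable at this address; adjust to your deployment.
BASE_URL = "http://localhost"


def build_question_request(db_connection_id: str, question: str) -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/question request from the report."""
    payload = {"db_connection_id": db_connection_id, "question": question}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/question",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_question_request("XXXXXXXX", "list all YYYY")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it; a 500 response like the one reported surfaces as `urllib.error.HTTPError`.
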
jcjc712 commented 11 months ago

Hi, can you check whether you have this environment variable in your .env file?

VECTOR_STORE = 'dataherald.vector_store.chroma.Chroma'

If so, can you rebuild your container with docker-compose up --build?

Please let me know if it worked.

khaianis commented 11 months ago

Sorry, I am still having the same problem (the same happens with golden records). Here is my env file:

# Openai info. All these fields are required for the engine to work.
OPENAI_API_KEY = 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
ORG_ID = org-2lIj9QSHoDgng9A5TKoWPWNr
LLM_MODEL = 'gpt-4-32k' #the openAI llm model that you want to use. possible values: gpt-4-32k, gpt-4, gpt-3.5-turbo, gpt-3.5-turbo-16k

# Encryption key for storing DB connection data in Mongo
ENCRYPT_KEY = 'GnTZMGBkKc2Tf9OOmAv9gIXIF5DqPdnacsrdxLWMeQo='

GOLDEN_RECORD_COLLECTION = mygoldenrecords

# Pinecone info. These fields are required if the vector store used is Pinecone
PINECONE_API_KEY =
PINECONE_ENVIRONMENT =

# Module implementations to be used names for each required component. You can use the default ones or create your own
API_SERVER = "dataherald.api.fastapi.FastAPI"
SQL_GENERATOR = "dataherald.sql_generator.dataherald_sqlagent.DataheraldSQLAgent"
EVALUATOR = "dataherald.eval.simple_evaluator.SimpleEvaluator"
DB = "dataherald.db.mongo.MongoDB"
VECTOR_STORE = 'dataheraldvector_storechromaChroma'
CONTEXT_STORE = 'dataherald.context_store.default.DefaultContextStore' # Set a context store class, the default one is DefaultContextStore
DB_SCANNER = 'dataherald.db_scanner.sqlalchemy.SqlAlchemyScanner'

# mongo database information
MONGODB_URI = "mongodb://admin:admin@mongodb:27017"
MONGODB_DB_NAME = dataherald
MONGODB_DB_USERNAME = admin
MONGODB_DB_PASSWORD = admin

# The encryption key is used to encrypt database connection info before storing in Mongo. Please refer to the README on how to set it.

S3_AWS_ACCESS_KEY_ID=
S3_AWS_SECRET_ACCESS_KEY=

khaianis commented 11 months ago

Separately, I am trying to set up an environment to debug the application locally, but it seems the Dataherald MongoDB needs to be initialized with data and we don't have that part. We only have mongo_data/data, and I don't know whether I can use it with my own MongoDB to initialize the environment. Thanks in advance for your answer.

jcjc712 commented 11 months ago

Hi @khaianis, your env file has VECTOR_STORE = 'dataheraldvector_storechromaChroma', but it should be VECTOR_STORE = 'dataherald.vector_store.chroma.Chroma'. The dots are required because, when the component is initialized, the string is split on the dots to obtain the class name.

Once you make this change, rebuild your Docker container. Let me know if this worked.
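To see why the missing dots produce the ValueError in the first traceback, here is a minimal sketch of this kind of dotted-path loader. Only the `fqn.rsplit(".", 1)` line is taken from the traceback; the rest of `get_class` is an assumed re-creation using `importlib`, not Dataherald's actual code, and a standard-library class stands in for the Chroma class so the example runs anywhere.

```python
import importlib


def get_class(fqn: str):
    """Resolve 'package.module.ClassName' to the class object (hypothetical re-creation)."""
    # This line mirrors the traceback: it needs at least one dot in fqn.
    module_name, class_name = fqn.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A dotted path splits into (module, class) and resolves normally.
print(get_class("collections.OrderedDict"))

# Without dots, rsplit returns a one-element list, so the two-name unpacking fails.
try:
    get_class("dataheraldvector_storechromaChroma")
except ValueError as exc:
    print(f"ValueError: {exc}")
```

The second call fails before any import is attempted, which is why the error appears as an unpacking problem and not as a missing module.
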

jcjc712 commented 11 months ago

Once your MongoDB container is running, you can connect to it by following these instructions: https://github.com/Dataherald/dataherald#connect-to-docker-mongodb-container

khaianis commented 11 months ago

Sorry for sharing the wrong config file. This is my actual config: VECTOR_STORE = "dataherald.vector_store.chroma.Chroma". On Windows you need to download https://visualstudio.microsoft.com/fr/visual-cpp-build-tools/ in order to install chroma, which is listed in requirements.txt.

After making this change we still have these issues. For POST /api/v1/golden-records:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 241, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 169, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2106, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 833, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/server/fastapi/__init__.py", line 218, in add_golden_records
    created_records = self._api.add_golden_records(golden_records)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/api/fastapi.py", line 208, in add_golden_records
    return context_store.add_golden_records(golden_records)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/context_store/default.py", line 65, in add_golden_records
    self.vector_store.add_record(
  File "/app/dataherald/vector_store/chroma.py", line 47, in add_record
    onnxruntime.InferenceSession(path_or_bytes= '', providers=['CPUExecutionProvider'])
  File "/usr/local/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/usr/local/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 462, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
    1. onnxruntime.capi.onnxruntime_pybind11_state.InferenceSession(arg0: onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions, arg1: str, arg2: bool, arg3: bool)

Invoked with: <onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions object at 0x7fb219b109b0>, None, False, False

And for api/v1/nl-query-responses:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 241, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 169, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2106, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 833, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/server/fastapi/__init__.py", line 209, in get_nl_query_response
    return self._api.get_nl_query_response(query_request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/dataherald/api/fastapi.py", line 267, in get_nl_query_response
    nl_query_response.sql_query = query_request.sql_query
                                  ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'sql_query'

khaianis commented 11 months ago

> Hi @khaianis, your env file has VECTOR_STORE = 'dataheraldvector_storechromaChroma', but it should be VECTOR_STORE = 'dataherald.vector_store.chroma.Chroma'. The dots are required because, when the component is initialized, the string is split on the dots to obtain the class name.
>
> Once you make this change, rebuild your Docker container. Let me know if this worked.

I already have this configuration with the dots; I had just shared an older file. Thanks in advance.

ppmarkus commented 11 months ago

Out of interest, which Python/pip packages do you have installed (they should be listed in the requirements.txt file)?

Your issue seems similar to this one, which was closed: https://github.com/Dataherald/dataherald/issues/174

khaianis commented 11 months ago

I upgraded chroma and updated the code in chroma.py, but I still get:

    class Chroma(VectorStore):
  File "C:\dataherald\dataherald-main\dataherald\vector_store\chroma.py", line 40, in Chroma
    def add_record(self, documents: str, collection: str, metadata: Any, ids: List):
  File "**\lib\site-packages\overrides\overrides.py", line 143, in override
    return _overrides(method, check_signature, check_at_runtime)
  File "*\lib\site-packages\overrides\overrides.py", line 170, in _overrides
    _validate_method(method, super_class, check_signature)
  File "**\lib\site-packages\overrides\overrides.py", line 189, in _validate_method
    ensure_signature_is_compatible(super_method, method, is_static)
  File "***\lib\site-packages\overrides\signature.py", line 103, in ensure_signature_is_compatible
    ensure_all_kwargs_defined_in_sub(
  File "**\lib\site-packages\overrides\signature.py", line 164, in ensure_all_kwargs_defined_in_sub
    raise TypeError(
TypeError: Chroma.add_record: `ids` must be a supertype of `typing.List` but is `typing.List`

ppmarkus commented 11 months ago

It would be useful to see exactly which packages you have installed. I'm assuming you get this error when you try to add a new golden record (and not when asking a question)?

This error is thrown by the overrides package.

The only thing I can really see is that the signature differs slightly between the base class VectorStore and the DH Chroma subclass.

VectorStore

@abstractmethod
def add_record(
    self, documents: str, collection: str, metadata: Any, ids: List = None
):

Chroma

@override
def add_record(self, documents: str, collection: str, metadata: Any, ids: List):

You could try changing the signature in Chroma to match the base class, with the optional = None default for ids.
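The difference between the two signatures can be checked with the standard library alone. The sketch below is not how the overrides package implements its validation, and it does not reproduce the exact type message from the traceback; it only illustrates the mismatch pointed out above, using hypothetical stand-in classes and a hypothetical helper, `params_missing_default`.

```python
import inspect
from abc import ABC, abstractmethod
from typing import Any, List


class VectorStore(ABC):
    @abstractmethod
    def add_record(
        self, documents: str, collection: str, metadata: Any, ids: List = None
    ):
        ...


class Chroma(VectorStore):
    # Signature as quoted above, before the fix: `ids` has no default.
    def add_record(self, documents: str, collection: str, metadata: Any, ids: List):
        return ids


class FixedChroma(VectorStore):
    # Suggested fix: give `ids` the same default as the base class.
    def add_record(self, documents: str, collection: str, metadata: Any, ids: List = None):
        return ids


def params_missing_default(base_method, sub_method) -> list:
    """Hypothetical helper: parameters optional in the base but required in the subclass."""
    base = inspect.signature(base_method).parameters
    sub = inspect.signature(sub_method).parameters
    empty = inspect.Parameter.empty
    return [
        name
        for name, param in sub.items()
        if name in base and base[name].default is not empty and param.default is empty
    ]


print(params_missing_default(VectorStore.add_record, Chroma.add_record))
print(params_missing_default(VectorStore.add_record, FixedChroma.add_record))
```

A subclass that makes an optional parameter required can break callers that rely on the base-class default, which is the kind of incompatibility a signature check is meant to catch.
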

khaianis commented 11 months ago

Hi ppmarkus, thank you, it worked with your modification:

@override
def add_record(self, documents: str, collection: str, metadata: Any, ids: List = None):

With this signature it works. Thanks.

ppmarkus commented 11 months ago

Hi @khaianis, sorry for the late reply. No problem. I suspect that if you rebuild the system from requirements.txt with an appropriate Python version, it should work. I say this because I and others don't experience that error. I think it should be fine to close this issue.