langchain-ai / opengpts

MIT License
6.31k stars, 829 forks

Specifying endpoints/deployments for Azure OpenAI embeddings #292

Closed: samuelp-mw closed this issue 2 months ago

samuelp-mw commented 2 months ago

Context

Currently, Azure can be used for both the chatbot and "chat with your data" use cases, but only after adjusting the environment variables locally.

With the following set of environment variables (no OPENAI_API_KEY, since we use Azure only), a simple chat works, but adding data (to the assistant or in the chat itself) results in an error:

ANTHROPIC_API_KEY=placeholder
YDC_API_KEY=placeholder
TAVILY_API_KEY=placeholder
AZURE_OPENAI_DEPLOYMENT_NAME=placeholder
AZURE_OPENAI_API_KEY=placeholder
AZURE_OPENAI_API_BASE=placeholder
AZURE_OPENAI_API_VERSION=placeholder
CONNERY_RUNNER_URL=https://your-personal-connery-runner-url
CONNERY_RUNNER_API_KEY=placeholder
PROXY_URL=your_proxy_url
POSTGRES_PORT=placeholder
POSTGRES_DB=placeholder
POSTGRES_USER=placeholder
POSTGRES_PASSWORD=placeholder
SCARF_NO_ANALYTICS=true

The error occurs when ingesting data:

Traceback (most recent call last):
opengpts-backend   |   File "/usr/local/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
opengpts-backend   |     self.run()
opengpts-backend   |   File "/usr/local/lib/python3.11/multiprocessing/process.py", line 108, in run
opengpts-backend   |     self._target(*self._args, **self._kwargs)
opengpts-backend   |   File "/usr/local/lib/python3.11/site-packages/uvicorn/_subprocess.py", line 76, in subprocess_started
opengpts-backend   |     target(sockets=sockets)
opengpts-backend   |   File "/usr/local/lib/python3.11/site-packages/uvicorn/server.py", line 61, in run
opengpts-backend   |     return asyncio.run(self.serve(sockets=sockets))
opengpts-backend   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
opengpts-backend   |   File "/usr/local/lib/python3.11/asyncio/runners.py", line 190, in run
opengpts-backend   |     return runner.run(main)
opengpts-backend   |            ^^^^^^^^^^^^^^^^
opengpts-backend   |   File "/usr/local/lib/python3.11/asyncio/runners.py", line 118, in run
opengpts-backend   |     return self._loop.run_until_complete(task)
opengpts-backend   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
opengpts-backend   |   File "/usr/local/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
opengpts-backend   |     return future.result()
opengpts-backend   |            ^^^^^^^^^^^^^^^
opengpts-backend   |   File "/usr/local/lib/python3.11/site-packages/uvicorn/server.py", line 68, in serve
opengpts-backend   |     config.load()
opengpts-backend   |   File "/usr/local/lib/python3.11/site-packages/uvicorn/config.py", line 467, in load
opengpts-backend   |     self.loaded_app = import_from_string(self.app)
opengpts-backend   |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
opengpts-backend   |   File "/usr/local/lib/python3.11/site-packages/uvicorn/importer.py", line 21, in import_from_string
opengpts-backend   |     module = importlib.import_module(module_str)
opengpts-backend   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
opengpts-backend   |   File "/usr/local/lib/python3.11/importlib/__init__.py", line 126, in import_module
opengpts-backend   |     return _bootstrap._gcd_import(name[level:], package, level)
opengpts-backend   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
opengpts-backend   |   File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
opengpts-backend   |   File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
opengpts-backend   |   File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
opengpts-backend   |   File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
opengpts-backend   |   File "<frozen importlib._bootstrap_external>", line 940, in exec_module
opengpts-backend   |   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
opengpts-backend   |   File "/backend/app/server.py", line 9, in <module>
opengpts-backend   |     from app.api import router as api_router
opengpts-backend   |   File "/backend/app/api/__init__.py", line 3, in <module>
opengpts-backend   |     from app.api.assistants import router as assistants_router
opengpts-backend   |   File "/backend/app/api/assistants.py", line 7, in <module>
opengpts-backend   |     import app.storage as storage
opengpts-backend   |   File "/backend/app/storage.py", line 6, in <module>
opengpts-backend   |     from app.agent import AgentType, get_agent_executor
opengpts-backend   |   File "/backend/app/agent.py", line 24, in <module>
opengpts-backend   |     from app.tools import (
opengpts-backend   |   File "/backend/app/tools.py", line 27, in <module>
opengpts-backend   |     from app.upload import vstore
opengpts-backend   |   File "/backend/app/upload.py", line 138, in <module>
opengpts-backend   |     vstore = _determine_azure_or_openai_embeddings()
opengpts-backend   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
opengpts-backend   |   File "/backend/app/upload.py", line 66, in _determine_azure_or_openai_embeddings
opengpts-backend   |     embedding_function=AzureOpenAIEmbeddings(),
opengpts-backend   |                        ^^^^^^^^^^^^^^^^^^^^^^^
opengpts-backend   |   File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
opengpts-backend   | pydantic.error_wrappers.ValidationError: 1 validation error for AzureOpenAIEmbeddings
opengpts-backend   | __root__
opengpts-backend   |   Must provide one of the `base_url` or `azure_endpoint` arguments, or the `AZURE_OPENAI_ENDPOINT` environment variable (type=value_error)

A similar problem is reported there (if I am not mistaken):

Resolution

Unfortunately, AzureOpenAIEmbeddings expects AZURE_OPENAI_ENDPOINT to be set (or an explicit azure_endpoint or base_url argument), and the environment only provides AZURE_OPENAI_API_BASE, which causes the error above. Additionally, it might be useful to make the deployment name for embeddings configurable. The suggestion is therefore to read the needed variables and pass them as arguments to AzureOpenAIEmbeddings.
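
A minimal sketch of that suggestion, not the actual patch. It assumes the embeddings class comes from the langchain_openai package and accepts the azure_endpoint, azure_deployment and openai_api_version keyword arguments. The helper name azure_embeddings_kwargs and the variable AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME are hypothetical, introduced here only for illustration.

```python
import os
from typing import Mapping, Optional


def azure_embeddings_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect keyword arguments for AzureOpenAIEmbeddings from the environment.

    Hypothetical helper: the name and the embeddings deployment variable
    are assumptions, not part of opengpts.
    """
    env = os.environ if env is None else env

    # AZURE_OPENAI_API_BASE is what the .env above defines;
    # AZURE_OPENAI_ENDPOINT is what the library looks up by default.
    endpoint = env.get("AZURE_OPENAI_ENDPOINT") or env.get("AZURE_OPENAI_API_BASE")
    if not endpoint:
        raise ValueError(
            "Set AZURE_OPENAI_API_BASE or AZURE_OPENAI_ENDPOINT for Azure embeddings"
        )

    kwargs = {"azure_endpoint": endpoint}

    # Assumed variable name for a separately configurable embeddings deployment.
    deployment = env.get("AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME")
    if deployment:
        kwargs["azure_deployment"] = deployment

    api_version = env.get("AZURE_OPENAI_API_VERSION")
    if api_version:
        kwargs["openai_api_version"] = api_version

    return kwargs


# Possible use in upload.py (requires langchain_openai to be installed):
#   from langchain_openai import AzureOpenAIEmbeddings
#   embeddings = AzureOpenAIEmbeddings(**azure_embeddings_kwargs())
```

Passing the endpoint explicitly means the existing AZURE_OPENAI_API_BASE keeps working, so users would not need to rename anything in their .env file. The API key is left out on purpose, on the assumption that the library reads AZURE_OPENAI_API_KEY from the environment itself.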

mkorpela commented 2 months ago

Thank you for the contribution!