same issue here :(
same here.
Hi @assafelovic - I can't get azure openai working either - are there any fixes for this?
Here's the error in full:
OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
File <command-4255151024986968>, line 1
----> 1 researcher = GPTResearcher(query, report_type)
2 research_result = await researcher.conduct_research()
3 report = await researcher.write_report()
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-ba48ef54-2a9d-4237-8474-ed916a3ab15c/lib/python3.11/site-packages/gpt_researcher/master/agent/master.py:63, in GPTResearcher.__init__(self, query, report_type, report_format, report_source, tone, source_urls, documents, vector_store, vector_store_filter, config_path, websocket, agent, role, parent_query, subtopics, visited_urls, verbose, context, headers, max_subtopics)
61 self.research_costs = 0.0
62 self.retrievers = get_retrievers(self.headers, self.cfg)
---> 63 self.memory = Memory(
64 getattr(self.cfg, 'embedding_provider', None), self.headers)
66 # Initialize components
67 self.research_conductor = ResearchConductor(self)
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-ba48ef54-2a9d-4237-8474-ed916a3ab15c/lib/python3.11/site-packages/gpt_researcher/memory/embeddings.py:35, in Memory.__init__(self, embedding_provider, headers, **kwargs)
9 _embeddings = None
10 headers = headers or {}
11 match embedding_provider:
12 case "ollama":
13 from langchain_community.embeddings import OllamaEmbeddings
14
15 _embeddings = OllamaEmbeddings(
16 model=os.environ["OLLAMA_EMBEDDING_MODEL"],
17 base_url=os.environ["OLLAMA_BASE_URL"],
18 )
19 case "custom":
20 from langchain_openai import OpenAIEmbeddings
21
22 _embeddings = OpenAIEmbeddings(
23 model=os.environ.get("OPENAI_EMBEDDING_MODEL", "custom"),
24 openai_api_key=headers.get(
25 "openai_api_key", os.environ.get("OPENAI_API_KEY", "custom")
26 ),
27 openai_api_base=os.environ.get(
28 "OPENAI_BASE_URL", "[http://localhost:1234/v1](https://adb-3064461014387646.6.azuredatabricks.net/editor/notebooks/%3Ca%20class=)" target="_blank" rel="noopener noreferrer">[http://localhost:1234/v1</a></span><span>"</span>](http://localhost:1234/v1%3C/a%3E%3C/span%3E%3Cspan%3E"%3C/span%3E)
29 ), # default for lmstudio
30 check_embedding_ctx_length=False,
31 ) # quick fix for lmstudio
32 case "openai":
33 from langchain_openai import OpenAIEmbeddings
34
---> 35 _embeddings = OpenAIEmbeddings(
36 openai_api_key=headers.get("openai_api_key")
37 or os.environ.get("OPENAI_API_KEY"),
38 model=OPENAI_EMBEDDING_MODEL
39 )
40 case "azure_openai":
41 from langchain_openai import AzureOpenAIEmbeddings
42
43 _embeddings = AzureOpenAIEmbeddings(
44 deployment=os.environ["AZURE_EMBEDDING_MODEL"], chunk_size=16
45 )
46 case "huggingface":
47 from langchain.embeddings import HuggingFaceEmbeddings
48
49 # Specifying the Hugging Face embedding model all-MiniLM-L6-v2
50 _embeddings = HuggingFaceEmbeddings(
51 model_name="sentence-transformers/all-MiniLM-L6-v2"
52 )
53
54 case _:
55 raise Exception("Embedding provider not found.")
57 self._embeddings = _embeddings
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-ba48ef54-2a9d-4237-8474-ed916a3ab15c/lib/python3.11/site-packages/pydantic/v1/main.py:339, in BaseModel.__init__(__pydantic_self__, **data)
333 """
334 Create a new model by parsing and validating input data from keyword arguments.
335
336 Raises ValidationError if the input data cannot be parsed to form a valid model.
337 """
338 # Uses something other than `self` the first arg to allow "self" as a settable attribute
--> 339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
340 if validation_error:
341 raise validation_error
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-ba48ef54-2a9d-4237-8474-ed916a3ab15c/lib/python3.11/site-packages/pydantic/v1/main.py:1100, in validate_model(model, input_data, cls)
1098 continue
1099 try:
-> 1100 values = validator(cls_, values)
1101 except (ValueError, TypeError, AssertionError) as exc:
1102 errors.append(ErrorWrapper(exc, loc=ROOT_KEY))
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-ba48ef54-2a9d-4237-8474-ed916a3ab15c/lib/python3.11/site-packages/langchain_openai/embeddings/base.py:342, in OpenAIEmbeddings.validate_environment(cls, values)
340 values["http_client"] = httpx.Client(proxy=values["openai_proxy"])
341 sync_specific = {"http_client": values["http_client"]}
--> 342 values["client"] = openai.OpenAI(
343 **client_params, **sync_specific
344 ).embeddings
345 if not values.get("async_client"):
346 if values["openai_proxy"] and not values["http_async_client"]:
File /local_disk0/.ephemeral_nfs/envs/pythonEnv-ba48ef54-2a9d-4237-8474-ed916a3ab15c/lib/python3.11/site-packages/openai/_client.py:105, in OpenAI.__init__(self, api_key, organization, project, base_url, timeout, max_retries, default_headers, default_query, http_client, _strict_response_validation)
103 api_key = os.environ.get("OPENAI_API_KEY")
104 if api_key is None:
--> 105 raise OpenAIError(
106 "The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable"
107 )
108 self.api_key = api_key
110 if organization is None:
Have you tried setting the 'OPENAI_API_KEY' env var as well? Unfortunately I don't use Azure, so we need the community to help with this. Can you also raise this on Discord?
Yes, I tried, and that doesn't help. Error code: 401 - {'error': {'message': 'Incorrect API key provided: d8****xxxxx. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
@danieldekay is our local Azure expert & left some docs here :)
Is it working for you on master, Dan?
I can confirm it doesn't work (anymore).
From the error message it looks like it's trying to use OpenAI, and not AzureOpenAI -- so it must take a wrong turn somewhere.
I suspect it has something to do with the "new" way of specifying models and providers. However, the deprecation warning doesn't show for me when running the webserver via
python -m uvicorn main:app --reload
@assafelovic, I don't have the time to debug this in detail.
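In the meantime, here's a quick probe to see which provider the config actually resolves, which should show where it takes the wrong turn. The Config import path and the embedding_provider attribute come from the traceback above; the *_llm_provider attribute names are assumptions and may differ per version:

```python
# Probe of the resolved providers; attribute names besides embedding_provider
# are assumptions -- adjust to whatever your installed version exposes.
from gpt_researcher.config import Config

cfg = Config()
print("embedding_provider:", getattr(cfg, "embedding_provider", None))
print("fast_llm_provider:", getattr(cfg, "fast_llm_provider", None))
print("smart_llm_provider:", getattr(cfg, "smart_llm_provider", None))
```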
We need to pass a deployment name, which could be anything (and not just the model name), and it could differ between the fast and smart models, so we need a third parameter besides provider and model_name:
elif provider == "azure_openai":
    _check_pkg("langchain_openai")
    from langchain_openai import AzureChatOpenAI

    llm = AzureChatOpenAI(
        azure_deployment=os.environ["DEPLOYMENT_NAME_FOR_MODEL"], **kwargs
    )
We could imagine a syntax in .env of model_provider:model_name:deployment_name:
FAST_LLM="azure_openai:gpt-4o:gpt-4o"
SMART_LLM="azure_openai:gpt-4o:gpt-4o"
I think your issue relates using deprecated env vars. Check this updated doc: https://docs.gptr.dev/docs/gpt-researcher/llms/llms
@assafelovic - yes, and no. Azure requires a deployment AND a model to work, and the deployment is not implicitly taken from an environment variable like the API key is.
So it must be provided by the generic model factory.
Even when using FAST_LLM instead of FAST_LLM_MODEL, the issue remains the same: it expects only an OpenAI API key and no longer takes the Azure OpenAI route.
@danieldekay I think the FAST_LLM parameter is the deployment name. But in any case, it fails earlier in the logic, before routing into the Azure logic.
Thanks @kairoswealth, my best suggestion for now is to roll back to a version that worked for you until we get it sorted. @danieldekay any chance you can help take a look? I don't use Azure.
@assafelovic, I am on it, but need some help. Azure needs an API key and a model name, but also a model deployment for each model. These must be given to AzureChatOpenAI.
And since we are differentiating between smart and fast models, we need two different deployments.
However, the decision whether a model is used as "smart" or "fast" is made in the report generation, not in the model factory. I also see that llm_kwargs are never set in the config at all. Did I just not find it, or is it not used anywhere?
@danieldekay for now only the smart LLM is used; we changed the summarization logic.
Tested with the current master branch and #918 - you can close this issue.
FAST_LLM="azure_openai:gpt-4o-mini"
SMART_LLM="azure_openai:gpt-4o"
EMBEDDING="azure_openai:text-embedding-3-large"
Rockstar @danieldekay 👏
Can you show the whole .env, @danieldekay? I still get the same error on my side.
EMBEDDING_PROVIDER=azure_openai
LLM_PROVIDER = azure_openai
AZURE_OPENAI_API_KEY=xxxxxxx
AZURE_OPENAI_ENDPOINT=https://xxxxxxx.openai.azure.com/
OPENAI_API_VERSION=2023-03-15-preview
FAST_LLM=azure_openai:gpt-4o_mini
SMART_LLM=azure_openai:gpt-4o
AZURE_EMBEDDING_MODEL=azure_openai:text-embedding-3-large
@kairoswealth - can you post the error and a stack trace?
@HusainMkd - your error is related to the embeddings model: did you check with the current master? Embeddings now also follow the new syntax; see the comment above.
Can someone please clarify how this is supposed to work? These Azure tickets are closed, but I see conflicting remarks between the comments and the official documentation.
The official docs, which were edited with this PR, say to use the env var EMBEDDING, though you all mention AZURE_EMBEDDING_MODEL. There is no mention of where OPENAI_API_VERSION comes from.
The official docs also state that FAST_LLM and SMART_LLM start with 'openai', as opposed to the comments here, which say to use 'azure_openai'.
Thanks for all the work, and double thanks for any clarification.
After some trial and error, I did get it to work with Azure. I used the following env vars. Please note, I would get an error and ultimately a failure if I did not set both OPENAI_API_VERSION and AZURE_OPENAI_API_VERSION.
Also, the API version that works for me is different, which is understandable; this is an Azure thing and has nothing to do with gpt-researcher.
EMBEDDING="azure_openai:text-embedding-3-small"
AZURE_OPENAI_API_KEY=
AZURE_OPENAI_ENDPOINT=
OPENAI_API_VERSION=2024-02-15-preview
AZURE_OPENAI_API_VERSION=2024-02-15-preview
FAST_LLM=azure_openai:gpt-4o-mini # note that the deployment name must be the same as the model name
SMART_LLM=azure_openai:gpt-4o # note that the deployment name must be the same as the model name
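As a sanity check outside gpt-researcher, a minimal standalone call with these env vars exported (assuming langchain_openai is installed; the deployment names here mirror the model names, per the note above):

```python
# Minimal standalone Azure check, independent of gpt-researcher. Assumes
# AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, and AZURE_OPENAI_API_VERSION
# are exported and the deployments exist in your Azure resource.
import os
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings

llm = AzureChatOpenAI(
    azure_deployment="gpt-4o",  # deployment name == model name, as noted above
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
)
embeddings = AzureOpenAIEmbeddings(
    azure_deployment="text-embedding-3-small",
    api_version=os.environ["AZURE_OPENAI_API_VERSION"],
)
print(llm.invoke("ping").content)
print(len(embeddings.embed_query("ping")))
```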
Looks like https://github.com/assafelovic/gpt-researcher/blob/eaa3217303197e50230b17a354bd3727c8a3dfd5/gpt_researcher/memory/embeddings.py#L50 uses AZURE_OPENAI_API_VERSION explicitly, whereas langchain uses OPENAI_API_VERSION (https://python.langchain.com/docs/integrations/llms/azure_openai/).
Proposal: change the gpt-researcher code to also use OPENAI_API_VERSION for consistency.
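A one-line sketch of that proposal, applied wherever embeddings.py reads the version (the exact surrounding code may differ):

```python
# Sketch of the proposed fallback: prefer AZURE_OPENAI_API_VERSION, but accept
# the OPENAI_API_VERSION name that langchain documents, for consistency.
import os

api_version = os.environ.get(
    "AZURE_OPENAI_API_VERSION", os.environ.get("OPENAI_API_VERSION")
)
```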
Describe the bug: When trying to use Azure OpenAI by setting the .env file, I get the following issue:
"Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)"
To Reproduce: clone the repo and create .env:
EMBEDDING_PROVIDER=azure_openai
LLM_PROVIDER = azure_openai
AZURE_OPENAI_API_KEY=xxxxxxx
AZURE_OPENAI_ENDPOINT=https://xxxxxxx.openai.azure.com/
OPENAI_API_VERSION=2023-03-15-preview
FAST_LLM_MODEL=gpt-4o
SMART_LLM_MODEL=gpt-4o
AZURE_EMBEDDING_MODEL=text-embedding-3-large
TAVILY_API_KEY=xxxxxx
Expected behavior: No error.
EDIT: also tried with the FAST_LLM and SMART_LLM env variable names; same result.