Closed — younes-io closed this issue 11 months ago
Hi @Googleton: could you please help with this? Any ideas?
Hello @younes-io
Could you provide a fuller example? We are missing things such as memory and doc_retriever, so we are not able to run your code and check why the bug happens.
Hi @younes-io, thanks for reporting this, I’m having a look!
@mattbit : here you go ! Thank you :)
import os
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chat_models.azure_openai import AzureChatOpenAI
from langchain.vectorstores.opensearch_vector_search import OpenSearchVectorSearch
from langchain.memory import ConversationTokenBufferMemory, ConversationBufferWindowMemory
from database import add_new_message, get_chat_history, get_user_by_email, rate_message, get_template_by_country, get_last_conversation
from handlers import fetch_hierarchy_and_values_for_country, build_opensearch_query, transform_payload_to_user_profile

# Configuration values (openai_api_key, openai_api_base, openai_api_type,
# openai_api_version, model_name, database_url, embedding_model, index_docs,
# opensearch_url, auth) are loaded elsewhere.

def call_qa(user_profile):
    print("calling call_qa..")

    # Define Azure OpenAI component
    llm = AzureChatOpenAI(
        openai_api_key=openai_api_key,
        openai_api_base=openai_api_base,
        openai_api_type=openai_api_type,
        openai_api_version=openai_api_version,
        deployment_name=model_name,
        temperature=0,
    )
    print("llm._default_params : ", llm._default_params)

    import uuid
    conversation_id = uuid.uuid4()
    print("Conversation ID = " + str(conversation_id))

    # Memory, the Postgres way
    from langchain.memory import PostgresChatMessageHistory
    history = PostgresChatMessageHistory(
        connection_string=database_url,
        session_id=str(conversation_id),
    )
    output_key = "answer"
    input_key = "question"
    memory_key = "history"
    memory = ConversationBufferWindowMemory(
        memory_key=memory_key,
        input_key=input_key,
        output_key=output_key,
        return_messages=True,
        chat_memory=history,
        k=2,
    )

    template = get_template_by_country(user_profile['Country'])
    print("Country ==> ", user_profile['Country'])

    from langchain.prompts import (
        ChatPromptTemplate,
        MessagesPlaceholder,
        SystemMessagePromptTemplate,
        HumanMessagePromptTemplate,
    )
    prompt = ChatPromptTemplate(
        messages=[
            SystemMessagePromptTemplate.from_template(template),
            MessagesPlaceholder(variable_name="history"),
            HumanMessagePromptTemplate.from_template("{question}"),
        ]
    )

    # Chain
    from langchain.chains import RetrievalQAWithSourcesChain
    chain_type_kwargs = {"prompt": prompt}

    # Build a Retriever
    embeddings = OpenAIEmbeddings(deployment=embedding_model, chunk_size=1)
    docsearch = OpenSearchVectorSearch(
        index_name=index_docs,
        embedding_function=embeddings,
        opensearch_url=opensearch_url,
        http_auth=('user', auth),
    )
    client = docsearch.client

    # Fetch hierarchy and values for USA
    query = ...  # a complex OpenSearch query run on documents (elided)
    filter_kwargs = {'filter': query}
    doc_retriever = docsearch.as_retriever(search_kwargs=filter_kwargs)
    print("doc_retriever.search_kwargs == ", doc_retriever.search_kwargs)

    qa = RetrievalQAWithSourcesChain.from_chain_type(
        memory=memory,
        llm=llm,
        chain_type="stuff",
        retriever=doc_retriever,
        return_source_documents=True,
        verbose=True,
        chain_type_kwargs=chain_type_kwargs,
    )
    return qa

user_input = "May I drink alcohol in the office ?"
qa = call_qa({"Country": "UK"})
response = qa({"question": user_input})
Hello @younes-io
Which version of the openai API do you have in your Python environment? The error you got is related to engine, which is deprecated: https://help.openai.com/en/articles/6283125-what-happened-to-engines
Could you try to upgrade the lib using pip install openai --upgrade?
@kevinmessiaen :
1 - If the issue is with engine, then why does the following work very well?
qa = call_qa({"Country": "UK"})
response = qa({"question": user_input})
The qa chain executes successfully.
Besides, the error mentions both engine and deployment_id: Must provide an 'engine' or 'deployment_id' parameter to create a <class 'openai.api_resources.chat_completion.ChatCompletion'>
I want to debug this further, but I don't know how TestResult.execute() works; I couldn't find its code.
2 - The openai version is:
@kevinmessiaen
Also, BTW, I'm using Azure OpenAI and I do provide a deployment_name at the beginning of the code
@younes-io
The issue actually comes from the fact that we are using the openai API to run the tests, and it is using your environment variables mixed with our settings (hence why it's complaining about not having any engine nor deployment_id): https://github.com/Giskard-AI/giskard/blob/6f72f9d7753c619dbbe48f5b88328f10ede35524/giskard/llm/client/openai.py#L104
We run the evaluation of the generated answer (produced by your qa retrieval chain using Azure) through OpenAI: https://github.com/Giskard-AI/giskard/blob/6f72f9d7753c619dbbe48f5b88328f10ede35524/giskard/llm/evaluators/base.py#L87
A temporary fix would be for you to rename the environment variables in order to avoid conflicts:
openai_api_base = os.environ['AZURE_OPENAI_API_BASE']
openai_api_key = os.environ['AZURE_OPENAI_API_KEY']
openai_api_type = os.environ['AZURE_OPENAI_API_TYPE']
openai_api_version = os.environ['AZURE_OPENAI_API_VERSION']
export OPENAI_API_KEY=sk-...
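To illustrate the renaming idea above, here is a minimal sketch of how the AZURE_-prefixed variables could be read and passed explicitly to the Azure chain, so they no longer collide with the OPENAI_* variables that Giskard's OpenAI client picks up. The placeholder values are illustrative only:

```python
import os

# Illustrative placeholder values -- replace with your real Azure credentials.
os.environ.setdefault("AZURE_OPENAI_API_BASE", "https://example.openai.azure.com")
os.environ.setdefault("AZURE_OPENAI_API_KEY", "azure-key-placeholder")
os.environ.setdefault("AZURE_OPENAI_API_TYPE", "azure")
os.environ.setdefault("AZURE_OPENAI_API_VERSION", "2023-07-01-preview")

# Read the renamed (AZURE_-prefixed) variables so they no longer collide
# with the OPENAI_* variables used by the test runner.
azure_config = {
    "openai_api_base": os.environ["AZURE_OPENAI_API_BASE"],
    "openai_api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "openai_api_type": os.environ["AZURE_OPENAI_API_TYPE"],
    "openai_api_version": os.environ["AZURE_OPENAI_API_VERSION"],
}

# The Azure chain then receives its credentials explicitly, e.g.:
# llm = AzureChatOpenAI(deployment_name=model_name, temperature=0, **azure_config)
print(azure_config["openai_api_type"])
```

This keeps the Azure credentials out of the variables that the openai library reads implicitly.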
@kevinmessiaen : If I rename them, what do I use for what? Does that mean I have to organize this in two sets: one prefixed AZURE_OPENAI_ and another prefixed OPENAI_? BTW, I only have Azure OpenAI creds, I don't have OpenAI creds. It's still confusing, could you please clarify?
@younes-io
Yes, that's right; for now you will have to organize it that way. Here is how it works under the hood:
1. Your qa chain runs using Azure (or any model you provided).
2. The evaluation runs on GPT-4 through OpenAI (this is currently not customizable).
Basically, we are using GPT-4 to validate that the text generated by your Azure qa model passes the criteria that you provided. We do not provide an option for this yet, but it might be possible in the future.
You can still run other tests, such as Prompt Injection, that do not rely on evaluating answers through GPT-4.
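The two-model flow described above can be sketched with stub functions; the names here are illustrative and are not Giskard's actual API:

```python
# Sketch of the two-model flow: one model generates, another evaluates.

def generate_answer(question: str) -> str:
    """Stands in for the user's qa chain, which runs on Azure OpenAI."""
    return f"Stub answer to: {question}"

def evaluate_answer(answer: str, criteria: str) -> bool:
    """Stands in for Giskard's evaluator, which calls GPT-4 via OpenAI.

    The real evaluator sends the generated output (plus model name,
    description and dataset) to OpenAI and asks GPT-4 whether the
    answer meets the criteria; here we use a trivial placeholder check.
    """
    return criteria.lower() not in answer.lower()

answer = generate_answer("May I drink alcohol in the office?")
passed = evaluate_answer(answer, criteria="harmful content")
print(passed)  # True with these stubs
```

The key point is that the two steps use separate credentials: the generator needs Azure credentials, while the evaluator needs an OpenAI key, which is why both sets must coexist without clashing.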
@kevinmessiaen : okay, that's clearer... the issue is that I don't have an OpenAI key :/ I have keys provided by Azure only
@kevinmessiaen : if I don't have an OpenAI key, does that mean I'm blocked and I'll need to drop Giskard, at least, for this usecase ?
@kevinmessiaen I did as you suggested and here's the outcome:
print("BEFORE openai.api_key ", openai.api_key)
openai.api_key = "*************************"  # API key provided by Azure OpenAI
print("model_name ", model_name)
print("openai_api_type ", openai.api_type)
print("openai.api_version ", openai.api_version)
print("AFTER openai.api_key ", openai.api_key)
res = my_test.execute()
print(res)
# res = my_test.
assert res.passed
assert res.metric == 0
assert res.output_df is None
I get : LLMConfigurationError: Could not authenticate with OpenAI API. Please make sure you have configured the API key by setting OPENAI_API_KEY in the environment.
BEFORE openai.api_key None
model_name custom-gpt-35-turbo
openai_api_type open_ai
openai.api_version None
AFTER openai.api_key xx**************************xxxxx
2023-11-16 17:51:56,643 pid:9640 MainThread openai INFO error_code=invalid_api_key error_message='Incorrect API key provided: ddddddddd******************ddddd. You can find your API key at https://platform.openai.com/account/api-keys.' error_param=None error_type=invalid_request_error message='OpenAI API error received' stream_error=False
**LLMConfigurationError: Could not authenticate with OpenAI API. Please make sure you have configured the API key by setting OPENAI_API_KEY in the environment.**
It would be very useful if you mentioned this OpenAI coupling in your documentation. It would save a lot of time for those who cannot use OpenAI, e.g. for data privacy reasons.
Unfortunately, the scan is coupled with OpenAI and won't work without it. We are working on removing this coupling in future releases.
Thanks for pointing out that the documentation is not clear enough about the fact that we rely on OpenAI and that some data is sent to it (the generated output, model name and description, as well as the provided dataset); I'll update it.
Hello @younes-io
Per the following PR you will be able to run the scan using Azure OpenAI by setting the following environment variables:
export AZURE_OPENAI_API_KEY=AZURE_OPENAI_API_KEY
export AZURE_OPENAI_ENDPOINT=https://xxx.openai.azure.com
export OPENAI_API_VERSION=2023-07-01-preview
export GISKARD_SCAN_LLM_MODEL=my-gpt-4-model
The scan still requires a model capable of function calls. It is advised to use GPT-4, even though it technically works with GPT-3.5.
You can preview the feature using pip install "giskard[llm]@git+https://github.com/Giskard-AI/giskard.git@feature/gsk-2177-add-a-way-to-support-azure-on-llm-scan"
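For anyone setting this up from a Python script rather than a shell, the environment variables listed above could be set before launching the scan, as in this sketch (the values are placeholders; substitute your own deployment details):

```python
import os

# Environment expected by the Azure-enabled preview described above.
# All values are placeholders -- use your own key, endpoint and deployment.
os.environ["AZURE_OPENAI_API_KEY"] = "azure-key-placeholder"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://xxx.openai.azure.com"
os.environ["OPENAI_API_VERSION"] = "2023-07-01-preview"
os.environ["GISKARD_SCAN_LLM_MODEL"] = "my-gpt-4-model"

# With these in place, the scan can then be launched as usual, e.g.:
# import giskard
# report = giskard.scan(model, dataset)

for var in ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT",
            "OPENAI_API_VERSION", "GISKARD_SCAN_LLM_MODEL"):
    print(var, "is set:", var in os.environ)
```

Setting them in-process like this only affects the current Python interpreter, which can be handy in notebooks where export is not available.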
Hi @kevinmessiaen, thank you for the PR! Alright, I'll check that out.
Issue Type
Bug
Giskard Library Version
2.0.3
Giskard Hub Version
N/A
OS Platform and Distribution
No response
Python version
3.11
Installed python packages
No response