explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

Error code: 400 The response was filtered due to the prompt triggering Azure OpenAI's content management policy #649

Closed · AjitAntony closed this 1 month ago

AjitAntony commented 7 months ago

Describe the bug
Followed the tutorial for RAGAS with Azure OpenAI at https://docs.ragas.io/en/stable/howtos/customisations/azure-openai.html and encountered the error: "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766"

Ragas version: 0.1.1
Python version: 3.10.6

Code to Reproduce

```python
from datasets import load_dataset

amnesty_qa = load_dataset("explodinggradients/amnesty_qa", "english_v2")

from ragas.metrics import (
    context_precision,
    answer_relevancy,
    faithfulness,
    context_recall,
)
from ragas.metrics.critique import harmfulness

# list of metrics we're going to use
metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    harmfulness,
]

azure_configs = {
    "base_url": "https://<your-endpoint>.openai.azure.com/",
    "model_deployment": "your-deployment-name",
    "model_name": "your-model-name",
    "embedding_deployment": "your-deployment-name",
    "embedding_name": "text-embedding-ada-002",  # most likely
}

import os

# assuming you already have your key available via your environment variable. If not, use this
os.environ["AZURE_OPENAI_API_KEY"] = "..."

from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings
from ragas import evaluate

azure_model = AzureChatOpenAI(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs["base_url"],
    azure_deployment=azure_configs["model_deployment"],
    model=azure_configs["model_name"],
    validate_base_url=False,
)

# init the embeddings for answer_relevancy, answer_correctness and answer_similarity
azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs["base_url"],
    azure_deployment=azure_configs["embedding_deployment"],
    model=azure_configs["embedding_name"],
)

result = evaluate(
    amnesty_qa["eval"], metrics=metrics, llm=azure_model, embeddings=azure_embeddings
)

result
```

```
Evaluating:  35%|██████████████████████████████████████████▎ | 35/100 [00:43<01:21, 1.25s/it]
Exception in thread Thread-9:
Traceback (most recent call last):
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\executor.py", line 75, in run
    results = self.loop.run_until_complete(self._aresults())
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 646, in run_until_complete
    return future.result()
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\executor.py", line 63, in _aresults
    raise e
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\executor.py", line 58, in _aresults
    r = await future
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\asyncio\tasks.py", line 571, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\executor.py", line 91, in wrapped_callable_async
    return counter, await callable(*args, **kwargs)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\metrics\base.py", line 91, in ascore
    raise e
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\metrics\base.py", line 87, in ascore
    score = await self._ascore(row=row, callbacks=group_cm, is_async=is_async)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\metrics\critique.py", line 119, in _ascore
    result = await self.llm.generate(
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\llms\base.py", line 110, in generate
    return await loop.run_in_executor(None, generate_text)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\tenacity\__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\tenacity\__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\tenacity\__init__.py", line 314, in iter
    return fut.result()
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 451, in result
    return self.__get_result()
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 403, in __get_result
    raise self._exception
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\tenacity\__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\llms\base.py", line 139, in generate_text
    return self.langchain_llm.generate_prompt(
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain_core\language_models\chat_models.py", line 544, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain_core\language_models\chat_models.py", line 408, in generate
    raise e
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain_core\language_models\chat_models.py", line 398, in generate
    self._generate_with_cache(
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain_core\language_models\chat_models.py", line 577, in _generate_with_cache
    return self._generate(
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain_openai\chat_models\base.py", line 438, in _generate
    response = self.client.create(messages=message_dicts, **params)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\_utils\_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\resources\chat\completions.py", line 663, in create
    return self._post(
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\_base_client.py", line 1200, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\_base_client.py", line 889, in request
    return self._request(
  File "C:\Users\testuser\AppData\Local\Programs\Python\Python310\lib\site-packages\openai\_base_client.py", line 980, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", 'type': None, 'param': 'prompt', 'code': 'content_filter', 'status': 400, 'innererror': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': True, 'severity': 'medium'}}}}}
```

```
ExceptionInRunner                         Traceback (most recent call last)
Cell In[11], line 1
----> 1 result = evaluate(
      2     amnesty_qa["eval"], metrics=metrics, llm=azure_model, embeddings=azure_embeddings
      3 )
      5 result

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\evaluation.py:231, in evaluate(dataset, metrics, llm, embeddings, callbacks, is_async, max_workers, run_config, raise_exceptions, column_map)
    228     if not evaluation_group_cm.ended:
    229         evaluation_rm.on_chain_error(e)
--> 231     raise e
    232 else:
    233     result = Result(
    234         scores=Dataset.from_list(scores),
    235         dataset=dataset,
    236         binary_columns=binary_metrics,
    237     )

File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\ragas\evaluation.py:213, in evaluate(dataset, metrics, llm, embeddings, callbacks, is_async, max_workers, run_config, raise_exceptions, column_map)
    211 results = executor.results()
    212 if results == []:
--> 213     raise ExceptionInRunner()
    215 # convert results to dataset_like
    216 for i, _ in enumerate(dataset):

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass raise_exception=False incase you want to show only a warning message instead.
```

Expected behavior
As per the tutorial, this should have executed successfully and returned the following:
{'faithfulness': 0.7083, 'answer_relevancy': 0.9416, 'context_recall': 0.7762, 'context_precision': 0.8000, 'harmfulness': 0.0000}

Additional context
I suspect the prompts used by the ragas framework are being filtered by Azure OpenAI's content management policy, and there is no option to disable the filtering.

jjmachan commented 7 months ago

yeah the filter is a bummer - have to figure out how to solve it, but the only option right now is to bypass it with `raise_exceptions=False`, sadly :(
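
something like this (untested sketch, reusing `amnesty_qa`, `metrics`, `azure_model` and `azure_embeddings` exactly as in the report above; the flag is named `raise_exceptions` in the `evaluate()` signature shown in the traceback):

```python
from ragas import evaluate

# Bypass workaround: don't abort the whole run when a row trips the Azure
# content filter; the failed rows should just produce a warning instead of
# an ExceptionInRunner.
result = evaluate(
    amnesty_qa["eval"],
    metrics=metrics,
    llm=azure_model,
    embeddings=azure_embeddings,
    raise_exceptions=False,
)
```

the rows that still get blocked won't be scored, so keep that in mind when reading the aggregate numbers.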

Sharvari-Jadhav-ELS commented 6 months ago

Facing the same issue. Will continue to track its resolution. `raise_exceptions=False` is not really helping in my case.

omidreza-amrollahi commented 6 months ago

For me the solution was to turn off the content filter in Azure :D

olitengc commented 6 months ago

I got around it by removing the offending content. Not a long-term solution if the dataset changes, though. Something like this:


```python
from datasets import Dataset, load_dataset

# metrics, azure_model and azure_embeddings as defined in the original report
amnesty_qa = load_dataset("explodinggradients/amnesty_qa", "english_v2")

amnesty_qa_df = amnesty_qa['eval'].to_pandas()
# Drop the offending row
amnesty_qa_df.drop(18, inplace=True)
amnesty_qa_filtered = Dataset.from_pandas(amnesty_qa_df)

result = evaluate(
    amnesty_qa_filtered, metrics=metrics, llm=azure_model, embeddings=azure_embeddings
)
```

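If the dataset keeps changing, here is a rough sketch of the same idea without hard-coding the row index. The `drop_filtered_rows` helper and the probe prompt are made up for illustration, it costs one extra LLM call per row, and a row that passes the probe can still get flagged by one of ragas' own prompts:

```python
import openai
from datasets import Dataset


def drop_filtered_rows(ds: Dataset, llm) -> Dataset:
    """Keep only the rows whose text makes it past Azure's content filter."""
    keep = []
    for i, row in enumerate(ds):
        # Probe the deployment with the row's answer + contexts.
        probe = "Summarise the following text:\n" + row["answer"] + "\n" + " ".join(row["contexts"])
        try:
            llm.invoke(probe)  # azure_model is a langchain AzureChatOpenAI
            keep.append(i)
        except openai.BadRequestError as e:
            if "content_filter" in str(e):
                print(f"Dropping row {i}: blocked by the content filter")
            else:
                raise
    return ds.select(keep)


amnesty_qa_filtered = drop_filtered_rows(amnesty_qa["eval"], azure_model)
result = evaluate(
    amnesty_qa_filtered, metrics=metrics, llm=azure_model, embeddings=azure_embeddings
)
```
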
paolobighignoliaccenture commented 4 months ago

Solved by removing the offending content, thanks!

cirezd commented 4 months ago

> For me the solution was to turn off the content filter in Azure :D

How can you do that? I understood one needs to have a Microsoft-managed subscription in order to turn filters off...

jjmachan commented 1 month ago

> For me the solution was to turn off the content filter in Azure :D

Closing this in favor of that fix. How to do it: https://www.perplexity.ai/search/how-to-turn-off-the-content-fi-tnVIawVOQm.C91hlB3YFSA

github-actions[bot] commented 1 month ago

It seems the issue was answered, closing this now.