Azure-Samples / azure-openai-raft


Computing the Evaluation Metrics error: "BadRequestError(Unsupported data type)" #2

Open ahotrod opened 1 week ago

ahotrod commented 1 week ago

Awesome Azure OpenAI RAFT sample, thanks for sharing!

In notebook 3_raft_evaluation.ipynb, section _5. Computing the evaluation metrics for both models_, I get multiple instances of the following error message:

 **_Exception raised in Job[150]: BadRequestError(Unsupported data type)_**

An initial search suggested it might be caused by an older embedding model, i.e. the Azure OpenAI text-embedding-ada-002 I was using. I upgraded to the more recent Azure OpenAI text-embedding-3-large and received the same error messages.

I'll keep searching for a solution. Any ideas for a fix for this issue?

Thanks again, Dennis

ahotrod commented 3 days ago

Received the following warning in notebook 3_raft_evaluation.ipynb, cells #14 & #17:

SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

The warning was resolved with the following change in cell #14:

baseline_df = test_df.copy(deep=True)

baseline_df = baseline_df[['baseline_final_answer',
                           'context',
                           'gold_final_answer',
                           'question']]

# ---- Was originally ----:
baseline_df = test_df[['baseline_final_answer',
                       'context',
                       'gold_final_answer',
                       'question']]

finetuned_df in cell #17 was changed similarly.
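For context, the same warning and fix can be reproduced on a miniature stand-in DataFrame (the column names match the notebook; the data and the extra column are made up for illustration). Subsetting the columns and then taking an explicit copy() gives the new frame its own data, so later in-place assignments no longer trigger SettingWithCopyWarning:

```python
import pandas as pd

# Hypothetical miniature stand-in for test_df.
test_df = pd.DataFrame({
    "question": ["q1", "q2"],
    "context": ["c1", "c2"],
    "gold_final_answer": ["g1", "g2"],
    "baseline_final_answer": ["b1", "b2"],
    "extra_column": [0, 1],
})

# Subset the columns, then take an explicit copy so baseline_df
# owns its data instead of being a (possible) view of test_df.
baseline_df = test_df[["baseline_final_answer", "context",
                       "gold_final_answer", "question"]].copy()

# Safe: modifying baseline_df neither warns nor touches test_df.
baseline_df.loc[0, "context"] = "c1-cleaned"
```

This is equivalent to the two-step deep-copy version above; .copy() after the column selection is just the more compact idiom.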

The issue BadRequestError(Unsupported data type) noted above remains.

The ragas evaluate() call:

baseline_result = evaluate(dataset, metrics=metrics, llm=azure_model, embeddings=azure_embeddings)

returns:

{'faithfulness': 0.8244, 'answer_relevancy': nan, 'answer_similarity': nan, 'answer_correctness': nan}
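One way to read this partial result: faithfulness is scored by the judge LLM alone, while the three nan metrics all depend on the embeddings client, which points at the embeddings configuration rather than the judge model. A small sketch (metric values copied from the result above) that isolates the failing group:

```python
import math

# Result returned by ragas evaluate(), copied from above.
baseline_result = {
    "faithfulness": 0.8244,
    "answer_relevancy": float("nan"),
    "answer_similarity": float("nan"),
    "answer_correctness": float("nan"),
}

# Metrics that came back nan; in this run they are exactly the
# embedding-dependent ones, so the LLM-only metric succeeded while
# every call through azure_embeddings failed.
failed = [m for m, v in baseline_result.items() if math.isnan(v)]
```

This narrows the BadRequestError to the AzureOpenAIEmbeddings setup, consistent with the fix posted below.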

ahotrod commented 1 day ago

The issue BadRequestError(Unsupported data type) is fixed by changing the azure_embeddings definition:

Was:

azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version = "2024-02-01",
    azure_endpoint     = judge_model_endpoint,
    azure_deployment   = embedding_model_deployment,
    api_key            = judge_model_api_key)

Changed to:

azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version = "2024-08-01-preview",
    azure_endpoint     = embedding_model_endpoint,
    azure_deployment   = embedding_model_deployment,
    openai_api_key     = embedding_model_api_key,
    validate_base_url  = False)

With Azure's default hyperparameters (none initially specified, as suggested):

baseline_result: {'faithfulness': 0.7465, 'answer_correctness': 0.5884, 'answer_relevancy': 0.8760, 'answer_similarity': 0.8944}

ft_result: {'faithfulness': 0.7111, 'answer_correctness': 0.7436, 'answer_relevancy': 0.9236, 'answer_similarity': 0.9467}
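For anyone comparing the two runs, a quick tabulation of the per-metric deltas (values copied from the results above; this sketch is not part of the notebook):

```python
import pandas as pd

# Metric values copied from the baseline and fine-tuned runs above.
baseline = {"faithfulness": 0.7465, "answer_correctness": 0.5884,
            "answer_relevancy": 0.8760, "answer_similarity": 0.8944}
ft = {"faithfulness": 0.7111, "answer_correctness": 0.7436,
      "answer_relevancy": 0.9236, "answer_similarity": 0.9467}

# One row per metric; delta > 0 means the fine-tuned model scored higher.
cmp = pd.DataFrame({"baseline": baseline, "fine_tuned": ft})
cmp["delta"] = (cmp["fine_tuned"] - cmp["baseline"]).round(4)
```

The fine-tuned model improves on every metric except faithfulness, with the largest gain on answer_correctness.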

ahotrod commented 13 hours ago

With no hyperparameters specified, Azure picked the defaults:

(attached screenshot: Results_default)