Azure-Samples / ai-rag-chat-evaluator

Tools for evaluation of RAG Chat Apps using Azure AI Evaluate SDK and OpenAI
MIT License

Computing gpt based metrics failed with the exception : 'charmap' codec can't encode characters #33

Closed KorbinianBraun4ntt closed 8 months ago

KorbinianBraun4ntt commented 8 months ago

This issue is for a:

- [x] bug report 
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. git clone https://github.com/Azure-Samples/ai-rag-chat-evaluator/
  2. Run python3 -m scripts evaluate --config=example_config.json --numquestions=2

Log messages given by the failure

(INFO) azureml-metrics: [azureml-metrics] ActivityStarted: compute_metrics-qa, ActivityType: ComputeMetrics, CustomDimensions: {'app_name': 'azureml-metrics', 'task_type': 'qa', 'azureml_metrics_run_id': 'XXXXX', 'current_timestamp': 'XXXX'}
(WARNING) azureml.metrics.text.qa.azureml_qa_metrics: LLM related metrics need llm_params to be computed. Computing metrics for ['gpt_groundedness', 'gpt_coherence', 'gpt_relevance']
(INFO) azureml.metrics.common._validation: QA metrics debug: {'y_test_length': 2, 'y_pred_length': 2, 'tokenizer_example_output': 'the quick brown fox jumped over the lazy dog', 'regexes_to_ignore': '', 'ignore_case': False, 'ignore_punctuation': False, 'ignore_numbers': False}
0%| | 0/2 [00:00<?, ?it/s]
(WARNING) azureml.metrics.common.llm_connector._openai_connector: Computing gpt based metrics failed with the exception : 'charmap' codec can't encode characters in position 6-92: character maps to <undefined>
(ERROR) azureml.metrics.common._scoring: Scoring failed for QA metric gpt_groundedness
(ERROR) azureml.metrics.common._scoring: Class: NameError Message: name 'NotFoundError' is not defined
....

Expected/desired behavior

No error: integer values should be returned for the metrics "gpt_groundedness", "gpt_coherence", and "gpt_relevance"

OS and Version?

Windows 10

Versions

azureml-metrics[generative-ai]==0.0.43
azure-ai-generative==1.0.0b2
openai==0.28.1

Other information

Results in eval_results.jsonl: {"question":"...","answer":"...","context":"...","truth":"...","gpt_groundedness":null,"gpt_coherence":null,"gpt_relevance":null}

All relevant files are in UTF-8
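(Editor's note: the files being UTF-8 on disk doesn't help if open() is called without an explicit encoding. On Windows, the default falls back to the locale code page, often cp1252, which cannot represent many characters and raises exactly this 'charmap' UnicodeEncodeError on write. A minimal, self-contained sketch of the safe pattern, using a hypothetical temp file rather than the repo's actual paths:)

```python
import json
import os
import tempfile

# A JSONL record containing non-ASCII characters, similar in shape
# to an eval_results.jsonl row. Without encoding="utf-8", writing
# this on Windows can fail with "'charmap' codec can't encode".
record = {"question": "Größe?", "answer": "42 µm", "gpt_groundedness": None}

path = os.path.join(tempfile.mkdtemp(), "eval_results.jsonl")

# Explicit encoding makes the write platform-independent;
# ensure_ascii=False keeps the characters readable in the file.
with open(path, "w", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Read back with the same explicit encoding.
with open(path, encoding="utf-8") as f:
    loaded = json.loads(f.readline())

print(loaded["question"])
```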

pamelafox commented 8 months ago

I'll try to replicate this today or add more helpful errors. Was this with sample data?

pamelafox commented 8 months ago

Update: I didn't replicate it, but I did realize I'm not explicitly specifying an encoding of "utf-8" when I call open() in various places. I'll send a PR with that change, as it may help; you could also try that change yourself.

pamelafox commented 8 months ago

I've now merged in my change to use encoding="utf-8" everywhere. Could you try that out and see if you're still seeing issues?
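(Editor's note: for readers who want to confirm the failure mode locally, the default text encoding comes from the locale, and Python 3.7+'s UTF-8 mode, enabled via `python -X utf8` or the `PYTHONUTF8=1` environment variable, is another way to force UTF-8 process-wide without touching every open() call. A quick check, a sketch rather than the repo's actual code:)

```python
import locale
import sys

# open() without encoding= uses this value — typically "cp1252" on
# Windows, which is what produces the 'charmap' codec error.
default_encoding = locale.getpreferredencoding(False)
print("Default text encoding:", default_encoding)

# UTF-8 mode forces the default to UTF-8 for the whole process;
# sys.flags.utf8_mode reports whether it is currently enabled.
print("UTF-8 mode enabled:", bool(sys.flags.utf8_mode))
```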

KorbinianBraun4ntt commented 8 months ago

Thank you very much for your quick help. It works now 🙂👍

pamelafox commented 8 months ago

Phew! I'll close this, thanks for raising it.