JingsenZhang opened this issue 3 months ago
Hello, I've been experiencing the same issue, except I'm using an Azure OpenAI instance.
Have you had any luck in solving this?
I am also getting the same error with Azure OpenAI.
Can someone help me get rid of this error?
Can better support for Azure OpenAI be added, please?
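For anyone hitting this with Azure OpenAI: deepeval is pointed at Azure through its CLI configuration rather than through the metric itself. Roughly, from memory of the docs (flag names may differ between versions, so check deepeval --help for your install):

    deepeval set-azure-openai --openai-endpoint=<your_endpoint> \
        --openai-api-key=<your_api_key> \
        --deployment-name=<your_deployment> \
        --openai-api-version=<api_version>

After that, metrics that default to an OpenAI evaluation model should route through the Azure deployment instead.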
Problem: When I use a local Llama-7b model as the evaluation model for the faithfulness calculation, the following error occurs. Is it because this metric does not support smaller models, and only supports closed-source LLMs such as GPT-3.5?
ERROR:
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████| 2/2 [00:03<00:00, 1.60s/it]
Traceback (most recent call last):
  File "/python3.9/site-packages/deepeval/metrics/faithfulness/faithfulness.py", line 241, in _a_generate_truths
    res: Truths = await self.model.a_generate(prompt, schema=Truths)
TypeError: a_generate() got an unexpected keyword argument 'schema'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/deepeval/metrics/utils.py", line 79, in trimAndLoadJson
    return json.loads(jsonStr)
  File "/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/python3.9/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 7 column 1 (char 163)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "test.py", line 19, in <module>
    faith_score = faithfulness_score(generated_list, evidence_list, model_name_or_path, device)
  File "faithfulness.py", line 51, in faithfulness_score
    metric.measure(test_case)
  File "/deepeval/metrics/faithfulness/faithfulness.py", line 59, in measure
    loop.run_until_complete(
  File "python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/deepeval/metrics/faithfulness/faithfulness.py", line 94, in a_measure
    self.truths, self.claims = await asyncio.gather(
  File "/deepeval/metrics/faithfulness/faithfulness.py", line 245, in _a_generate_truths
    data = trimAndLoadJson(res, self)
  File "/deepeval/metrics/utils.py", line 84, in trimAndLoadJson
    raise ValueError(error_str)
ValueError: Evaluation LLM outputted an invalid JSON. Please use a better evaluation model.
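In case it helps anyone landing here: the first TypeError is the root cause, not the model size. Newer deepeval versions call a_generate(prompt, schema=...) on the evaluation model to request structured JSON output, so a custom model wrapper must accept that keyword. Below is a minimal sketch of such a wrapper following the DeepEvalBaseLLM pattern from the docs; the class name LocalLlama, the generation settings, and the naive schema parsing are my own assumptions, not deepeval's reference implementation, and import paths may differ by version.

    from typing import Optional, Type

    from pydantic import BaseModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    from deepeval.models.base_model import DeepEvalBaseLLM


    class LocalLlama(DeepEvalBaseLLM):
        """Hypothetical wrapper exposing a local Llama-7b to deepeval metrics."""

        def __init__(self, model_name_or_path: str, device: str = "cuda"):
            self.device = device
            self.tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
            self.model = AutoModelForCausalLM.from_pretrained(model_name_or_path).to(device)

        def load_model(self):
            return self.model

        def generate(self, prompt: str, schema: Optional[Type[BaseModel]] = None):
            # Accepting the `schema` keyword is what prevents the TypeError above.
            inputs = self.tokenizer(prompt, return_tensors="pt").to(self.device)
            output_ids = self.model.generate(**inputs, max_new_tokens=512)
            # Strip the echoed prompt so only the completion is decoded.
            completion = output_ids[0][inputs["input_ids"].shape[1]:]
            text = self.tokenizer.decode(completion, skip_special_tokens=True)
            if schema is not None:
                # deepeval passes a pydantic class and expects an instance back.
                # Plain parsing still fails on trailing chatter (the "Extra data"
                # JSONDecodeError above); constrained decoding, e.g. with
                # lm-format-enforcer, is the robust fix for small models.
                return schema.model_validate_json(text)
            return text

        async def a_generate(self, prompt: str, schema: Optional[Type[BaseModel]] = None):
            # Synchronous under the hood, which is fine for a local model.
            return self.generate(prompt, schema=schema)

        def get_model_name(self):
            return "Local Llama-7b"

With this, FaithfulnessMetric(model=LocalLlama(model_name_or_path, device)) at least gets past the TypeError. Whether a 7B model reliably emits valid JSON is a separate question, and that is what the final ValueError is complaining about.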