Problem:
When I use a local Llama-7b model for faithfulness calculation, the following error occurs. Is it because this metric does not support smaller models? Does it only support closed-source LLMs such as GPT-3.5?
Traceback (most recent call last):
  File "/python3.9/site-packages/deepeval/metrics/faithfulness/faithfulness.py", line 241, in _a_generate_truths
    res: Truths = await self.model.a_generate(prompt, schema=Truths)
TypeError: a_generate() got an unexpected keyword argument 'schema'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/deepeval/metrics/utils.py", line 79, in trimAndLoadJson
    return json.loads(jsonStr)
  File "/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/python3.9/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 7 column 1 (char 163)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 19, in <module>
    faith_score = faithfulness_score(generated_list, evidence_list, model_name_or_path, device)
  File "faithfulness.py", line 51, in faithfulness_score
    metric.measure(test_case)
  File "/deepeval/metrics/faithfulness/faithfulness.py", line 59, in measure
    loop.run_until_complete(
  File "python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/deepeval/metrics/faithfulness/faithfulness.py", line 94, in a_measure
    self.truths, self.claims = await asyncio.gather(
  File "/deepeval/metrics/faithfulness/faithfulness.py", line 245, in _a_generate_truths
    data = trimAndLoadJson(res, self)
  File "/deepeval/metrics/utils.py", line 84, in trimAndLoadJson
    raise ValueError(error_str)
ValueError: Evaluation LLM outputted an invalid JSON. Please use a better evaluation model.
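For what it's worth, the chained traceback seems to point at the custom model wrapper rather than at model size: the metric first calls a_generate(prompt, schema=Truths), which raises the TypeError because the wrapper's a_generate does not accept a schema keyword; deepeval then falls back to plain text generation and tries to parse the raw Llama-7b output with trimAndLoadJson, which fails because that output is not valid JSON. Below is a minimal sketch of a wrapper whose generate/a_generate accept the schema keyword, assuming deepeval's DeepEvalBaseLLM interface and a Hugging Face causal LM; the class name LocalLlama, the max_new_tokens value, and the bare model_validate_json parse are illustrative assumptions, not deepeval's required implementation.

from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
from deepeval.models import DeepEvalBaseLLM


class LocalLlama(DeepEvalBaseLLM):
    """Sketch of a local-model wrapper; names and defaults are assumptions."""

    def __init__(self, model_name_or_path: str, device: str = "cuda"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
        self.model = AutoModelForCausalLM.from_pretrained(model_name_or_path).to(device)
        self.device = device

    def load_model(self):
        return self.model

    def generate(self, prompt: str, schema: type[BaseModel] | None = None):
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.device)
        output_ids = self.model.generate(**inputs, max_new_tokens=512)
        # Drop the echoed prompt tokens so only the completion is decoded.
        new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
        text = self.tokenizer.decode(new_tokens, skip_special_tokens=True)
        if schema is not None:
            # The metric expects an instance of the pydantic schema back.
            # This naive parse only works if the model actually emits valid
            # JSON -- 7B models often do not, which is exactly the
            # JSONDecodeError ("Extra data") seen in the traceback above.
            return schema.model_validate_json(text)
        return text

    async def a_generate(self, prompt: str, schema: type[BaseModel] | None = None):
        # Accepting the `schema` keyword here is what avoids the TypeError.
        return self.generate(prompt, schema=schema)

    def get_model_name(self):
        return "Local Llama-7b"

If I understand the docs correctly, deepeval also suggests constraining the decoding of local models (e.g., with a tool like lm-format-enforcer) so that smaller models reliably emit schema-conformant JSON; without some form of constrained decoding, the fallback JSON parse will keep failing regardless of which local model is used.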