JohnSnowLabs / langtest

Deliver safe & effective language models
http://langtest.org/
Apache License 2.0

Add support for the LLM eval class in the Accuracy category. #1053

Closed. chakravarthik27 closed this 3 weeks ago

chakravarthik27 commented 3 weeks ago
# config.yaml
model_parameters:
  max_tokens: 64
  task: text2text-generation
tests:
  defaults:
    min_pass_rate: 0.65
  robustness:
    add_typo:
      min_pass_rate: 0.7
  accuracy:
    llm_eval:
      hub: huggingface
      min_score: 1.0
      model: prometheus-eval/prometheus-7b-v2.0
      model_parameters:
        max_tokens: 64
        task: text-generation

Harness setup:

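from langtest import Harness

# Evaluate flan-t5-base on MedMCQA using the tests defined in config.yaml
h = Harness(
    task='question-answering',
    model={'model': 'google/flan-t5-base', 'hub': 'huggingface'},
    data={'data_source': 'MedMCQA'},
    config="config.yaml"
)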
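
To make the threshold layout in the config easier to inspect, here is a minimal sketch that mirrors the `tests` section of config.yaml as a Python dict. It is an illustration, not code from this issue: the `threshold_for` helper is hypothetical, and falling back to `tests.defaults` for a test that sets no threshold of its own is an assumption about how the defaults are meant to apply.

```python
# Dict mirror of the `tests` section of config.yaml (values copied from above).
config = {
    "tests": {
        "defaults": {"min_pass_rate": 0.65},
        "robustness": {"add_typo": {"min_pass_rate": 0.7}},
        "accuracy": {
            "llm_eval": {
                "hub": "huggingface",
                "min_score": 1.0,
                "model": "prometheus-eval/prometheus-7b-v2.0",
                "model_parameters": {"max_tokens": 64, "task": "text-generation"},
            }
        },
    }
}


def threshold_for(config, category, test_name):
    """Hypothetical helper: return (threshold_key, value) for one test."""
    tests = config["tests"]
    params = tests.get(category, {}).get(test_name, {})
    # Prefer a threshold set directly on the test.
    for key in ("min_score", "min_pass_rate"):
        if key in params:
            return key, params[key]
    # Assumed fallback: the shared default pass rate.
    return "min_pass_rate", tests["defaults"]["min_pass_rate"]


print(threshold_for(config, "robustness", "add_typo"))
print(threshold_for(config, "accuracy", "llm_eval"))
```

With the harness above, the tests would then typically be exercised through langtest's usual `h.generate().run().report()` sequence.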