stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
https://crfm.stanford.edu/helm
Apache License 2.0

Llama3 openai on azure #2761

Closed: abhay-shete closed this 2 weeks ago

abhay-shete commented 2 weeks ago

This pull request makes the following changes:

1. Added support for the Llama3 model and the OpenAI GPT-4 model hosted on Azure.
2. Modified the instructions in the lite run specs so that multiple-choice QA questions output only a character-based index for the answer.
3. API keys and endpoints can now be provided via credentials.conf instead of environment variables.

The sample format is as follows:

```
{
  azureLlama3ApiKey: "<azure llama 3 api key>"
  azureLlama3Endpoint: "<azure llama 3 endpoint>"
  azureLlama3Deployment: ""
  azureOpenAIApiKey: ""
  azureOpenAIEndpoint: ""
  azureOpenAIDeployment: "t"
}
```

Note that the API keys, endpoints, and deployment names for a particular model must share the same prefix. These values are then passed to the constructors of the respective client classes.
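For illustration, a client constructor wired up this way might look like the minimal sketch below; the class name and parameter names are assumptions made for this write-up, not necessarily the exact ones used in the PR.

```python
# Minimal sketch of a client whose constructor receives credentials.
# The class and parameter names are illustrative assumptions; the values
# would come from the credentials.conf keys that share the "azureLlama3"
# prefix (azureLlama3ApiKey, azureLlama3Endpoint, azureLlama3Deployment).
class AzureLlama3Client:
    def __init__(self, api_key: str, endpoint: str, deployment: str):
        self.api_key = api_key
        self.endpoint = endpoint
        self.deployment = deployment
```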

yifanmai commented 2 weeks ago

My overall feedback is that most of the contents of this pull request can live in its own Python package instead of being merged into the main branch. Here's how you would do this:

  1. Create a new package myproject, and run pip install -e . to install your package in editable mode.
  2. Move the following files to your package, preserving the subdirectory structure:
    • src/helm/benchmark/scenarios/*_scenario.py
    • src/helm/clients/azure_*_client.py
  3. Create a new src/helm/benchmark/run_specs/myproject_run_specs.py file in your package and move your run spec functions from lite_run_specs.py into it (see the run spec sketch after this list). Revert the changes in lite_run_specs.py.
  4. Revert changes in auto_client.py.
  5. Move your version of schema_classic.yaml to your working directory and rename it to schema_myproject.yaml.
  6. Move model_metadata.yaml and model_deployments.yaml to the prod_env folder in your working directory.
  7. Move the deployment arguments to client_spec.args in model_deployments.yaml (see the deployment sketch after this list).
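For step 3, a myproject_run_specs.py module might look roughly like the sketch below. The import paths, helper functions, and the scenario class are assumptions modeled on HELM's built-in run spec modules; mirror the exact imports used in lite_run_specs.py.

```python
# Sketch of src/helm/benchmark/run_specs/myproject_run_specs.py.
# Import paths and helper names are assumptions based on HELM's built-in
# run spec modules; check lite_run_specs.py for the exact imports.
from helm.benchmark.adaptation.common_adapter_specs import get_generation_adapter_spec
from helm.benchmark.metrics.common_metric_specs import get_exact_match_metric_specs
from helm.benchmark.run_spec import RunSpec, run_spec_function
from helm.benchmark.scenarios.scenario import ScenarioSpec


@run_spec_function("my_scenario")
def get_my_scenario_spec() -> RunSpec:
    # "myproject.scenarios.my_scenario.MyScenario" is a hypothetical
    # scenario class living in your own package.
    scenario_spec = ScenarioSpec(
        class_name="myproject.scenarios.my_scenario.MyScenario",
        args={},
    )
    return RunSpec(
        name="my_scenario",
        scenario_spec=scenario_spec,
        adapter_spec=get_generation_adapter_spec(input_noun="Question", output_noun="Answer"),
        metric_specs=get_exact_match_metric_specs(),
        groups=["my_scenario"],
    )
```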
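For steps 6 and 7, an entry in prod_env/model_deployments.yaml might look roughly like the following; every name, the tokenizer, and the args keys here are placeholders, and the args must match the client constructor's parameters.

```yaml
# Sketch of an entry in prod_env/model_deployments.yaml. All values are
# placeholders; args keys must match the client constructor parameters.
model_deployments:
  - name: azure/llama-3
    model_name: azure/llama-3
    tokenizer_name: meta-llama/Llama-3-8B  # placeholder tokenizer
    max_sequence_length: 8192
    client_spec:
      class_name: "myproject.clients.azure_llama3_client.AzureLlama3Client"
      args:
        endpoint: "<azure llama 3 endpoint>"
        deployment: "<azure llama 3 deployment>"
```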

You should now be able to run helm-run with your scenarios and clients. You should run helm-summarize with the --schema-path schema_myproject.yaml flag.
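Concretely, the two commands might look like the following; the run entry and suite name are placeholders, and depending on your HELM version the first flag may be --run-specs rather than --run-entries.

```
helm-run --run-entries "my_scenario:model=azure/llama-3" --suite my-suite --max-eval-instances 10
helm-summarize --suite my-suite --schema-path schema_myproject.yaml
```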