[x] Test if the LLM eval creation and running works in the UI.
[x] Test if the CLI version works.
[x] Describe how to set up Ollama in the README.
Next TODO:
[ ] Make the CLI commands fail on the first error so they behave like a regular CLI (disable the Flask context); see the sketch below this list.
[ ] Create and describe in detail an issue for adding an end-to-end test that mocks the Ollama web server.
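A minimal sketch of the Flask-context point above, assuming the command ends up as an ordinary Click command (run_llm_eval below is a stand-in, not factgenie's actual command object): invoking it with standalone_mode=False makes Click re-raise exceptions instead of catching them, so the run aborts on the first error.

    import click

    @click.command()
    @click.option("--campaign_name", required=True)
    def run_llm_eval(campaign_name):
        # stand-in command body that fails
        raise RuntimeError("evaluation failed")

    # standalone_mode=False: Click propagates the exception to the caller
    # instead of printing it and calling sys.exit().
    run_llm_eval.main(["--campaign_name", "demo"], standalone_mode=False)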
The CLI usage
factgenie run-llm-eval --help
2024-06-29 14:05:28 INFO Application ready
Usage: factgenie run-llm-eval [OPTIONS]

  Runs the LLM evaluation from the CLI with no web server.

Options:
  --campaign_name TEXT      [required]
  --dataset_name TEXT       [required]
  --split TEXT              [required]
  --llm_output_name TEXT    [required]
  --llm_metric_config TEXT  Path to the metric config file or just the metric
                            name.  [required]
  --help                    Show this message and exit.
Example
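An example invocation (the argument values are placeholders for illustration, not taken from an actual campaign; only the option names come from the help text above):

    factgenie run-llm-eval \
        --campaign_name my-campaign \
        --dataset_name my-dataset \
        --split dev \
        --llm_output_name my-llm-outputs \
        --llm_metric_config path/to/metric_config.yaml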