[x] Test if the LLM eval creation and running works in the UI.
[x] Test if the CLI version works.
[x] Describe how to set up Ollama in the README.
Next TODO:
[ ] Make the CLI commands fail on the first error so they behave like a regular CLI (disable the Flask context); see the sketch below this list.
[ ] Create and describe in detail an issue for adding an end-to-end test that mocks the Ollama web server.
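A minimal sketch of the Flask-context point above, assuming the command ends up as an ordinary Click command (run_llm_eval below is a stand-in, not factgenie's actual command object): invoking it with standalone_mode=False makes Click re-raise exceptions instead of catching them, so the run aborts on the first error.

    import click

    @click.command()
    @click.option("--campaign_name", required=True)
    def run_llm_eval(campaign_name):
        # stand-in command body that fails
        raise RuntimeError("evaluation failed")

    # standalone_mode=False: Click propagates the exception to the caller
    # instead of printing it and calling sys.exit().
    run_llm_eval.main(["--campaign_name", "demo"], standalone_mode=False)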
The CLI usage
factgenie run-llm-eval --help
2024-06-29 14:05:28 INFO Application ready
Usage: factgenie run-llm-eval [OPTIONS]

  Runs the LLM evaluation from the CLI with no web server.

Options:
  --campaign_name TEXT      [required]
  --dataset_name TEXT       [required]
  --split TEXT              [required]
  --llm_output_name TEXT    [required]
  --llm_metric_config TEXT  Path to the metric config file or just the metric
                            name.  [required]
  --help                    Show this message and exit.
Example
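An example invocation (the argument values are placeholders for illustration, not taken from an actual campaign; only the option names come from the help text above):

    factgenie run-llm-eval \
        --campaign_name my-campaign \
        --dataset_name my-dataset \
        --split dev \
        --llm_output_name my-llm-outputs \
        --llm_metric_config path/to/metric_config.yaml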