Goal: Allow exporting and updating a JSON configuration file that is used to run an evaluation. We want to automatically see which models/repositories/tasks are new/gone. This allows us to commit the configuration we are using for a full evaluation of an eval version into the repository.
We want to store:
available models for providers
selected models for providers
available repositories (with their tasks)
selected repositories (with their tasks)
We want to load (?):
selected models for providers
selected repositories
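A minimal sketch of how such a configuration file could be laid out, assuming one "available" and one "selected" mapping each for models and repositories. All type names, field names, JSON keys and the example values are hypothetical and only illustrate the lists above; they are not the project's actual schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Configuration is a hypothetical layout of the exported file.
type Configuration struct {
	Models       Selection `json:"models"`
	Repositories Selection `json:"repositories"`
}

// Selection holds everything that exists next to what is actually used.
// For models the key is a provider ID and the values are model IDs.
// For repositories the key is a repository path and the values are tasks.
type Selection struct {
	Available map[string][]string `json:"available"`
	Selected  map[string][]string `json:"selected"`
}

// Export serializes the configuration as indented JSON. "encoding/json"
// writes map keys in sorted order, which keeps committed files diffable.
func Export(c Configuration) ([]byte, error) {
	return json.MarshalIndent(c, "", "  ")
}

// Load parses a previously exported configuration.
func Load(data []byte) (Configuration, error) {
	var c Configuration
	err := json.Unmarshal(data, &c)

	return c, err
}

func main() {
	// Placeholder values, not real evaluation data.
	c := Configuration{
		Models: Selection{
			Available: map[string][]string{"provider-a": {"model-1", "model-2"}},
			Selected:  map[string][]string{"provider-a": {"model-1"}},
		},
		Repositories: Selection{
			Available: map[string][]string{"repository-a": {"task-1", "task-2"}},
			Selected:  map[string][]string{"repository-a": {"task-1"}},
		},
	}

	data, err := Export(c)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```

Keeping "available" and "selected" in the same file means a later export can be compared against the committed version without any additional state.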
TODO
Iteration 1
[x] Return all available models by querying the provider's APIs
Ollama's API only lists the models that are installed locally, which says nothing about which models are generally available, so ignore Ollama for now (until #283 is in).
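As a rough illustration of Iteration 1, the following sketch extracts model identifiers from provider listing responses. It assumes that the OpenRouter models endpoint answers with a `data` array of objects carrying an `id`, and that Ollama's tags endpoint answers with a `models` array of objects carrying a `name`. The function names are hypothetical, and the sample payloads are made up so that the example runs without network access.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// parseOpenRouterModels extracts the model IDs from a models listing.
// Assumed response shape: {"data": [{"id": "..."}]}.
func parseOpenRouterModels(body []byte) ([]string, error) {
	var response struct {
		Data []struct {
			ID string `json:"id"`
		} `json:"data"`
	}
	if err := json.Unmarshal(body, &response); err != nil {
		return nil, err
	}

	ids := make([]string, 0, len(response.Data))
	for _, m := range response.Data {
		ids = append(ids, m.ID)
	}
	sort.Strings(ids) // Stable order keeps the exported file diffable.

	return ids, nil
}

// parseOllamaTags extracts the names of the locally installed models.
// Assumed response shape: {"models": [{"name": "..."}]}.
func parseOllamaTags(body []byte) ([]string, error) {
	var response struct {
		Models []struct {
			Name string `json:"name"`
		} `json:"models"`
	}
	if err := json.Unmarshal(body, &response); err != nil {
		return nil, err
	}

	names := make([]string, 0, len(response.Models))
	for _, m := range response.Models {
		names = append(names, m.Name)
	}
	sort.Strings(names)

	return names, nil
}

func main() {
	// Made-up payloads standing in for real API responses.
	ids, err := parseOpenRouterModels([]byte(`{"data":[{"id":"vendor/model-b"},{"id":"vendor/model-a"}]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(ids)

	names, err := parseOllamaTags([]byte(`{"models":[{"name":"model-c:latest"}]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

The Ollama parser is included only for completeness: its result reflects the local machine, which is the reason Ollama is skipped in this iteration.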
Iteration 2
[x] Store the available models in the JSON file
[x] Store the selected models in the JSON file
Iteration 3
[x] Store the available repositories (with tasks) in the JSON file
[x] Store the selected repositories (with tasks) in the JSON file
Iteration 4
[x] Accept the JSON file as a configuration argument to the evaluation
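To see which models/repositories/tasks are new or gone when the file is updated, the previously stored list can be compared with the freshly queried one. The following is a minimal sketch of such a comparison. The function name `Diff` and the example identifiers are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Diff compares a previously stored list with the current one and returns
// the entries that are new ("added") and the ones that are gone ("removed").
// Duplicate entries are reported only once.
func Diff(previous []string, current []string) (added []string, removed []string) {
	inPrevious := map[string]bool{}
	for _, p := range previous {
		inPrevious[p] = true
	}

	inCurrent := map[string]bool{}
	for _, c := range current {
		if inCurrent[c] {
			continue
		}
		inCurrent[c] = true
		if !inPrevious[c] {
			added = append(added, c)
		}
	}

	for _, p := range previous {
		if inPrevious[p] && !inCurrent[p] {
			removed = append(removed, p)
			inPrevious[p] = false // Report each removed entry only once.
		}
	}

	sort.Strings(added)
	sort.Strings(removed)

	return added, removed
}

func main() {
	// Placeholder identifiers, not real model names.
	previous := []string{"provider-a/model-1", "provider-a/model-2"}
	current := []string{"provider-a/model-2", "provider-a/model-3"}

	added, removed := Diff(previous, current)
	fmt.Println("new:", added)
	fmt.Println("gone:", removed)
}
```

The same function applies to repositories and to the task list of a single repository, since all of them are plain lists of identifiers.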
Iteration 1
Provider.Models()
https://openrouter.ai/api/v1/models ~ already implemented
http://127.0.0.1:11434/api/tags ~ already implemented
Iteration 5
Also store and load custom provider URLs so that they don't need to be carried over manually
Follow-up: https://github.com/symflower/eval-dev-quality/issues/307