Features
Added the following benchmark CRUD functions in the client (a usage sketch follows the list):
get_benchmark()
upload_benchmark()
update_benchmark()
delete_benchmark()
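As a rough illustration, calls might look like the following. Only the four function names come from this change; the Client class, its constructor arguments, and the id/parameter shapes are assumptions for the sketch, not the actual API.

```python
# Hypothetical usage sketch. The four function names come from this change;
# the Client class, its constructor, and the id/parameter shapes are
# assumptions, not the actual API.
import json

from client import Client  # assumed import path

client = Client(api_key="...")  # assumed constructor

# Upload a benchmark definition; assumed to accept a parsed JSON config.
with open("example/benchmark/gaokao/config_gaokao.json") as f:
    config = json.load(f)
benchmark = client.upload_benchmark(config)

# Fetch it back, update it, and clean up.
fetched = client.get_benchmark(benchmark["id"])    # assumed id field
client.update_benchmark(benchmark["id"], config)   # assumed signature
client.delete_benchmark(benchmark["id"])
```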
Added tests for all functions.
Modified config_gaokao.json under example/benchmark/gaokao and used it as a test artifact (see the test sketch below). This somewhat defeats the purpose of having a tests/artifacts folder, but I think it improves consistency, since it forces us to update the example whenever the schema changes.
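A minimal sketch of what one of those tests could look like, assuming pytest and the same Client import as above; the fixture name and assertion are illustrative, and only the artifact path comes from this change:

```python
# Sketch of a test that loads the example config as its artifact instead of
# a copy under tests/artifacts. Only the artifact path comes from this
# change; everything else is an assumption.
import json
from pathlib import Path

import pytest

from client import Client  # assumed import path

ARTIFACT = Path("example/benchmark/gaokao/config_gaokao.json")

@pytest.fixture
def benchmark_config():
    # Loading the shared example keeps the test and the example in sync:
    # a schema change breaks this fixture until the example is updated too.
    with ARTIFACT.open() as f:
        return json.load(f)

def test_upload_benchmark(benchmark_config):
    client = Client(api_key="...")  # assumed constructor
    benchmark = client.upload_benchmark(benchmark_config)
    assert benchmark is not None
```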
Next steps
Documentation on the expected benchmark JSON schema.
Support the same set of functions in the CLI (a possible command layout is sketched after this list).
Update evaluate_benchmark.py in the CLI, and the README under example/benchmark/gaokao, to cover submitting a system to a benchmark.
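For the CLI item above, one possible shape that mirrors the client functions. None of this exists yet; the subcommand names and argparse wiring are purely illustrative assumptions.

```python
# Possible shape for the planned CLI support, mirroring the client
# functions. Subcommand names and wiring are illustrative only.
import argparse
import json

from client import Client  # assumed import path

def main() -> None:
    parser = argparse.ArgumentParser(prog="benchmark")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("get").add_argument("benchmark_id")
    upload = sub.add_parser("upload")
    upload.add_argument("config_path")
    update = sub.add_parser("update")
    update.add_argument("benchmark_id")
    update.add_argument("config_path")
    sub.add_parser("delete").add_argument("benchmark_id")

    args = parser.parse_args()
    client = Client()  # assumed constructor

    if args.command == "get":
        print(client.get_benchmark(args.benchmark_id))
    elif args.command == "upload":
        with open(args.config_path) as f:
            print(client.upload_benchmark(json.load(f)))
    elif args.command == "update":
        with open(args.config_path) as f:
            client.update_benchmark(args.benchmark_id, json.load(f))
    elif args.command == "delete":
        client.delete_benchmark(args.benchmark_id)

if __name__ == "__main__":
    main()
```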