brain-score / core


Modified endpoint to separate getting models/benchmarks from scoring #40

Closed · shehadak closed this 10 months ago

shehadak commented 1 year ago

Description

This PR is motivated by the need to run individual scoring jobs for each model/benchmark pair on an HPC (e.g. Openmind), instead of running a single job that computes scores for all new models and benchmarks in a submission. We handle this by restructuring the scoring endpoint to separate retrieving the model/benchmark names from the actual scoring (for example, resolving ALL_PUBLIC without necessarily scoring it). As a result, the domain-specific plugin manager now has the flexibility to decide how best to identify and score the model/benchmark pairs.
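A minimal sketch of the idea, with hypothetical names (`resolve_pairs`, `score_pair` and the dummy scoring logic are illustrative, not the actual brain-score API): one function only resolves which model/benchmark pairs a submission implies, and a second function scores a single pair so that each pair can be dispatched as its own HPC job.

```python
# Hypothetical sketch: separate pair resolution from scoring so each pair
# can be submitted as an individual HPC job.
from itertools import product
from typing import Dict, List, Tuple


def resolve_pairs(new_models: List[str], new_benchmarks: List[str],
                  existing_models: List[str], existing_benchmarks: List[str]
                  ) -> List[Tuple[str, str]]:
    """Return every (model, benchmark) pair that needs scoring, without scoring anything."""
    pairs = list(product(new_models, existing_benchmarks + new_benchmarks))
    pairs += list(product(existing_models, new_benchmarks))
    return sorted(set(pairs))


def score_pair(model: str, benchmark: str) -> Dict[str, object]:
    """Score one model/benchmark pair; each call corresponds to one job."""
    score = 0.0  # placeholder standing in for the real plugin scoring run
    return {"model": model, "benchmark": benchmark, "score": score}


if __name__ == "__main__":
    pairs = resolve_pairs(new_models=["my-model"], new_benchmarks=[],
                          existing_models=[], existing_benchmarks=["dummy-benchmark"])
    for model, benchmark in pairs:  # in practice: submit one HPC job per pair
        print(score_pair(model, benchmark))
```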

Testing Strategy

This PR adds unit tests that separately exercise model/benchmark retrieval and scoring, using dummy models and benchmarks.
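A hypothetical test sketch in that spirit, reusing the `resolve_pairs`/`score_pair` names from the sketch above (not the repository's actual test suite):

```python
# Illustrative tests: retrieval and scoring are verified independently
# using dummy model and benchmark identifiers.
import unittest

from scoring_sketch import resolve_pairs, score_pair  # the sketch above, saved as scoring_sketch.py


class TestResolveAndScore(unittest.TestCase):
    def test_resolve_pairs_without_scoring(self):
        pairs = resolve_pairs(new_models=["dummy-model"], new_benchmarks=[],
                              existing_models=[], existing_benchmarks=["dummy-benchmark"])
        self.assertEqual(pairs, [("dummy-model", "dummy-benchmark")])

    def test_score_single_pair(self):
        result = score_pair("dummy-model", "dummy-benchmark")
        self.assertEqual(result["model"], "dummy-model")
        self.assertIn("score", result)


if __name__ == "__main__":
    unittest.main()
```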