GEM-benchmark / GEM-metrics

Automatic metrics for GEM tasks
https://gem-benchmark.com
MIT License

Adding ability to run on-demand unit tests #81

Closed danieldeutsch closed 2 years ago

danieldeutsch commented 2 years ago

This PR adds

This will allow us to run Docker metrics with GitHub Actions on the CPU as necessary. The Docker metrics would be expensive to run every time, so with this PR we could manually run the unit tests for a new Docker metric when its PR is submitted.

I am not actually sure how to test to make sure the new Action is working. I do not see it available for running on-demand on the Actions page. I don't know if this is because it has to be in the main branch first, if I don't have the required permissions, or if there's some mistake in the config. I vaguely remember not being able to see the Action until it was merged when I implemented this in Repro.

tuetschek commented 2 years ago

LGTM. Would looking at CUDA_VISIBLE_DEVICES make sense for choosing the device, or does it not make sense in the context of GitHub?

danieldeutsch commented 2 years ago

We could do that, too. In the context of GitHub it does not matter, since it will always use the CPU.

I'll merge this for now, and if we think it makes more sense to use CUDA_VISIBLE_DEVICES, then I can make the change.
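
For reference, a minimal sketch of what choosing the device from CUDA_VISIBLE_DEVICES could look like. This is not the code in this PR: the helper name `pick_device` is hypothetical, and treating an unset variable as "use the CPU" is an assumption made for the sketch (CUDA itself exposes all GPUs when the variable is unset).

```python
import os


def pick_device(env=None):
    """Hypothetical helper: return "cuda:0" or "cpu" based on CUDA_VISIBLE_DEVICES."""
    # Allow passing a plain dict so the logic can be checked without touching os.environ.
    env = os.environ if env is None else env
    visible = env.get("CUDA_VISIBLE_DEVICES", "").strip()

    # Assumption for this sketch: unset, empty, or "-1" means "run on the CPU".
    if not visible or visible == "-1":
        return "cpu"

    # CUDA re-indexes the visible devices from 0, so the first visible GPU is "cuda:0".
    return "cuda:0"


if __name__ == "__main__":
    print(pick_device())
```

On a GitHub-hosted runner the variable would normally be unset, so this sketch would fall back to the CPU there.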

danieldeutsch commented 2 years ago

FYI, it works as intended. Here are examples of running the BLEURT (https://github.com/GEM-benchmark/GEM-metrics/actions/runs/1814247777) and Prism (https://github.com/GEM-benchmark/GEM-metrics/actions/runs/1814254012) test cases with GitHub Actions.

tuetschek commented 2 years ago

Awesome, thanks 😊!