Top priorities:
I'd like to set up a simple, preferably automated way to run all the simple_tests tests in a Google Colab notebook (or at least a Jupyter notebook of some kind).
Secondary priorities:
If it isn't cost prohibitive, I'd like to test for build errors on every push (instead of only on every release) on the GitHub Actions side.
Stretch:
Eventually, I'd also like to write regression tests for all the tests in globals: both simple screens for typos and grammatical errors (possibly via a small Anthropic model) and performance tests on small curated benchmark tasks.
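As a rough illustration of what the simplest kind of typo screen might look like, here is a minimal sketch. It uses two regex heuristics (doubled words and repeated spaces) as a cheap stand-in for the model-based check mentioned above, not a replacement for it. The names SAMPLE_TEXTS, screen_text, and screen_all are hypothetical, and the sample strings are invented for illustration; the real contents of globals are not assumed here.

```python
import re

# Hypothetical stand-in for the strings kept in globals; the real names
# and contents are not known here, so these samples are invented.
SAMPLE_TEXTS = {
    "greeting": "Please summarize the following text.",
    "followup": "List three key points from the the passage.",
}

# A word followed by whitespace and the same whole word again ("the the").
DOUBLED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)
# Two or more spaces between non-space characters.
MULTI_SPACE = re.compile(r"\S {2,}\S")


def screen_text(text: str) -> list[str]:
    """Return descriptions of the simple problems found in one string."""
    problems = []
    for match in DOUBLED_WORD.finditer(text):
        problems.append(f"doubled word: {match.group(0)!r}")
    if MULTI_SPACE.search(text):
        problems.append("repeated spaces")
    return problems


def screen_all(texts: dict[str, str]) -> dict[str, list[str]]:
    """Map each entry name to its problems, omitting clean entries."""
    report = {}
    for name, text in texts.items():
        problems = screen_text(text)
        if problems:
            report[name] = problems
    return report


if __name__ == "__main__":
    for name, problems in screen_all(SAMPLE_TEXTS).items():
        print(name, problems)
```

A regression test could then assert that screen_all returns an empty report for the real strings. Heuristics like these will flag some legitimate phrasing (for example "that that"), which is one reason a small model may be the better long-term screen.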