transmissions11 / bistro

Opinionated GPT implementation and finetuning harness.
Apache License 2.0

regression loss tests with pythia / rand weights #11

Open transmissions11 opened 1 year ago

transmissions11 commented 1 year ago

do this via a debug flag that uses `fast_dev_run` or something? debug mode could also skip saving checkpoints and the like?
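a minimal sketch of what that flag could look like -- the CLI shape, `build_parser`/`resolve_config`, and the config keys are all made up for illustration, but the idea matches Lightning's `fast_dev_run`: one tiny batch, no checkpoints:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI: a --debug flag that trims the run to a smoke test.
    p = argparse.ArgumentParser(description="finetuning harness (sketch)")
    p.add_argument("--debug", action="store_true",
                   help="run a single tiny batch and skip checkpointing")
    return p


def resolve_config(args: argparse.Namespace) -> dict:
    # In debug mode: one step, one batch, no checkpoints -- roughly what
    # Lightning's fast_dev_run does for a Trainer.
    if args.debug:
        return {"max_steps": 1, "limit_batches": 1, "save_checkpoints": False}
    return {"max_steps": 10_000, "limit_batches": None, "save_checkpoints": True}
```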

charlesfrye commented 1 year ago

yeah, you want it to be reasonably lightweight -- i'd say ~5 minutes on whatever test env you're using. there are cliffs of utility for tests -- <0.5s (in-editor, live), <5s (in-editor, triggered), <5min (on-commit, pre-merge), <5hr (overnight, pre-release).
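one way to enforce those cliffs is a timing guard around the test body -- `assert_within_budget` is a hypothetical helper, not anything in the repo:

```python
import time


def assert_within_budget(fn, budget_s: float):
    # Hypothetical helper: fail if fn takes longer than its tier's budget
    # (e.g. 0.5s for live, 5s for triggered, 300s for pre-merge).
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    assert elapsed < budget_s, f"took {elapsed:.2f}s, budget {budget_s}s"
    return result
```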

you want the loss to go down "meaningfully" -- you can adjust the dataset and model size until you hit that mark. you're testing basic gradient flow and maybe some memorization (the simplest form of learning), so validation isn't important.
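the shape of that test, sketched with a toy one-weight model standing in for the real GPT (so it runs anywhere): train briefly, then assert the loss dropped by a meaningful factor from its starting value. `train_tiny` and the factor are invented for illustration:

```python
import random


def train_tiny(steps: int = 200, lr: float = 0.1, seed: int = 0):
    # Toy stand-in for the real harness: fit y = 2x with one weight via
    # gradient descent. The regression test only checks that loss drops
    # meaningfully -- basic gradient flow plus memorization, no validation.
    rng = random.Random(seed)
    data = [(0.1 * i, 0.2 * i) for i in range(1, 11)]
    w = rng.uniform(-1.0, 1.0)

    def loss(w):
        return sum((w * x - y) ** 2 for x, y in data) / len(data)

    initial = loss(w)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return initial, loss(w)
```

in the real test you'd swap `train_tiny` for a short run of the harness on a tiny pythia (or random-weight) model and a small dataset, shrinking both until the loss drop is reliable.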

run the test a few times with different seeds to figure out thresholds on runtime (on specific hw) and loss value. the thresholds can sit generously above the observed values, so the test isn't too noisy.
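that calibration step could look like this -- `calibrate_threshold`, `run_fn`, and the `margin` knob are all made up for the sketch:

```python
def calibrate_threshold(run_fn, seeds, margin: float = 1.5):
    # Run the short training job under several seeds and set the regression
    # threshold generously above the worst observed final loss, so normal
    # seed-to-seed noise doesn't trip the test. `margin` is a made-up knob.
    finals = [run_fn(seed) for seed in seeds]
    return max(finals) * margin
```

the same pattern works for the runtime threshold: take the slowest observed run on the target hardware and pad it.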

probably best to save checkpoints and then clean them up afterwards, so that the checkpointing & reloading logic gets exercised.
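a sketch of that pattern: write the checkpoint into a temp dir so reloading gets exercised and cleanup is automatic. JSON here is just a stand-in for whatever serialization the harness actually uses:

```python
import json
import tempfile
from pathlib import Path


def checkpoint_roundtrip(state: dict) -> dict:
    # Exercise the save -> reload path inside a temp dir that is removed
    # automatically on exit, so the test leaves no checkpoints behind.
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = Path(tmp) / "ckpt.json"
        ckpt.write_text(json.dumps(state))
        restored = json.loads(ckpt.read_text())
    assert not ckpt.exists()  # temp dir (and checkpoint) cleaned up
    return restored
```

with pytest you'd get the same effect from the `tmp_path` fixture instead of managing the temp dir by hand.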