Closed — epwalsh closed this pull request 3 months ago
Can we add a test for this being correct?
@dirkgr we've been YOLO-ing the trainer so far with 0 test coverage. This is bad, but setting up the boilerplate for proper trainer tests is a project in and of itself. My vision is that eventually we move and improve the trainer to OLMo-core, at which point we could add thorough tests.
Hm, fair enough; maybe that's a bigger discussion. I'd rather merge OLMo-core and OLMo and, separately, make more things functional rather than object-oriented.
Because of the objects everywhere, I had to copy and paste so many things in https://github.com/allenai/OLMo/blob/FindHighGnormInstances/scripts/find_high_gnorm_instances.py. But a lot of that stuff doesn't need to be in an object.
And if it's functional, then it can be tested in isolation.
But that's not in scope for this fix.
Based on my recent experience with code bugs and the lack of tests, I would rather we add tests now, as we encounter issues, instead of waiting until we have time to write perfect test boilerplate. At a minimum, the goal should be a place where we record the things that require testing: the things that have already failed silently because no test covered them. We can always go back later and refactor with better testing infrastructure.
@2015aroras uncovered a recent bug in how we calculate Z-loss: it should be averaged over tokens, not over instances. Commit 714647320afa0e226106c25be75c576769cd232c introduced this bug.
See https://github.com/mlfoundations/open_lm/blob/main/open_lm/losses.py for a reference implementation.
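To make the difference concrete, here is a minimal NumPy sketch (not the actual OLMo or open_lm code; the function name, shapes, and `coeff` default are illustrative assumptions). With variable-length sequences, averaging per instance first and then across instances over-weights tokens from short sequences, whereas the correct reduction weights every unmasked token equally:

```python
import numpy as np

def _logsumexp(x):
    # Numerically stable logsumexp over the last axis.
    m = x.max(axis=-1)
    return m + np.log(np.exp(x - m[..., None]).sum(axis=-1))

def z_loss(logits, mask, coeff=1e-4, average_over="tokens"):
    """Z-loss sketch: coeff * logsumexp(logits)^2 per token, then reduced.

    logits: (batch, seq, vocab) float array.
    mask:   (batch, seq) with 1.0 for real tokens, 0.0 for padding.
    """
    per_token = coeff * _logsumexp(logits) ** 2 * mask  # (batch, seq)
    if average_over == "tokens":
        # Correct: one global mean over all unmasked tokens in the batch.
        return per_token.sum() / mask.sum()
    # Buggy variant: mean within each instance, then mean across instances.
    # Tokens in short sequences end up with more weight than they should.
    return (per_token.sum(axis=1) / mask.sum(axis=1)).mean()
```

The two reductions agree only when every instance contributes the same number of unmasked tokens; with any padding they diverge, which is why the bug was easy to miss on fixed-length batches.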