delphi-suite / delphi

small language models training made easy
Apache License 2.0

Fix training data shifting bug #134

Closed jaidhyani closed 2 months ago

jaidhyani commented 2 months ago

~Currently the minimal viable fix. Further simplification is possible and desirable.~

Fixes and simplifies training data generation. There's no longer a separate label tensor, since `*ForCausalLM` models shift the labels internally (for some reason).
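To illustrate why a separate, pre-shifted label tensor is a problem when the model already shifts labels itself, here is a minimal pure-Python sketch. It assumes the bug was of the double-shift kind, which the description above does not state explicitly. `internal_shift` is a hypothetical helper that only mimics the alignment a causal LM head performs before computing the loss; it is not code from this repository or from any library.

```python
def internal_shift(input_ids, labels):
    """Hypothetical stand-in for the shift done inside a causal LM:
    the prediction at position i is scored against labels[i + 1]."""
    return list(zip(input_ids[:-1], labels[1:]))


tokens = [10, 11, 12, 13, 14]

# Labels are the same sequence as the inputs; the model-side shift
# pairs each token with its immediate successor.
correct_pairs = internal_shift(tokens, tokens)

# Assumed failure mode: the data pipeline has already shifted the labels,
# so the model-side shift pairs each token with the one two steps ahead.
inputs, pre_shifted_labels = tokens[:-1], tokens[1:]
double_shifted_pairs = internal_shift(inputs, pre_shifted_labels)

print(correct_pairs)
print(double_shifted_pairs)
```

Under this assumption, passing the same tensor as both inputs and labels is enough, which is consistent with dropping the separate label tensor.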