tdrussell / qlora-pipe

A pipeline parallel training script for LLMs.
MIT License
83 stars 8 forks

bug-fix: use right varname in lm_head #15

Closed: kallewoof closed this 4 months ago

kallewoof commented 4 months ago

I'm pretty sure orig is lm_head here, as indicated in the comment above.

I hit a crash here while experimenting with a model_type other than the default. Using lm_head instead seems to do the trick.