johnsmith0031 / alpaca_lora_4bit

MIT License
533 stars 84 forks

Zero initializer for biases #126

Closed alex4321 closed 1 year ago

alex4321 commented 1 year ago

As I mentioned in https://github.com/johnsmith0031/alpaca_lora_4bit/issues/124, I used this library to load Vicuna models, and at some point I started getting Inf/NaN results during inference with two of these models.

After diving into the issue I realized the model has no bias weights, so the default initializer (as you correctly mentioned, @johnsmith0031) was not overridden by the loaded weights.

So I replaced the default initializer with zeros.
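The change can be sketched roughly like this. Note that `ZeroBiasLinear` is a hypothetical stand-in, not the repo's actual quantized-linear class; the point is only the switch from an uninitialized bias buffer to a zero-initialized one:

```python
import torch
import torch.nn as nn


class ZeroBiasLinear(nn.Module):
    """Hypothetical sketch of the fix: the bias parameter is allocated
    with torch.zeros instead of torch.empty, so a checkpoint that
    carries no bias tensor still behaves like a bias-free layer."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.zeros(out_features, in_features))
        # Before the fix: torch.empty(out_features) -- uninitialized
        # memory that stays as garbage when the checkpoint never
        # overwrites it with real bias weights.
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.linear(x, self.weight, self.bias)


layer = ZeroBiasLinear(4, 2)
out = layer(torch.ones(1, 4))
print(torch.isfinite(out).all().item())  # True: no garbage bias, no Inf/NaN
```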

(P.S. It still remains a mystery to me why the behavior differs so much between platforms: the bias should be initialized with whatever garbage happens to be in memory, and I expected that garbage to have a similar chance of going Inf after conversion from float32 to float16 on every platform, but so be it.)
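The overflow mechanism itself is easy to demonstrate in isolation. This is a standalone NumPy illustration (the bit pattern is arbitrary, chosen only to mimic plausible memory garbage), not code from the library: any float32 value above float16's maximum of 65504 collapses to Inf on conversion, while a zero-initialized bias survives the cast unchanged.

```python
import numpy as np

# Arbitrary bit pattern standing in for uninitialized memory:
# 0x7F000000 reinterpreted as float32 is 2**127, about 1.7e38.
garbage_f32 = np.array([0x7F000000], dtype=np.uint32).view(np.float32)
as_f16 = garbage_f32.astype(np.float16)
print(np.isinf(as_f16[0]))  # True: float16 tops out at 65504

# A zero-initialized bias is unaffected by the same conversion.
zero_bias = np.zeros(4, dtype=np.float32).astype(np.float16)
print(np.all(zero_bias == 0))  # True
```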

johnsmith0031 commented 1 year ago

Thanks for debugging and fixing!