johnsmith0031 / alpaca_lora_4bit


Fix NaN or Inf after initializing Vicuna models (due to lack of bias weights) #125

Closed · alex4321 closed 1 year ago

alex4321 commented 1 year ago

As I mentioned in https://github.com/johnsmith0031/alpaca_lora_4bit/issues/124, I used this library to load Vicuna models, and at some point I started getting Inf/NaN results during inference with these two models:

After digging into the issue, I realized the model has no bias weights, so the default initializer (as you correctly mentioned, @johnsmith0031) was not overridden when the weights were loaded.

So I replaced the default initializer with zeros.
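
For reference, a minimal sketch of the idea (the class and attribute names here are hypothetical, not the library's actual code): if a layer allocates its bias as uninitialized memory and the checkpoint has no bias tensor to overwrite it, the garbage values survive into inference; zero-filling the bias instead is a safe no-op default.

```python
import torch
import torch.nn as nn

class QuantLinearSketch(nn.Module):
    """Hypothetical sketch of a quantized linear layer's bias handling."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Before the fix: torch.empty() leaves whatever garbage happens to be
        # in memory, and it stays there if the checkpoint has no bias weights
        # to load over it.
        # self.bias = nn.Parameter(torch.empty(out_features))

        # After the fix: zeros act as "no bias" and are numerically harmless.
        self.bias = nn.Parameter(torch.zeros(out_features))
```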

(P.S. It still remains a mystery to me why the behaviour differs between platforms. The uninitialized bias should be filled with whatever garbage happens to be in memory, and I'd expect that garbage to have a similar chance of overflowing to inf after converting from float32 to float16 on any platform, but so be it.)
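
The overflow itself is easy to reproduce: float16 tops out around 65504, so any leftover float32 value beyond that becomes inf when cast down (a generic demonstration, not code from this repo):

```python
import torch

# float16 can only represent magnitudes up to ~65504;
# larger float32 values overflow to inf on conversion.
garbage = torch.tensor([1.0, 7e4, 1e30], dtype=torch.float32)
print(garbage.half())  # tensor([1., inf, inf], dtype=torch.float16)
```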

alex4321 commented 1 year ago

UPD. Wrong branch