Open mxjmtxrm opened 5 months ago
Hi, great work! I ran into some problems during 4-bit weight-only quantization (--lwc).
I quantized a llama model using different lwc hyper-parameters and got different results.
I also saw a NaN norm during training. I guess it is caused by AMP training.