Open fahadh4ilyas opened 2 months ago
Apologies, yes: the fast RMS layernorm doesn't create any gradients for the RMS weights, to speed things up.
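For context, here is a pure-Python sketch of what "skipping the weight gradient" means. This is an illustrative hand-written forward/backward, not Unsloth's actual Triton kernel; the function names are made up for the example:

```python
import math

def rmsnorm_forward(x, w, eps=1e-6):
    # RMSNorm: y_i = w_i * x_i / sqrt(mean(x^2) + eps)
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [wi * xi / rms for wi, xi in zip(w, x)], rms

def rmsnorm_backward(dy, x, w, rms):
    # Gradient w.r.t. the input only. The weight gradient is
    # deliberately skipped (returned as None), mirroring a fast
    # kernel that does not accumulate dW to save time.
    n = len(x)
    # d(1/rms)/dx_j = -x_j / (n * rms^3)
    dot = sum(dyi * wi * xi for dyi, wi, xi in zip(dy, w, x))
    dx = [dyi * wi / rms - xi * dot / (n * rms ** 3)
          for dyi, wi, xi in zip(dy, w, x)]
    return dx, None  # None: the weight never receives a gradient

x = [1.0, 2.0, 3.0]
w = [1.0, 1.0, 1.0]
y, rms = rmsnorm_forward(x, w)
dx, dw = rmsnorm_backward([1.0, 1.0, 1.0], x, w, rms)
print(dw)  # the weight grad is simply absent
```

In an autograd framework, returning `None` for the weight in the backward pass means the optimizer never sees a gradient for it, so the parameter is effectively frozen.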
Oh, will it cause something unexpected? Or will training accuracy not be affected by it?
Oh, normal LoRA training does not train the RMS layernorm weights.
What about full finetuning? Or does this repo not support full finetuning?
Currently not, sorry.
I'm testing the kernel to check the speed of each step. I'm comparing Unsloth's `fast_rms_layernorm` with openchat's `rms_norm`. Here is my script:

But I got this error:
This means that during the backward pass, the weight from RMS Norm has no grad and will not be updated during training. Is this intentional?
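As an aside on timing methodology: here is a minimal wall-clock benchmarking harness in plain Python, just to illustrate the shape of a per-step comparison. It is a generic sketch, not the script from this issue; for real CUDA kernels you would also need to synchronize the device (e.g. `torch.cuda.synchronize()`) around the timed loop, since kernel launches are asynchronous:

```python
import time

def bench(fn, *args, iters=1000, warmup=10):
    # Generic wall-clock benchmark with a warmup phase so one-time
    # setup costs (caching, JIT compilation) don't skew the result.
    for _ in range(warmup):
        fn(*args)
    t0 = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - t0) / iters

# Toy stand-in for a norm op, just to exercise the harness.
avg = bench(lambda xs: sum(v * v for v in xs), list(range(256)))
print(f"avg per call: {avg * 1e6:.2f} us")
```

Comparing two implementations is then a matter of calling `bench` on each with identical inputs and iteration counts.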