Closed KaneGreen closed 2 weeks ago
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
View this failed invocation of the CLA check for more information.
For the most up to date status, view the checks section at the bottom of the pull request.
@pengchongjin @michaelmoynihan Any idea on this PR?
Thanks, @KaneGreen! Could you please sign the CLA in order to pass the pre-check?
@pengchongjin I've signed it, but the check hasn't updated. Is there any way to re-run it?
@pengchongjin CLA has been signed
Trying to fix #51. This also increases the speed of loading weights (on my computer, about 1 min vs. 2 min). Tested on the `1.1-7b-it` and `7b-it` models.

But:
- `quant` model.
- `requires_grad` of `nn.Parameter` in `Linear` and `Embedding` is set to `True` after the loading is completed. (I don't know why some `nn.Parameter`s in model.py have `requires_grad` as `False` and others as the default `True`.) But I think this shouldn't affect anything, since the `forward` function of `GemmaForCausalLM` has `@torch.no_grad()`.
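To illustrate the `requires_grad` point, here is a minimal sketch (not the code from this PR). `TinyModel` and its parameter names are hypothetical stand-ins. It assumes only that the model's `forward` is decorated with `@torch.no_grad()`, as described for `GemmaForCausalLM`. Under that assumption, the output is not tracked by autograd regardless of each parameter's `requires_grad` flag.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for a model whose forward runs under no_grad."""

    def __init__(self):
        super().__init__()
        # One parameter frozen explicitly, one left at the default
        # (nn.Parameter defaults to requires_grad=True).
        self.frozen = nn.Parameter(torch.ones(4, 4), requires_grad=False)
        self.default = nn.Parameter(torch.ones(4, 4))

    @torch.no_grad()
    def forward(self, x):
        # Both parameters take part in the computation.
        return x @ self.frozen @ self.default


model = TinyModel()
out = model(torch.ones(2, 4))

# The flags differ per parameter...
flags = {name: p.requires_grad for name, p in model.named_parameters()}
print(flags)

# ...but the output carries no autograd history either way.
print(out.requires_grad, out.grad_fn)
```

The flag would still matter to any code that calls the layers outside `no_grad`, for example a training loop, so this sketch only covers the inference path mentioned above.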