I would try print(model)
to find the attribute path to the model's embedding layer.
replace
model.base_model.embed_tokens.forward = noised_embed(model.base_model.embed_tokens, noise_alpha)
with
model.base_model.embed_tokens.forward = noised_embed(model.base_model.embed_tokens.forward, noise_alpha)
This works for me. If it fails, you can try functools.partial instead of the lambda/closure approach, which also works for me.
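For reference, here is a minimal sketch of both variants. The noise body follows the NEFTune-style patch discussed in this thread; the attribute path and the nn.Embedding stand-in are placeholders, so check print(model) for your actual model:

```python
import functools
import math
import torch
import torch.nn as nn

def noised_embed(orig_forward, noise_alpha):
    # orig_forward is the module's *original* bound forward function,
    # captured before the attribute is overwritten, so it cannot recurse.
    def new_func(x):
        embed_init = orig_forward(x)
        # assumes embeddings of shape (batch, seq_len, hidden_dim)
        mag_norm = noise_alpha / math.sqrt(embed_init.size(1) * embed_init.size(2))
        return embed_init + torch.zeros_like(embed_init).uniform_(-mag_norm, mag_norm)
    return new_func

# Stand-in for model.base_model.embed_tokens; find the real attribute
# path on your model with print(model).
embed_tokens = nn.Embedding(32000, 4096)

# Variant 1: closure, passing .forward rather than the module itself.
embed_tokens.forward = noised_embed(embed_tokens.forward, noise_alpha=5)

# Variant 2: functools.partial (apply one variant or the other, not both).
def noised_forward(orig_forward, noise_alpha, x):
    embed_init = orig_forward(x)
    mag_norm = noise_alpha / math.sqrt(embed_init.size(1) * embed_init.size(2))
    return embed_init + torch.zeros_like(embed_init).uniform_(-mag_norm, mag_norm)

# embed_tokens.forward = functools.partial(noised_forward, embed_tokens.forward, 5)
```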
I also encountered the same problem, and your method fixed it.
But why does this happen? Here is my guess (I'm not sure it's correct): every time execution reaches
embed_init = orig_embed(x)
the forward method of orig_embed has already been replaced with new_func, so calling orig_embed(x) dispatches through nn.Module.__call__ (_call_impl) back to new_func. In effect, this line keeps calling itself in an infinite loop.
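If that guess is right, the failure mode can be reproduced in isolation. The sketch below uses a toy nn.Embedding as a hypothetical stand-in for the model's embedding layer, with the buggy version of the patch that passes the module instead of its forward:

```python
import torch
import torch.nn as nn

embed = nn.Embedding(10, 4)

def noised_embed(orig_embed, noise_alpha):
    def new_func(x):
        # orig_embed is the *module*, and its .forward attribute is
        # replaced with new_func below. Calling the module dispatches
        # through nn.Module.__call__ (_call_impl) to self.forward,
        # which is new_func again -> infinite recursion.
        embed_init = orig_embed(x)
        return embed_init  # noise omitted; this point is never reached
    return new_func

embed.forward = noised_embed(embed, noise_alpha=5)  # buggy: passes the module
embed(torch.tensor([[1, 2, 3]]))  # raises RecursionError
```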
Environment
I tried the patch from the README, but
model.base_model.model.model.embed_tokens.forward
fails to work, showing the following error message:
Therefore, I tried to edit the patch as below:
But this causes another issue:
Could you give me some advice about it? Thank you very much.