Open MiladInk opened 1 month ago
cc @SunMarc
Hi @MiladInk, thanks for the report. Could you share a minimal reproducer? When we load a model with shared weights, we make sure to tie the shared weights together.
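For context, a quick, illustrative way to check that the two weights really are tied after loading (a sketch only; the model name just mirrors the one in the report below):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# When the weights are tied, the embedding matrix and the LM head share
# the same underlying storage, so both checks should print True.
emb = model.get_input_embeddings().weight
head = model.get_output_embeddings().weight
print(emb is head)
print(emb.data_ptr() == head.data_ptr())
```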
I am facing a similar issue when trying to load and save `google/gemma-2b`.
Hi @raghavgarg97, could you share a minimal reproducer?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info

Information

Tasks

`no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)

Reproduction
I am saving the `state_dict` of a `facebook/opt-125m` model. In this model the weights are shared between the embedding tokens and the language modelling head. When I save the state dictionary of the model, I see a warning that the shared tensor is removed.

The problem is that when I want to `load_state` the same object, I get an error. I do understand that the weights are removed because they are shared, but then I don't understand how I can work with models that have shared weights. A minimal sketch of the round trip is below.
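For concreteness, this is roughly what I mean (a sketch, assuming `accelerate`'s `Accelerator.save_state` / `load_state`; the optimizer and `checkpoint_dir` are just placeholders):

```python
import torch
from accelerate import Accelerator
from transformers import AutoModelForCausalLM

accelerator = Accelerator()

# facebook/opt-125m ties the input embedding matrix to the LM head.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model, optimizer = accelerator.prepare(model, optimizer)

# Saving warns that the shared tensor is removed (only one copy of the
# tied weight is kept in the checkpoint).
accelerator.save_state("checkpoint_dir")

# Loading the very same checkpoint back then errors out on the tied
# weight that was dropped at save time.
accelerator.load_state("checkpoint_dir")
```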
The interesting thing is that the code was working with previous versions of the libraries. Unfortunately, I no longer have the old environment, so I can't tell you exactly where things break.
Thanks in advance.
Expected behavior
I expected `save_state` and `load_state` to be able to restore the original model no matter what. This does not work.