clessig / atmorep

AtmoRep model code
MIT License

Misspelled embedding file name #12

Closed kacpnowak closed 1 week ago

kacpnowak commented 7 months ago

In atmorep_model.py, line 284 loads the embedding files under the name "_embed_token_info", but line 455 saves them as "_embeds_token_info". The mismatch causes a crash when multiformers are used.
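One way to avoid this kind of save/load mismatch is to build the file name in a single place. The following is a minimal sketch, not the AtmoRep code: the helper names, the `.bin` extension, and the choice of "_embed_token_info" as the shared suffix are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical shared constant: the issue reports "_embed_token_info" on load
# and "_embeds_token_info" on save; which spelling is canonical is assumed here.
EMBED_SUFFIX = "_embed_token_info"


def embed_file_path(model_dir, field_name):
    """Build the embedding file path in one place for both save and load."""
    # The ".bin" extension is illustrative only.
    return Path(model_dir) / f"{field_name}{EMBED_SUFFIX}.bin"


def save_embed(model_dir, field_name, payload):
    """Write raw bytes to the shared path and return that path."""
    path = embed_file_path(model_dir, field_name)
    path.write_bytes(payload)
    return path


def load_embed(model_dir, field_name):
    """Read the bytes back from the same shared path."""
    return embed_file_path(model_dir, field_name).read_bytes()
```

Because both functions go through `embed_file_path`, a typo in the suffix would affect saving and loading equally and could not produce a file that is written under one name and looked up under another.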

kacpnowak commented 6 months ago

I've also encountered an error with keys not matching: RuntimeError: Error(s) in loading state_dict for Linear: 3: Missing key(s) in state_dict: "weight", "bias". 3: Unexpected key(s) in state_dict: "0.weight", "0.bias".

To solve it, one can add "strict=False" in line 286.
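Note that in PyTorch, `load_state_dict(..., strict=False)` skips keys that do not match, so the values stored under "0.weight" and "0.bias" would not be loaded and the layer would keep its initial parameters. An alternative is to rename the keys before loading. The sketch below assumes the "0." prefix comes from the layer having been saved inside a container such as `nn.Sequential`; the helper name is hypothetical, and plain lists stand in for tensors so the example runs without PyTorch.

```python
def strip_prefix(state_dict, prefix="0."):
    """Return a copy of state_dict with `prefix` removed from matching keys."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Stand-in values; in a real checkpoint these would be tensors.
saved = {"0.weight": [[1.0, 2.0]], "0.bias": [0.5]}

# Keys now read "weight" and "bias", matching what a bare Linear layer expects.
remapped = strip_prefix(saved)
```

With a real checkpoint, the remapped dictionary could then be passed to `load_state_dict` with the default strict checking, which would still flag any remaining mismatch.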

clessig commented 6 months ago

Ok, we don't see the issue with multiformers not loading. Which branch are you on?

@iluise: have you seen a problem with multiformers?

iluise commented 1 week ago

Hi, this should have been fixed in this PR: https://github.com/clessig/atmorep/pull/37. Closing the issue now. Please let me know if you still have problems.