huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

LayoutLM.from_pretrained doesn't load embeddings' weights when using safetensors #30125

Closed mszulc913 closed 2 weeks ago

mszulc913 commented 7 months ago

System Info

Who can help?

No response

Information

Tasks

Reproduction

Running the following:

from transformers import LayoutLMModel

model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased", use_safetensors=True)

results in:

Some weights of LayoutLMModel were not initialized from the model checkpoint at microsoft/layoutlm-base-uncased and are newly initialized: ['layoutlm.embeddings.word_embeddings.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Note that this is also the default behavior if a user has safetensors installed and doesn't provide use_safetensors.

The following works as expected (without safetensors):

from transformers import LayoutLMModel

model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased", use_safetensors=False)

Expected behavior

The embedding weights should be loaded correctly.

amyeroberts commented 6 months ago

Hi @mszulc913, thanks for opening this issue!

I'm able to replicate the issue.

The model checkpoint didn't have safetensors weights associated with it; they were merged in with this commit.

However, the issue still persists :(

It seems like this is an issue when loading as safetensors on the fly.

However, if I instead load the model from the PyTorch weights and save it out locally (which writes safetensors), I'm able to load it without any issue:

from transformers import LayoutLMModel

model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased", use_safetensors=False)

model.save_pretrained("test-layoutlm-base-uncased") # Saves out the model as safetensors

# Loads from safetensors automatically
model = LayoutLMModel.from_pretrained("test-layoutlm-base-uncased")

cc @LysandreJik @Narsil As you both probably have the best knowledge of this code
cc @Rocketknight1 as you've been looking into the safetensors conversion recently

RVV-karma commented 6 months ago

In SFconvertbot's convert.py file, the weights are loaded with:

loaded = torch.load(pt_filename, map_location="cpu", weights_only=True)

which does not map the layers correctly (the keys in the resulting weights dictionary differ from the ones the model expects). This is what causes the issue.
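The mismatch can be illustrated with a minimal sketch (the expected key is taken from the warning in the report; the checkpoint contents here are hypothetical):

```python
# Sketch of the reported mismatch: the converted file stores a key without
# the "layoutlm." prefix, while from_pretrained looks it up with the prefix.
checkpoint_keys = {"embeddings.word_embeddings.weight"}
expected_key = "layoutlm.embeddings.word_embeddings.weight"

# The lookup fails, so the weight is left newly initialized, which matches
# the warning printed in the reproduction above.
print(expected_key in checkpoint_keys)  # False
```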

If we instead build the weights dictionary with:

    from transformers import LayoutLMModel
    model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased", use_safetensors=False)
    loaded = {f"layoutlm.{k}":v.data for k, v in model.named_parameters()}

then the keys map correctly onto the model.

This is either an issue with PyTorch's load() function or an implementation issue in SFconvertbot. If this needs to be fixed somewhere, I can take it up.
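A fix along these lines could be sketched as follows. The helper below is hypothetical (it is not part of convert.py); the idea is that the converter would remap the keys of the torch.load result before writing the safetensors file:

```python
def remap_keys(state_dict, prefix="layoutlm."):
    """Add the model prefix to any key that does not already carry it."""
    return {
        (k if k.startswith(prefix) else f"{prefix}{k}"): v
        for k, v in state_dict.items()
    }

# Tiny stand-in state dict; in the real converter the values are torch tensors.
raw = {"embeddings.word_embeddings.weight": 0, "layoutlm.pooler.dense.bias": 0}
fixed = remap_keys(raw)
print(sorted(fixed))
# ['layoutlm.embeddings.word_embeddings.weight', 'layoutlm.pooler.dense.bias']
```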

Also, I opened a PR on microsoft/layoutlm-base-uncased with updated safetensors weights.

amyeroberts commented 6 months ago

@RVV-karma Thanks for looking into this and for fixing the weights upstream ❤️

@Rocketknight1 has been working with safetensors weight loading and the bot recently, so he'll be able to advise on the best approach for addressing this for future models.

Rocketknight1 commented 6 months ago

I actually haven't touched the bot, so I'm not sure how to push a fix to it! @Narsil do you know where it runs?

LysandreJik commented 4 months ago

The bot runs here @Rocketknight1 if you want to open a PR: https://huggingface.co/spaces/safetensors/convert

The code is here: https://huggingface.co/spaces/safetensors/convert/blob/main/convert.py

github-actions[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.