UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Having trouble loading a peft model #3052

Closed GTimothee closed 1 week ago

GTimothee commented 1 week ago

Hi, I am looking for some guidance, as I think I am missing something. I trained an embedding model successfully, using PEFT as follows:

model._modules["0"].auto_model = get_peft_model(
    model._modules["0"].auto_model, peft_config
)

I also added some new tokens.

tokenizer.add_tokens(list(new_tokens))
model._modules['0'].auto_model.resize_token_embeddings(len(tokenizer))

Everything seems fine: the model trains and improves. The problem comes when saving and loading. If I just do "trainer.save_model()", I can't load the model back with SentenceTransformer(output_dir), for example; it complains about a shape mismatch between the embeddings in the checkpoint and the ones in the model being initialized. I don't understand why, actually.

So I tried saving components separately:

model._modules['0'].auto_model.save_pretrained("output_dir/adapter")
model.tokenizer.save_pretrained("output_dir/tokenizer")

and then I reconstruct the model at test time as follows:

model = SentenceTransformer(
    model_id, device="cuda" if torch.cuda.is_available() else "cpu", trust_remote_code=True
)

model.tokenizer = AutoTokenizer.from_pretrained("output_dir/tokenizer")
model._modules['0'].auto_model.resize_token_embeddings(len(model.tokenizer))

peft_config = PeftConfig.from_pretrained("output_dir/adapter")
model._modules["0"].auto_model = get_peft_model(
    model._modules["0"].auto_model, peft_config
)

model._modules["0"].auto_model = PeftModel.from_pretrained(model._modules["0"].auto_model, 
    "output_dir/adapter", config=peft_config
)

It does not raise any error, but when I encode the same sentence before saving and after loading, the results differ, and of course the evaluation on the test set gives poor results.

Any idea or lead is welcome. I guess I forgot to load some configuration or something?

pesuchin commented 1 week ago

Hello @GTimothee! PEFT support was added in v3.3.0, so you may find that useful. The PEFT compatibility chapter in the release notes has specific examples: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.3.0

GTimothee commented 1 week ago

Thank you, I could not find good examples. I'll have a look 👍

tomaarsen commented 1 week ago

Apologies for the confusion surrounding PEFT. It was a late addition in v3.3.0 and we don't have documentation for it yet. Let me try and help you simplify some code:

model._modules["0"].auto_model = get_peft_model(
    model._modules["0"].auto_model, peft_config
)

can be:

model.add_adapter(peft_config)

and

model._modules["0"].auto_model = PeftModel.from_pretrained(model._modules["0"].auto_model, 
    "output_dir/adapter", config=peft_config
)

can be:

model.load_adapter("output_dir/adapter", config=peft_config)
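Putting the two together, a minimal save/load round trip with these helpers might look like the sketch below. This is just an illustration with placeholder paths; it assumes the adapter files (adapter_config.json and the adapter weights) end up at the root of the saved directory, which is where they go when the adapter sits on the first module:

from sentence_transformers import SentenceTransformer
from peft import LoraConfig, TaskType

# Attach a fresh LoRA adapter for training
model = SentenceTransformer("all-MiniLM-L6-v2")
peft_config = LoraConfig(
    task_type=TaskType.FEATURE_EXTRACTION,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
model.add_adapter(peft_config)

# ... train, then save; with an active adapter this stores the adapter
# weights rather than the full model weights
model.save_pretrained("output_dir")

# Later: attach the saved adapter to a freshly loaded base model
fresh_model = SentenceTransformer("all-MiniLM-L6-v2")
fresh_model.load_adapter("output_dir")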

If you reconstruct the model at test time via model._modules['0'].auto_model.resize_token_embeddings(len(model.tokenizer)), then the new token embeddings will have random weights, so that should indeed explain why the results are different.
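As a standalone illustration (separate from your code, using a plain transformers model for simplicity): every call to resize_token_embeddings initializes the new rows from scratch, so two independent resizes of the same base model end up with different weights for the new tokens:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
tokenizer.add_tokens(["[bla]"])

# Two identical copies of the base model, resized independently
model_a = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model_b = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model_a.resize_token_embeddings(len(tokenizer))
model_b.resize_token_embeddings(len(tokenizer))

# The freshly initialized row for the new token differs between the two,
# just as it differs between your training run and your test-time reload
new_token_id = len(tokenizer) - 1
print(model_a.get_input_embeddings().weight[new_token_id, :5])
print(model_b.get_input_embeddings().weight[new_token_id, :5])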

As for the real issue with PEFT & resizing, I'm able to reproduce it with this script:

from sentence_transformers import SentenceTransformer
from peft import LoraConfig, TaskType

test_new_tokens = ["[bla]", "[blub]", "[bloo]", "[blib]"]

# Add new tokens and resize the embedding layer to match
model = SentenceTransformer("all-MiniLM-L6-v2")
model.tokenizer.add_tokens(test_new_tokens)
model[0].auto_model.resize_token_embeddings(len(model.tokenizer))

# Attach a LoRA adapter on top of the resized model
peft_config = LoraConfig(
    task_type=TaskType.FEATURE_EXTRACTION,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
model.add_adapter(peft_config)

# Encode before saving...
embedding = model.encode("[bla] my name is [blub]")
print(embedding[:10])

model.save_pretrained("output_dir")

# ...and after reloading: the two printed embeddings differ
loaded_model = SentenceTransformer("output_dir")
embedding = loaded_model.encode("[bla] my name is [blub]")
print(embedding[:10])

I'll dive into this some more.

tomaarsen commented 1 week ago

Alright, I think I've figured it out. PEFT works by slightly extending the base model, but it doesn't expect the dimensions to change - it just adds some weights to the existing dimensions. So if you add tokens to the embedding layer for the adapter model, the base model must be changed as well.

We can see that this holds up if we 1) first resize, then 2) save the resized model, then 3) use the resized model as the base model:

from sentence_transformers import SentenceTransformer
from peft import LoraConfig, TaskType

test_new_tokens = ["[bla]", "[blub]", "[bloo]", "[blib]"]

# Create a resized base model
model = SentenceTransformer("all-MiniLM-L6-v2")
model.tokenizer.add_tokens(test_new_tokens)
model[0].auto_model.resize_token_embeddings(len(model.tokenizer))
model.save_pretrained("all-MiniLM-L6-v2-resized")

# Load the resized model and add an adapter
model = SentenceTransformer("all-MiniLM-L6-v2-resized")
peft_config = LoraConfig(
    task_type=TaskType.FEATURE_EXTRACTION,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
model.add_adapter(peft_config)

embedding = model.encode("[bla] my name is [blub]")
print(embedding[:10])
'''
[-0.0439491   0.02856314 -0.02964916  0.03311558 -0.15700558 -0.03316737
  0.08325124 -0.05773272 -0.03219184  0.00531699]
'''

# Save the adapter itself
model.save_pretrained("all-MiniLM-L6-v2-adapter")

# Load the adapter directly
loaded_model = SentenceTransformer("all-MiniLM-L6-v2-adapter")
embedding = loaded_model.encode("[bla] my name is [blub]")
print(embedding[:10])
'''
[-0.0439491   0.02856314 -0.02964916  0.03311558 -0.15700558 -0.03316737
  0.08325124 -0.05773272 -0.03219184  0.00531699]
'''

This works because the loaded adapter uses all-MiniLM-L6-v2-resized as the base model into which the adapter is loaded, rather than the original all-MiniLM-L6-v2.
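One way to verify this (a quick check, assuming PEFT's usual layout where the adapter directory holds an adapter_config.json; the exact location may differ) is to look at which base model the saved adapter records:

import json

# The adapter config records the base model it was built on; after the
# steps above it should point at the resized model, not the original
with open("all-MiniLM-L6-v2-adapter/adapter_config.json") as f:
    adapter_config = json.load(f)
print(adapter_config["base_model_name_or_path"])
# expected: "all-MiniLM-L6-v2-resized"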

In other words, I don't think it's currently possible to resize the token embeddings of a PEFT adapter and then load it into the original, unresized base model. That would require a feature request on the PEFT project, I believe.

GTimothee commented 1 week ago

OK, it seems to work, thank you! I can't access my proprietary data right now, but I tried your solution in a Colab notebook and I think it will work for my use case. Thank you very much! 💯

GTimothee commented 1 week ago

Hi again,

First of all, I confirm that it works well, so thank you again.

Apart from that, I just wanted to point out that the load_best_model_at_end=True option no longer works with the Trainer when using PEFT. If I remember correctly, it complains about not finding a file in .pth format. I guess that is because we are checkpointing a PEFT adapter rather than a full model. I wanted to point this out because, if this is expected, you may prefer to raise a more descriptive error instead of a FileNotFoundError, which can be misleading and end up as a ticket here or on Stack Overflow.

tomaarsen commented 1 week ago

Oh, good to know! That indeed sounds like a bug. Could you perhaps open a new issue for it? Ideally, you could also mention whether it only happens when you add new tokens.