erzaliator opened this issue 1 year ago
It's possible to save the entire model using the transformers library. However, when loading the model for inference, the default configuration does not include the additional (adapter) parameters. As a result, the model will only run inference with the non-adapter parameters. I haven't tried it myself, but you could attempt inference while specifying a custom config and see if that works out.
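A hedged sketch of one way to read this suggestion: load the base weights with an explicit config, then load the saved adapter weights on top. The base model name, checkpoint path, and adapter subfolder below are illustrative assumptions, not taken from the original code.
# Illustrative sketch only; names and paths are assumptions.
from transformers import AutoConfig, AutoAdapterModel
config = AutoConfig.from_pretrained("bert-base-uncased")  # custom config if needed
model = AutoAdapterModel.from_pretrained("bert-base-uncased", config=config)
# AdapterTrainer checkpoints store each adapter in its own subfolder
model.load_adapter("runs/my_current_model/checkpoint-70/disrpt", load_as="disrpt")
model.set_active_adapters("disrpt")
model.eval()  # inference mode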
@Celestinian, on the contrary: from the library's behavior, the adapter library saves only the adapter modules and heads, not the BERT weights. As a result, the model attempts to load only the adapter weights. I suspect the trainer.state variable needs to be corrected to store both the adapter and BERT weights.
@erzaliator Thank you for bringing this up. I'm sorry to hear that you're experiencing the same frustrating issue. However, when I attempted to use your suggested callback method, I encountered the following error:
AttributeError: 'str' object has no attribute 'evaluate'
@calpt do you have any suggestion?
@Ch-rode can you check whether the value of trainer is correct? It should not be initialized as a string object.
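For illustration, this is the shape the check suggests; model, train_dataset, and eval_dataset stand in for objects from your own setup (a sketch, not the exact code from this thread):
# The AttributeError above suggests `trainer` holds a string, not the
# trainer instance that has an .evaluate() method.
from transformers import AdapterTrainer, TrainingArguments
training_args = TrainingArguments(output_dir="runs/my_model")
trainer = AdapterTrainer(        # correct: an AdapterTrainer instance
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
# trainer = "AdapterTrainer"     # wrong: calling .evaluate() on a string
#                                # raises the AttributeError shown above
metrics = trainer.evaluate()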
Additionally, these are the model details (complete code is here . It uses some custom Python files as imports, but it may still be helpful as a reference. With my setup this code is able to save the model including the adapter modules and heads):
from transformers import AutoConfig, AutoAdapterModel
from transformers import AdapterConfig
from transformers.adapters.composition import Stack  # needed for Stack below

lang1 = 'en'
dataset1 = 'eng.rst'
lang2 = 'de'
dataset2 = 'deu.rst'

# BERT_MODEL and labels1 are defined elsewhere in the linked code
config = AutoConfig.from_pretrained(
    BERT_MODEL,
)
model = AutoAdapterModel.from_pretrained(
    BERT_MODEL,
    config=config,
)

# Load the language adapter
lang_adapter_config = AdapterConfig.load("pfeiffer", reduction_factor=2)
model.load_adapter(lang1 + "/wiki@ukp", config=lang_adapter_config)

# Add a new task adapter
model.add_adapter("disrpt")

# Add a classification head for our target task
num_labels = len(set(labels1.names))
head_name = "disrpt-" + dataset1.replace('.', '-')
print('Total prediction labels: ', num_labels)
model.add_classification_head(head_name, num_labels=num_labels)

# Set the task adapter as the only trainable adapter
model.train_adapter(["disrpt"])

# Unfreeze and activate stack setup: language adapter stacked on task adapter
lang = lang1
model.active_adapters = Stack(lang, "disrpt")
model.active_head = head_name
lang = dataset1
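For completeness, a hedged sketch of a save step that persists everything with this setup; the output directory is illustrative, and the exact calls in the linked code may differ:
# Assumed save step: persist base weights plus adapters and heads.
output_dir = "saved_model"
model.save_pretrained(output_dir)     # full model weights + config
model.save_all_adapters(output_dir)   # each adapter in its own subfolder
model.save_all_heads(output_dir)      # prediction heads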
Additional info: adapter-transformers==3.2.1, transformers==4.20.1, torch==1.12.1+cu102
@erzaliator I'm missing a piece: are you saying it is possible to save a model with an adapter so it can later be loaded for inference without using the adapter library?
🚀 Feature request
Request for AdapterTrainer to support saving complete model
Motivation
AdapterTrainer's _save() function saves the adapter modules and heads but does not save the pretrained model weights.
I encountered this issue while using
--load_best_model_at_end=True
in the training arguments. I get the warning: Could not locate the best model at runs/my_current_model/checkpoint-70/pytorch_model.bin, if you are running a distributed training on multiple nodes, you should activate --save_on_each_node.
This is because the default AdapterTrainer saves only the adapters and heads, not the rest of the model weights.
So, am I missing some argument that can explicitly save a pytorch_model.bin file? Otherwise, I believe it would be nice to have a feature in AdapterTrainer to save either the adapter modules alone or the complete model.
I am using Pfeiffer's MAD-X task and language adapters.
Your contribution
As of now, I am manually saving the best model using a callback that calls
model.save_pretrained()
as follows (the best model is subsequently loaded manually via from_pretrained):
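(A minimal sketch of such a callback, assuming the best model is tracked by eval loss; the class name, metric key, and save directory are illustrative, not the exact code from the issue.)
from transformers import TrainerCallback

class SaveFullModelCallback(TrainerCallback):
    """Save the complete model (base weights + adapters + heads) whenever
    the eval loss improves, since AdapterTrainer only saves adapters."""

    def __init__(self, save_dir):
        self.save_dir = save_dir
        self.best_loss = None

    def on_evaluate(self, args, state, control, model=None, metrics=None, **kwargs):
        eval_loss = (metrics or {}).get("eval_loss")
        if eval_loss is not None and (self.best_loss is None or eval_loss < self.best_loss):
            self.best_loss = eval_loss
            model.save_pretrained(self.save_dir)  # full weights, not just adapters

The saved best model can then be reloaded manually, e.g. with AutoAdapterModel.from_pretrained(save_dir).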