adapter-hub / adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning
https://docs.adapterhub.ml
Apache License 2.0

Request for AdapterTrainer to support saving entire model #531

Open erzaliator opened 1 year ago

erzaliator commented 1 year ago

🚀 Feature request

Request for AdapterTrainer to support saving the complete model

Motivation

AdapterTrainer's _save() function saves the adapter modules and heads, but it does not save the underlying pretrained model weights.

I encountered this issue while using load_best_model_at_end=True in the training arguments. I get the warning: Could not locate the best model at runs/my_current_model/checkpoint-70/pytorch_model.bin, if you are running a distributed training on multiple nodes, you should activate --save_on_each_node.

This is because the default AdapterTrainer saves only the adapters and heads, not the rest of the model weights.

So, am I missing an argument that would explicitly save a pytorch_model.bin file? Otherwise, I believe it would be nice to have a feature in AdapterTrainer to save either the adapter modules or the complete model.

I am using Pfeiffer's MAD-X task and language adapters.

Your contribution

As of now, I am manually saving the best model from a callback that calls model.save_pretrained(); the best model is then loaded manually via from_pretrained():


from copy import deepcopy
from transformers import TrainerCallback

best_acc = -1

class CustomCallback(TrainerCallback):
    def __init__(self, trainer) -> None:
        super().__init__()
        self._trainer = trainer

    def on_epoch_end(self, args, state, control, **kwargs):
        if control.should_evaluate:
            global best_acc, model, save_best_model_path
            print('USING HEAD: ', model.active_head)
            control_copy = deepcopy(control)
            output_metrics = self._trainer.evaluate(eval_dataset=self._trainer.train_dataset, metric_key_prefix="train@"+lang)
            acc = output_metrics['acc']
            # Save the full model (not just the adapters) whenever accuracy improves
            if state.global_step < state.max_steps and best_acc <= acc:
                print('Saving the model using CustomCallback: ', save_best_model_path)
                model.save_pretrained(save_best_model_path)
                best_acc = acc
            return control_copy

training_args = TrainingArguments(
    seed=SEED,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    logging_strategy="epoch",
    learning_rate=lr,
    num_train_epochs=epoch2,
    per_device_train_batch_size=batch_size2,
    per_device_eval_batch_size=batch_size2,
    output_dir=MODEL_DIR+'_'+lang,
    overwrite_output_dir=False,
    # The next line is important to ensure the dataset labels are properly passed to the model
    remove_unused_columns=False,
    save_total_limit=1,
    load_best_model_at_end=True,
    # resume_from_checkpoint=MODEL_DIR+'/last-checkpoint',
)

trainer = AdapterTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset2,
    eval_dataset=valid_dataset2,
    compute_metrics=compute_metrics,
)

trainer.add_callback(CustomCallback(trainer))
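
An alternative workaround could be to subclass AdapterTrainer and extend its _save() so that each checkpoint also contains the full model weights, letting load_best_model_at_end find pytorch_model.bin. This is only a rough, untested sketch assuming _save() keeps the standard Trainer signature; FullModelAdapterTrainer is a hypothetical name:

from transformers.adapters import AdapterTrainer  # import path in adapter-transformers 3.x; adjust if needed

class FullModelAdapterTrainer(AdapterTrainer):  # hypothetical subclass name
    def _save(self, output_dir=None, state_dict=None):
        output_dir = output_dir if output_dir is not None else self.args.output_dir
        # Let AdapterTrainer save the adapter modules and heads as usual
        super()._save(output_dir=output_dir, state_dict=state_dict)
        # Additionally save the complete model, so pytorch_model.bin exists in the checkpoint
        self.model.save_pretrained(output_dir)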
ghost commented 1 year ago

It's possible to save the entire model using the transformers library. However, when loading the model for inference, the default configuration does not include the additional adapter parameters, so the model will only run inference with the non-adapter weights. I haven't tried it myself, but you could try running inference with a custom config that includes the adapters and see if that works.
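
For illustration, a rough sketch of that idea (the paths are placeholders and the adapter/head names follow the setup erzaliator posts below; whether the adapter setup is restored automatically from the saved config depends on the library version, so it is re-activated explicitly here):

from transformers import AutoAdapterModel

# Save the full model; the state dict includes the adapter parameters as well
model.save_pretrained("runs/full_model")  # placeholder path

# Reload for inference and re-activate the adapter setup explicitly
model = AutoAdapterModel.from_pretrained("runs/full_model")
model.set_active_adapters("disrpt")   # task adapter name from the setup below
model.active_head = "disrpt-eng-rst"  # classification head name from the setup below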

erzaliator commented 1 year ago

@Celestinian, on the contrary: judging from the library's behavior, the adapter library saves only the adapter modules and heads, not the BERT weights. As a result, the model attempts to load only the adapter weights. I suspect the trainer.state variable needs to be corrected to store both the adapter and BERT weights.

Ch-rode commented 1 year ago

@erzaliator Thank you for bringing this up. I'm sorry to hear that you're experiencing the same frustrating issue. However, when I attempted to use your suggested callback method, I encountered the following error: AttributeError: 'str' object has no attribute 'evaluate'. @calpt, do you have any suggestions?

erzaliator commented 1 year ago

@Ch-rode can you check that the value passed to CustomCallback is the trainer instance itself? It should not be initialized as a string object.
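
In other words, my guess at the cause:

# The AttributeError suggests the callback was constructed with a string
# rather than the trainer object itself:
trainer.add_callback(CustomCallback("trainer"))  # wrong: self._trainer.evaluate() fails on a str
trainer.add_callback(CustomCallback(trainer))    # correct: pass the AdapterTrainer instance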

Additionally, these are the model details (complete code is here. It uses some custom Python files as imports, but it may still be helpful as a reference. With my setup, this code is able to save the model including the adapter modules and heads):

from transformers import AutoConfig, AutoAdapterModel
from transformers import AdapterConfig
from transformers.adapters.composition import Stack  # needed for the Stack(...) call below

lang1 = 'en'
dataset1 = 'eng.rst'
lang2 = 'de'
dataset2 = 'deu.rst'

config = AutoConfig.from_pretrained(
    BERT_MODEL,
)
model = AutoAdapterModel.from_pretrained(
    BERT_MODEL,
    config=config,
)

# Load the language adapter
lang_adapter_config = AdapterConfig.load("pfeiffer", reduction_factor=2)
model.load_adapter(lang1+"/wiki@ukp", config=lang_adapter_config)

# Add a new task adapter
model.add_adapter("disrpt")

# Add a classification head for our target task
num_labels = len(set(labels1.names))
head_name = "disrpt-"+dataset1.replace('.', '-')
print('Total prediction labels: ', num_labels)
model.add_classification_head(head_name, num_labels=num_labels)

# set trainable adapter
model.train_adapter(["disrpt"])

# Unfreeze and activate stack setup
lang = lang1
model.active_adapters = Stack(lang, "disrpt")
model.active_head = head_name
lang = dataset1

Additional info: adapter-transformers==3.2.1, transformers==4.20.1, torch==1.12.1+cu102
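
For completeness, persisting just the adapters and heads (rather than the full model) could look like this with the adapter-transformers 3.x API; the directory paths are placeholders:

# Save the individual components of the setup above
model.save_adapter("saved/lang_adapter", "en")    # MAD-X language adapter
model.save_adapter("saved/task_adapter", "disrpt")
model.save_head("saved/head", "disrpt-eng-rst")

# Restore them later on a freshly instantiated base model
model.load_adapter("saved/lang_adapter")
model.load_adapter("saved/task_adapter")
model.load_head("saved/head")
model.active_adapters = Stack("en", "disrpt")
model.active_head = "disrpt-eng-rst"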

vabatta commented 10 months ago

@erzaliator I'm missing a piece: are you saying it is possible to save a model with an adapter so that it can later be loaded for inference without using the adapter library?