huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Unable to save pretrained model after finetuning: trainer.save_pretrained(modeldir) AttributeError: 'Trainer' object has no attribute 'save_pretrained' #14828

Closed. keloemma closed this issue 2 years ago.

keloemma commented 2 years ago

Environment info

Who can help @LysandreJik, @stas00, @sgugger

Information

Model I am using (Bert, XLNet ...): Flaubert model

The problem arises when using:

I tried to save the best model after training, and I got that error.

The task I am working on is:

Steps to reproduce the behavior:

  1. I used the official notebook, changing only the model name to Flaubert.
  2. I defined my own training arguments and trainer.
  3. I loop over a directory to load my dataset, call the trainer, and then train, evaluate, and save (this is where the error appears).

from transformers import FlaubertTokenizer, Trainer, TrainingArguments

tokenizer = FlaubertTokenizer.from_pretrained(model_name, do_lowercase=True)
model = FlauBertForSequenceClassification(config=mdl.config, num_labels=num_labels, freeze_encoder=False)

train_dataset = dataset.map(
    lambda x: tokenizer(x['verbatim'], padding="max_length", truncation=True, max_length=512),
    batched=True)
train_dataset.set_format(type='torch', columns=['input_ids', 'token_type_ids', 'attention_mask'])

training_args = TrainingArguments(
    output_dir=output_dir,            # output directory
    num_train_epochs=1.0,             # total number of training epochs
    per_device_train_batch_size=8,    # batch size per device during training; can increase if memory allows
    per_device_eval_batch_size=8,     # batch size for evaluation; can increase if memory allows
    save_steps=50,                    # number of update steps between checkpoint saves
    save_total_limit=2,               # limit the total number of checkpoints; older ones are deleted
    logging_first_step=True,
    evaluation_strategy='epoch',      # evaluation strategy to adopt during training
    eval_steps=10,                    # number of update steps between evaluations (only used with the 'steps' strategy)
    #warmup_steps=50,                 # number of warmup steps for the learning rate scheduler
    weight_decay=0.01,                # strength of weight decay
    logging_dir=logging_dir,          # directory for storing logs
    logging_steps=10,
    learning_rate=5e-5,
    load_best_model_at_end=True,
    #save_strategy='no'
)

trainer = Trainer(
    model=model,                      # the instantiated 🤗 Transformers model to be trained
    args=training_args,               # training arguments, defined above
    train_dataset=train_dataset,      # training dataset
    eval_dataset=val_dataset,         # evaluation dataset
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    #callbacks=[EarlyStoppingCallback(3, 0.0)]  # early stopping if results don't improve after 3 epochs
)

modeldir = './path_to_save_model'
trainer.train()
trainer.save_pretrained(modeldir)     # AttributeError: 'Trainer' object has no attribute 'save_pretrained'
tokenizer.save_pretrained(modeldir)

For the experiments I used the Jean Zay cluster.

Expected behavior

stas00 commented 2 years ago

Not sure where you took that code from, but indeed the Trainer doesn't have such a method.

What you want is:

model.save_pretrained(modeldir)
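
For example, the full save/reload round trip looks something like this (a minimal sketch, assuming the stock FlaubertForSequenceClassification rather than the custom subclass from the report):

from transformers import FlaubertForSequenceClassification, FlaubertTokenizer

# save only the parts needed to reuse or share the fine-tuned model
model.save_pretrained(modeldir)       # writes the config and the model weights
tokenizer.save_pretrained(modeldir)   # writes the tokenizer files alongside them

# later, restore both from the same directory
model = FlaubertForSequenceClassification.from_pretrained(modeldir)
tokenizer = FlaubertTokenizer.from_pretrained(modeldir)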
sgugger commented 2 years ago

Or trainer.save_model(modeldir), which will call the method Stas mentioned.
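
For example, a single call then handles both parts (a small sketch; save_model also writes the tokenizer when one was passed to the Trainer, as in the snippet above):

trainer.save_model(modeldir)   # internally calls model.save_pretrained(modeldir)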

anjanvb commented 1 year ago

Hey folks, I couldn't get save_model to work.

[Screenshot: the save_model call failing]

I am guessing push_to_hub isn't the only option we've got, right? My trainer is:

from setfit import SetFitModel, SetFitTrainer
from sentence_transformers.losses import CosineSimilarityLoss

# Load a SetFit model from Hub
model_id = "sentence-transformers/all-mpnet-base-v2"
model = SetFitModel.from_pretrained(model_id)

# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    loss_class=CosineSimilarityLoss,
    metric="accuracy",
    batch_size=64,
    num_iterations=20, 
    num_epochs=1, 
)

# Train and evaluate
trainer.train()
metrics = trainer.evaluate()
stas00 commented 1 year ago

You're calling save_pretrained on the wrong object, please see my comment: https://github.com/huggingface/transformers/issues/14828#issuecomment-997433709 (edit: I see you edited your post to remove this failure).

As for save_model, we don't know what SetFitTrainer is. If you use the normal Trainer object, it has save_model:

https://github.com/huggingface/transformers/blob/5db9abde439bc02c3791da2a4fefee80d94d5b96/src/transformers/trainer.py#L2608
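
Simplified, that method boils down to something like this (a rough sketch, not the exact source):

def save_model(self, output_dir=None):
    output_dir = output_dir if output_dir is not None else self.args.output_dir
    self.model.save_pretrained(output_dir)          # delegates to the model's own save_pretrained
    if self.tokenizer is not None:
        self.tokenizer.save_pretrained(output_dir)  # keeps the tokenizer next to the weights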

anjanvb commented 1 year ago

Sorry, I updated my comment; I am using save_model (not save_pretrained). My trainer is as shown below. My assumption was that SetFitTrainer is fundamentally of type Trainer? I could be wrong.

<setfit.trainer.SetFitTrainer at 0x7fe8b748e710>
stas00 commented 1 year ago

In my last comment I showed you that transformers.Trainer has save_model.

And I repeat, I have no idea what setfit.trainer.SetFitTrainer is; it's not a transformers class.

anjanvb commented 1 year ago

Sorry, I should've referred to the SetFit repo about this. I'll log an issue there. Thanks.

stas00 commented 1 year ago

It is not a subclass of transformers.Trainer as far as I can see:

https://github.com/huggingface/setfit/blob/f777c2c60b270604dae0dc1db4eea815e8c9019d/src/setfit/trainer.py#L28

I suppose it looks like transformers.Trainer but it's a totally independent implementation. So you will have to ask for this feature at that other project.

Or use model.save_pretrained(modeldir), which always works, since it's a feature of the transformers models.
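
For the SetFit case above that would be (a sketch; SetFitModel inherits compatible save_pretrained/from_pretrained methods from the huggingface_hub model mixin, so the same pattern applies):

# save just the model, then restore it without the trainer
trainer.model.save_pretrained('./custom-setfit-model')
model = SetFitModel.from_pretrained('./custom-setfit-model')
model.predict(["some text"])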

anjanvb commented 1 year ago

Yeah, I realized that it wasn't a subclass.

anjanvb commented 1 year ago

So I guess exports to OpenVINO and ONNX are supported for now. The only other way I could think of is using joblib or pickle. This worked:

import joblib

# pickle the whole trainer to disk, then restore it and run inference
joblib.dump(trainer, './model/custom-setfit-model.joblib')
trainer = joblib.load('./model/custom-setfit-model.joblib')
trainer.model.predict(["text", "text"])

P.S.: model.save_pretrained works too.

stas00 commented 1 year ago

save_pretrained isn't just for saving/restoring objects on resume; its primary use is to save just the parts that are needed by from_pretrained and/or for sharing the results with others. But otherwise your joblib method is just fine for resumes.

Shruthipriya-BS commented 12 months ago

Hello, I am not able to save the pytorch_model.bin and config files. My code:

from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,        # passing the PEFT config
    dataset_text_field="text",      # the required text column
    args=training_arguments,        # training arguments
    tokenizer=tokenizer,            # tokenizer
    packing=False,
    max_seq_length=512,
)
trainer.train()

# take care of distributed/parallel training
model_to_save = trainer.model.module if hasattr(trainer.model, 'module') else trainer.model
model_to_save.save_pretrained("outputs")
tokenizer.save_pretrained('outputs')

import gc
gc.collect()

trainer.save_model('modeldir')

Could you please help me with how I can get those files?

amyeroberts commented 12 months ago

Hi @Shruthipriya-BS,

If you believe there is a bug in the code, could you open a new issue? Please make sure to follow the issue template, providing a minimal code example we can use to reproduce the issue and information about the running environment. In the snippet in your comment, we don't have access to model or dataset, so we can't reproduce it on our side. Code examples should also be in markdown code formatting, i.e. between a pair of three backticks: ``` code goes here ```