CarperAI / trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
MIT License

Question about trainer.save_pretrained #412

Open c-box opened 1 year ago

c-box commented 1 year ago

🚀 The feature, motivation, and pitch

Here is the reply for #365 :

Assume your checkpoint output is best_checkpoint/pytorch_model/mp_rank_00_model_states.pt. You can try something like this:

import torch
# Import the model architecture used during training and load the weights
from trlx.models.modeling_ppo import AutoModelForCausalLMWithHydraValueHead

model = AutoModelForCausalLMWithHydraValueHead.from_pretrained("...")
model.load_state_dict(torch.load("best_checkpoint/pytorch_model/mp_rank_00_model_states.pt")["module"])
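The `["module"]` indexing above mirrors the layout DeepSpeed uses when it saves model states: the weights sit under the "module" key of the checkpoint dict. A self-contained sketch of that pattern with a toy torch module (no trlx needed; the file name is illustrative):

```python
import os
import tempfile

import torch

# Build a small model and save it in a DeepSpeed-like layout,
# with the state dict nested under the "module" key.
net = torch.nn.Linear(4, 2)
ckpt = {"module": net.state_dict()}

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "mp_rank_00_model_states.pt")
    torch.save(ckpt, path)

    # Restore into a fresh instance of the same architecture.
    fresh = torch.nn.Linear(4, 2)
    fresh.load_state_dict(torch.load(path)["module"])

print(torch.equal(net.weight, fresh.weight))  # True
```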

Alternatively, you can save your model directly in the Hugging Face format; see https://github.com/CarperAI/trlx#save-the-resulting-model-to-a-hugging-face-pretrained-language-model-ready-to-upload-to-the-hub.

I have another question related to this issue. When executing:

trainer = trlx.train(config=config, reward_fn=lambda samples, **kwargs: [float(int(sample)) for sample in samples])
trainer.save_pretrained('/path/to/output/folder/')

does the trainer store the last checkpoint or the best checkpoint? I suspect it is the last checkpoint. If so, how can I save the best checkpoint so that I can load it using:

AutoModelForCausalLM.from_pretrained(path)


maxreciprocate commented 1 year ago

@c-box Hey, you are correct: it is the last checkpoint that gets saved when executing that piece of code. However, your request was recently addressed in https://github.com/CarperAI/trlx/pull/429. Now, by setting trainer.save_optimizer: False, you will be able to load the best checkpoint with just AutoModelForCausalLM.from_pretrained(path), where path will be trainer.checkpoint_dir/best_checkpoint and the best checkpoint will be saved automatically.
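For reference, once a checkpoint is in Hugging Face format, the load is a plain `from_pretrained` call. A minimal round-trip sketch using a tiny randomly initialised GPT-2 so it needs no download; in practice the directory would be trainer.checkpoint_dir/best_checkpoint (the temporary directory here is just a stand-in):

```python
import tempfile

from transformers import AutoModelForCausalLM, GPT2Config, GPT2LMHeadModel

# Tiny GPT-2 with random weights, built locally (no network access needed).
config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=100)
model = GPT2LMHeadModel(config)

with tempfile.TemporaryDirectory() as path:
    # Save in Hugging Face format (config.json + weights),
    # then reload with the generic Auto class.
    model.save_pretrained(path)
    reloaded = AutoModelForCausalLM.from_pretrained(path)

print(type(reloaded).__name__)  # GPT2LMHeadModel
```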