Converting a checkpoint to the Hugging Face format with litgpt convert creates a model.pth file. You then have to load it as described in the tutorial:
import torch
from transformers import AutoModel
state_dict = torch.load("output_dir/model.pth")
model = AutoModel.from_pretrained(
"output_dir/", local_files_only=True, state_dict=state_dict
)
But we should make it work like this:
model = AutoModel.from_pretrained("output_dir")
The only blocker is that from_pretrained requires pytorch_model.bin to be loadable with weights_only=True. Our checkpoints don't satisfy this constraint because we save them using the incremental pickle save. See #1357 for more context, where we had to work around this.
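One possible direction, sketched below, is to re-save the converted state dict with a plain torch.save so the resulting file contains only tensors and can be read back with weights_only=True. The helper name resave_for_from_pretrained is hypothetical and not part of litgpt. The sketch assumes the converted model.pth can be read with torch.load, as in the tutorial snippet above, and that from_pretrained then picks up pytorch_model.bin from the same directory next to the config.

```python
import os
import tempfile

import torch


def resave_for_from_pretrained(src_path, out_dir):
    """Hypothetical helper: re-save a converted state dict as pytorch_model.bin."""
    # Assumption: the source file is a trusted checkpoint, so it is read
    # without the weights_only restriction.
    state_dict = torch.load(src_path, map_location="cpu", weights_only=False)
    dst_path = os.path.join(out_dir, "pytorch_model.bin")
    # A plain torch.save of a dict of tensors can be loaded with weights_only=True.
    torch.save(state_dict, dst_path)
    return dst_path
```

This materializes the full state dict in memory, which gives up the memory savings of the incremental save, so it may not suit very large checkpoints.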