aws-neuron / transformers-neuronx

Apache License 2.0

save_split seems to be broken after transformers made safetensor serialization default #55

Closed — jitto closed this 4 months ago

jitto commented 8 months ago

Relevant transformers PR: https://github.com/huggingface/transformers/pull/27064

The `save_split` path calls `model.save_pretrained(save_directory, save_function=save_split, max_shard_size='10000GB')`. Since `safe_serialization` now defaults to `True`, `save_pretrained` serializes the weights with safetensors and never invokes the custom `save_function`.
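A minimal sketch of the dispatch behavior described above (names and signatures are simplified for illustration; the real `transformers` implementation differs):

```python
# Hypothetical simplification of the branch inside save_pretrained:
# when safe_serialization is True (the new default), weights go through
# safetensors and the user-supplied save_function is silently skipped.

def save_pretrained(state_dict, save_function, safe_serialization=True):
    if safe_serialization:
        # safetensors path -- save_function (e.g. save_split) is bypassed
        return "safetensors"
    # legacy path -- the custom save_function actually runs
    save_function(state_dict)
    return "save_function"

calls = []
# With the new default, the Neuron-specific save_function is never called:
assert save_pretrained({}, calls.append) == "safetensors"
assert calls == []
# Forcing the legacy path makes it run again:
assert save_pretrained({}, calls.append, safe_serialization=False) == "save_function"
assert calls == [{}]
```

This is why downgrading below 4.35.0 (where the default flipped) restores the old behavior without any code change.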

You can reproduce this with the latest transformers package by following https://github.com/aws-neuron/aws-neuron-samples/blob/master/torch-neuronx/transformers-neuronx/inference/meta-llama-2-13b-sampling.ipynb

aws-rhsoln commented 8 months ago

We were able to reproduce the issue, and the fix should be available in the upcoming release. For now, can you downgrade transformers with the following command: `pip install "transformers<4.35.0"`

jitto commented 8 months ago

> We were able to reproduce the issue, and the fix should be available in the upcoming release. For now, can you downgrade transformers with the following command: `pip install "transformers<4.35.0"`

Thanks for the quick response. I can confirm that `save_split` works fine with transformers v4.34.1.

gsnaws commented 6 months ago

This is resolved in the recent Neuron 2.16 release: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/torch/transformers-neuronx/index.html?highlight=safetensors#what-s-new-in-this-release