huggingface / optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.

Apache License 2.0


Save checkpoint weights in safetensors format when exporting decoder models #568

Closed: a-ys closed this 2 months ago

a-ys commented 2 months ago

What does this PR do?

This PR changes decoder model export to save the checkpoint weights as .safetensors instead of the Transformers-NeuronX split-model format. Model loading is unchanged, because the from_pretrained() function in Transformers-NeuronX supports .safetensors natively in Neuron 2.18.
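
For readers unfamiliar with the target format, here is a minimal sketch of what a single .safetensors file contains: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. It uses only the Python standard library and is not the code from this PR; the helper names (write_safetensors, read_safetensors) and the tensor name are hypothetical, and only float32 ("F32") is handled. In practice the safetensors library would do this work.

```python
import json
import struct


def write_safetensors(path, tensors):
    """Write float32 tensors to a .safetensors file (illustrative helper).

    tensors maps a name to (shape, flat list of float values).
    """
    header = {}
    chunks = []
    offset = 0
    for name, (shape, values) in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)  # little-endian float32
        header[name] = {
            "dtype": "F32",
            "shape": list(shape),
            # Offsets are relative to the start of the data section.
            "data_offsets": [offset, offset + len(raw)],
        }
        chunks.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header size
        f.write(header_bytes)
        for raw in chunks:
            f.write(raw)


def read_safetensors(path):
    """Read back the float32 tensors written by write_safetensors."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        data = f.read()
    tensors = {}
    for name, meta in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        begin, end = meta["data_offsets"]
        count = (end - begin) // 4  # 4 bytes per float32
        values = list(struct.unpack(f"<{count}f", data[begin:end]))
        tensors[name] = (tuple(meta["shape"]), values)
    return tensors
```

Because every tensor is described in one header and stored in one file, a loader can locate any weight by name and byte range without unpickling anything, which is the main contrast with a per-tensor split layout.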

This fixes issue #565.

I am submitting this as a draft because the changes are untested: I have not been able to install the modified version of optimum-neuron from source.

a-ys commented 2 months ago

Did not see #567