What does this PR do?
This PR changes model export to save model weights as .safetensors files instead of the Transformers-NeuronX split model format. Model loading is unchanged, because the from_pretrained() function in Transformers-NeuronX supports .safetensors natively in Neuron 2.18.
Fixes #565.
Submitting this as a draft because the changes are untested: I haven't been able to install the modified version of optimum-neuron from source.
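
Until the export path can be tested end to end, one possible sanity check is to inspect the header of an exported .safetensors file. The sketch below is not part of this PR and does not use optimum-neuron or Transformers-NeuronX. It relies only on the published safetensors layout (an 8-byte little-endian header length, a JSON header, then the raw tensor bytes). The helper names `read_safetensors_header` and `write_tiny_safetensors` and the file name `model.safetensors` are hypothetical, chosen for illustration.

```python
import json
import os
import struct
import tempfile


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The header is UTF-8 JSON mapping tensor names to dtype/shape/offsets.
        return json.loads(f.read(header_len).decode("utf-8"))


def write_tiny_safetensors(path):
    """Hand-write a one-tensor file so this sketch runs without extra packages.

    A real export would use the safetensors library; this stand-in is an
    assumption made only to keep the example self-contained.
    """
    data = struct.pack("<2f", 1.0, 2.0)  # two float32 values, 8 bytes
    header = {
        "weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(data)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.safetensors")  # hypothetical file name
        write_tiny_safetensors(path)
        for name, info in read_safetensors_header(path).items():
            if name == "__metadata__":  # optional free-form metadata entry
                continue
            print(name, info["dtype"], info["shape"])
```

Pointing `read_safetensors_header` at a file produced by the modified export should list the expected weight names and shapes. It only confirms that the file is well-formed, not that Transformers-NeuronX loads it correctly.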