philschmid closed this 11 months ago
Thank you for reaching out. We have this in the roadmap and will let you know when it is available.
Hi @aws-ennst @hannanjgaws @mmcclean-aws I had a few things to clarify:
Edit 1: Sorry team, my mistake. I should not have compared transformers-neuronx with torch-neuronx. I missed the fact that the transformers-neuronx library is not just for tensor parallelism but also for autoregressive tasks. I think that answers both of my questions above. I will wait for serialization support; thanks in advance.
The latest release includes serialization: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release. Please take a look and see whether it covers your model of interest: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/transformers-neuronx/transformers-neuronx-developer-guide.html#serialization-support-beta
Hello, is there an easy way to add serialization to models other than GPT2? GPT2 has a `_save_compiled_artifacts` method to save compiled artifacts to disk and load them again. That would be convenient for other models as well, since compiling GPT-J, for example, takes 5-10 minutes. I looked at the code, but it seems there was a design change.
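To make the request concrete, here is a rough sketch of the compile-once, load-afterwards pattern I have in mind. It does not use any transformers-neuronx API: `expensive_compile`, `cache_key`, and `compile_or_load` are hypothetical names, the "compiled artifact" is a plain dict, and `pickle` is only a stand-in for whatever on-disk format the library would actually use.

```python
import hashlib
import pickle
import time
from pathlib import Path
from typing import Tuple


def expensive_compile(config: dict) -> dict:
    """Hypothetical stand-in for a slow compile step (not a Neuron API)."""
    time.sleep(0.1)  # pretend this is the multi-minute compilation
    return {"config": config, "artifact": sorted(config.items())}


def cache_key(config: dict) -> str:
    """Derive a stable file name from the settings that affect compilation."""
    blob = repr(sorted(config.items())).encode()
    return hashlib.sha256(blob).hexdigest()[:16]


def compile_or_load(config: dict, cache_dir: Path) -> Tuple[dict, bool]:
    """Return (artifact, cache_hit); compile only when nothing is on disk."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(config)}.pkl"
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f), True
    compiled = expensive_compile(config)
    with path.open("wb") as f:
        pickle.dump(compiled, f)
    return compiled, False
```

The first call with a given config pays the compile cost and writes the artifact; later calls with the same config read it back from `cache_dir`. Keying the file on the config matters because an artifact compiled for one batch size or sequence length would presumably not be valid for another.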