aws-neuron / transformers-neuronx

Add Serialization to other models #14

Closed · philschmid closed this issue 11 months ago

philschmid commented 1 year ago

Hello, is there any easy way to add serialization to models other than GPT2? GPT2 has a _save_compiled_artifacts method to save compiled artifacts to disk and load them later. That would be convenient for other models as well, since compiling, e.g., GPT-J takes 5-10 minutes.

I looked at the code but it seems there was a design change.
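For context, a rough sketch of the GPT2-only flow being referred to here, based on the public transformers-neuronx examples at the time; the paths are placeholders, and the _load_compiled_artifacts counterpart shown for the round trip is an assumption, since only _save_compiled_artifacts is named in this issue:

```python
from transformers import AutoModelForCausalLM
from transformers_neuronx.module import save_pretrained_split
from transformers_neuronx.gpt2.model import GPT2ForSampling

# Split the Hugging Face checkpoint into the layout transformers-neuronx expects.
model_cpu = AutoModelForCausalLM.from_pretrained('gpt2')
save_pretrained_split(model_cpu, './gpt2-split')

# Compile for Neuron (this is the slow step that serialization would let us cache).
neuron_model = GPT2ForSampling.from_pretrained('./gpt2-split', batch_size=1, tp_degree=2, amp='f32')
neuron_model.to_neuron()

# GPT2-only at the time of this issue: persist the compiled artifacts to disk.
neuron_model._save_compiled_artifacts('./gpt2-compiled')

# On a later run, reload them before to_neuron() to skip recompilation
# (assumed counterpart method; check the GPT2 model class for the exact name).
neuron_model_2 = GPT2ForSampling.from_pretrained('./gpt2-split', batch_size=1, tp_degree=2, amp='f32')
neuron_model_2._load_compiled_artifacts('./gpt2-compiled')
neuron_model_2.to_neuron()
```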

aws-ennst commented 1 year ago

Thank you for reaching out. We have this in the roadmap and will let you know when it is available.

yogendra-yatnalkar commented 1 year ago

Hi @aws-ennst @hannanjgaws @mmcclean-aws, I had a few things to clarify:

  1. I know that saving the compiled model (i.e., serialization) will come soon, but I wanted to confirm this.
  2. transformers-neuronx looks quite easy to use compared to the stable diffusion example above (quite excited to try it out). Since we can serialize torch-neuronx models, can we make them tensor parallel manually? Is there any example of that?

Edit1: Sorry team, my mistake, I should not compare transformers-neuronx with torch-neuronx. I missed the fact that the transformers-neuronx library is not just for tensor parallelism but also for autoregressive tasks. I think that answers both of my questions above. Will wait for the serialization support, thanks in advance.
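To illustrate the tensor-parallelism point: in transformers-neuronx the sharding is requested through the tp_degree argument rather than wired up manually, and the compiled model exposes autoregressive sampling directly. A minimal sketch, assuming a GPT-J checkpoint already split into ./gptj-split with save_pretrained_split:

```python
import torch
from transformers import AutoTokenizer
from transformers_neuronx.gptj.model import GPTJForSampling

# tp_degree shards the model across NeuronCores; no manual tensor-parallel wiring is needed.
neuron_model = GPTJForSampling.from_pretrained('./gptj-split', batch_size=1, tp_degree=8, amp='f32')
neuron_model.to_neuron()  # compiles the autoregressive decoding graphs (the 5-10 minute step)

tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-j-6B')
input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids

# Autoregressive generation on the Neuron-compiled model.
with torch.inference_mode():
    generated = neuron_model.sample(input_ids, sequence_length=128)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```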

mrnikwaws commented 1 year ago

The latest release includes serialization: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release. Please take a look and check whether it covers your model of interest: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/transformers-neuronx/transformers-neuronx-developer-guide.html#serialization-support-beta
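For readers landing here later, a rough sketch of the flow the linked serialization-support section describes; the save/load method names are assumptions based on that beta documentation, so check the current guide for the exact API:

```python
from transformers_neuronx.gptj.model import GPTJForSampling

# First run: compile and persist the compiled artifacts.
model = GPTJForSampling.from_pretrained('./gptj-split', batch_size=1, tp_degree=8, amp='f32')
model.to_neuron()
model.save('./gptj-neuron-artifacts')  # assumed name per the beta serialization docs

# Subsequent runs: load the saved artifacts before to_neuron() to skip recompilation.
model = GPTJForSampling.from_pretrained('./gptj-split', batch_size=1, tp_degree=8, amp='f32')
model.load('./gptj-neuron-artifacts')  # assumed name per the beta serialization docs
model.to_neuron()
```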