Add Serialization to other models

philschmid commented 1 year ago

Hello, is there an easy way to add serialization to models other than GPT2? GPT2 has a _save_compiled_artifacts method that saves the compiled artifacts to disk so they can be loaded later. That would be convenient for other models as well, since compiling e.g. GPT-J takes 5-10 minutes.

I looked at the code but it seems there was a design change.

aws-ennst commented 1 year ago

Thank you for reaching out. We have this on our roadmap and will let you know when it is available.

yogendra-yatnalkar commented 1 year ago

Hi @aws-ennst @hannanjgaws @mmcclean-aws, I have a few things to clarify:

I know that saving compiled models (i.e. serialization) is coming soon, but I wanted to confirm this:
- This AWS blog post states that a larger inf2 instance type is required only for compilation, and that a smaller instance type can then be used for inference: https://aws.amazon.com/blogs/machine-learning/maximize-stable-diffusion-performance-and-lower-inference-costs-with-aws-inferentia2/
- But since serialization is not supported yet, we have to use a larger instance type for compilation and keep using it for inference, right?

transformers-neuronx looks quite easy to use compared to the Stable Diffusion example above (quite excited to try it out). Since we can serialize torch-neuronx models, can we make them tensor parallel manually? Is there an example of that anywhere?

Edit1: Sorry team, my mistake, I should not have compared transformers-neuronx with torch-neuronx. I missed the fact that the transformers-neuronx library is not just for tensor parallelism but also for autoregressive tasks. I think that answers both of my questions above. I will wait for serialization support, thanks in advance.

mrnikwaws commented 1 year ago

The latest release includes serialization: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release. Please take a look and see whether it covers your model of interest: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/transformers-neuronx/transformers-neuronx-developer-guide.html#serialization-support-beta
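
For readers who only want the general idea behind this thread (compile once, save the result to disk, reload it on later runs), here is a minimal sketch in plain Python. It does not use the transformers-neuronx API: `fake_compile` and `load_or_compile` are hypothetical names made up for illustration, and `pickle` stands in for whatever on-disk format the real library uses. See the developer guide linked above for the actual serialization calls.

```python
import pickle
import tempfile
from pathlib import Path


def fake_compile(model_name: str) -> dict:
    """Hypothetical stand-in for a slow compilation step (not a Neuron API)."""
    return {"model": model_name, "artifact": f"compiled::{model_name}"}


def load_or_compile(model_name: str, cache_dir: Path, compile_fn=fake_compile):
    """Return (artifact, from_cache); compile only when no saved copy exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{model_name}.pkl"

    # Cache hit: skip compilation and load the saved artifact from disk.
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f), True

    # Cache miss: pay the compilation cost once, then save for later runs.
    artifact = compile_fn(model_name)
    with path.open("wb") as f:
        pickle.dump(artifact, f)
    return artifact, False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first, first_from_cache = load_or_compile("gpt-j", Path(tmp))
        second, second_from_cache = load_or_compile("gpt-j", Path(tmp))
        print(first_from_cache, second_from_cache, first == second)
```

The second call finds the saved file and never invokes the compile function, which is the behavior requested here for models such as GPT-J. Whether saved artifacts can be reused on a different (for example smaller) instance type is a separate question that this sketch does not address.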

aws-neuron / transformers-neuronx

Add Serialization to other models #14