open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Feature Request] AWS Neuron backend #579

Open austinmw opened 2 years ago

austinmw commented 2 years ago

Describe the feature

I'd like to request adding AWS Neuron as a supported backend for deployment to AWS Inferentia chips.

Motivation

Inferentia chips are a new, very fast, and cost-effective cloud deployment option.

Related resources

PyTorch recently published a blog post describing how to use the Neuron SDK to export HuggingFace models for Inferentia serving: https://pytorch.org/blog/amazon-ads-case-study/

Here's a list of currently supported PyTorch operators: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/neuron-cc-ops/neuron-cc-ops-pytorch.html

lvhan028 commented 2 years ago

Hi @austinmw, sorry for the late reply. We are interested in supporting more platforms and devices. Let us take some time to research AWS Neuron, and I will get back to you as soon as possible.

lvhan028 commented 2 years ago

Hi @austinmw, I am afraid we won't be able to support AWS Neuron this year. The following devices are on our H2 to-do list:

austinmw commented 2 years ago

@lvhan028 Thanks for your reply. M1/M2 chip deployment is exciting!

Maybe we can circle back to this in 2023? I think Inferentia chip adoption and workloads will increase noticeably by then.

lvhan028 commented 2 years ago

@austinmw Do you have any data to support that? If so, could you share it with me so that I can report it to my boss?

austinmw commented 2 years ago

@lvhan028 Thanks. I think much more information on this will come out of the upcoming AWS Silicon Innovation Day event.

Here are some currently published customer success stories. There are a few more major customer testimonials that will be made public in the next few weeks.

Also, a note on feasibility: I've been told that since you already have a TorchScript backend, the road to Neuron support is actually very short.