facebookincubator / AITemplate

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Apache License 2.0
4.55k stars 369 forks

[FEATURE] Support for NVIDIA T4 (Turing Architecture) #12

Open philschmid opened 2 years ago

philschmid commented 2 years ago

Hello 🙋🏻‍♂️

It is very cool to see MetaAI going into inference optimization! This will help the community and companies a lot in the long term. While reading through the announcement blog post, I noticed this statement:

"AITemplate is currently enabled on NVIDIA's A100 and AMD's MI200 GPU systems, both of which are widely used today in data centers from technology companies, research labs, and cloud computing service providers."

That is awesome, but it might be a big limitation for many, since the A100 is still not very accessible. Support for the NVIDIA T4 (Turing), which is the most widely available GPU in public clouds, would be very helpful.

antinucleon commented 2 years ago

AITemplate comes out of Meta's production needs. We don't have T4/V100 GPUs, so we didn't consider them in our first release. We will pass this request on to NVIDIA to see whether they can help.

harishprabhala commented 2 years ago

Hi @philschmid. There's another open-source inference acceleration library called voltaML, which supports the T4. Please check it out:

https://github.com/VoltaML/voltaML