neo-ai / neo-ai-dlr

Neo-AI-DLR is a common runtime for machine learning models compiled by AWS SageMaker Neo, TVM, or TreeLite.
Apache License 2.0

Planned support for Neuron runtime? #447

Open michaelhagel opened 1 year ago

michaelhagel commented 1 year ago

Is support planned for NEFF and its corresponding runtime? I am currently writing a Triton DLR backend, and a unified backend entry point to the Neuron runtime when Inf1 instances are specified would be very useful.

I know Neuron uses a TVM frontend, so I understand it may be best to simply pick one path: either use the raw TVM runtime exposed by DLR, or compile the model via Neo, targeting Inf1 with Neuron. However, Neuron's use of the TVM frontend is something of a black box, and it does not allow passing a TVM .so or similar artifacts directly to neuron-cc. This limits use cases such as classical ML models compiled to TVM via HummingbirdML.