gaocegege opened this issue 4 years ago
It is related to our Python SDK. Ref #49
There are roughly two approaches to consider:
Transform all models into a unified format (e.g. ONNX) and use onnxruntime to provide inference. In this case the task needs some resources and dependencies, so it is recommended to put it in model-registry.
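A minimal sketch of this first approach, assuming a PyTorch source model; the model, file name, and tensor names here are placeholders:

```python
import torch
import onnxruntime as ort

# Placeholder model: any framework with an ONNX exporter would work here.
model = torch.nn.Linear(4, 2)
model.eval()
dummy_input = torch.randn(1, 4)

# Step 1: transform the model into the unified ONNX format.
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])

# Step 2: every converted model is served through the single onnxruntime API.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
outputs = session.run(None, {"input": dummy_input.numpy()})
print(outputs[0].shape)  # -> (1, 2)
```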
I think we are already using this approach in model registry (Triton Inference Server). But we want to support offline inference here.
We use the converted model for offline inference, but we need to convert our model first, maybe using model registry. If we use only one model format, we can provide only one library for #40.
Personally, I prefer the latter.
If we can unify the API on top of the models, we can support multiple framework formats. And if we want to support offline inference, I think we will always need an SDK.
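A sketch of what such a unified API looks like with Neuropod; the packaged-model path and the "x" tensor name are placeholders for whatever the packaged model declares:

```python
import numpy as np
from neuropod.loader import load_neuropod

# The same two calls work regardless of whether the packaged model is
# TensorFlow, TorchScript, PyTorch, or Keras underneath.
neuropod = load_neuropod("/path/to/packaged/model")  # placeholder path
results = neuropod.infer({"x": np.array([[1.0, 2.0]], dtype=np.float32)})
print(results)  # dict of output name -> numpy array
```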
/assign
Is this a BUG REPORT or FEATURE REQUEST?:
What happened:
Investigate whether we can use https://github.com/uber/neuropod to provide a unified offline batch inference interface for users. They can use the ormb Python SDK to download the model first, then use Neuropod to run offline inference.
Thanks to @terrytangyuan for introducing the project.
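A rough end-to-end sketch of the proposed flow. It shells out to the `ormb` CLI because the Python SDK surface is still being defined in #49; the registry reference, export path, and input name are all placeholders:

```python
import subprocess
import numpy as np
from neuropod.loader import load_neuropod

REF = "registry.example.com/user/model:v1"  # placeholder model reference

# Download the model from the registry first (the #49 Python SDK would
# eventually replace these CLI calls).
subprocess.run(["ormb", "pull", REF], check=True)
subprocess.run(["ormb", "export", REF], check=True)

# Then run offline batch inference through Neuropod's unified interface.
neuropod = load_neuropod("./model")  # placeholder export path
batch = {"x": np.random.rand(64, 4).astype(np.float32)}  # placeholder batch
outputs = neuropod.infer(batch)
print({name: value.shape for name, value in outputs.items()})
```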