kleveross / ormb

Docker for Your ML/DL Models Based on OCI Artifacts
Apache License 2.0
461 stars 61 forks

[feature] Provide a unified offline batch inference interface #47

Open gaocegege opened 4 years ago

gaocegege commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind feature

What happened:

Investigate whether we can use https://github.com/uber/neuropod to provide a unified offline batch inference interface for users. They could use the ormb Python SDK to download the model first, then use Neuropod to run offline inference.

Thanks to @terrytangyuan for introducing the project.
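For context, here is a rough sketch of the proposed flow, assuming the ormb CLI's pull/export commands and Neuropod's Python loading API; the registry reference, export path, and input tensor names are hypothetical placeholders, not an existing ormb SDK.

import subprocess

import numpy as np
from neuropod.loader import load_neuropod

MODEL_REF = "harbor.example.com/demo/model:v1.0"  # hypothetical registry reference

# Pull the packaged model from the registry, then unpack the model files locally.
# (A Python SDK that wraps these steps is tracked in #49.)
subprocess.run(["ormb", "pull", MODEL_REF], check=True)
subprocess.run(["ormb", "export", MODEL_REF], check=True)

# Load the exported neuropod package and run inference on a whole batch at once.
model = load_neuropod("./model")  # assumed path of the exported model files
batch = {"x": np.random.rand(1024, 10).astype(np.float32)}  # hypothetical input
outputs = model.infer(batch)
print({name: value.shape for name, value in outputs.items()})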

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

gaocegege commented 4 years ago

It is related to our Python SDK. Ref #49

judgeeeeee commented 4 years ago

There are roughly two ways to consider (a sketch of the first is shown after this list):

1. Transform all models into a unified type (e.g. ONNX) and use ONNX Runtime to provide inference. The conversion task needs some resources and dependencies, so it is recommended to run it in the model registry.
2. Keep each model in its native framework format and provide a unified inference API on top of the framework-specific runtimes.
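A minimal sketch of the first approach, assuming a PyTorch model as the source format; torch.onnx.export and onnxruntime are real APIs, but the model, file names, and tensor shapes are illustrative only.

import numpy as np
import onnxruntime as ort
import torch

# Hypothetical source model: a tiny linear layer standing in for a real model.
model = torch.nn.Linear(10, 2)
model.eval()

# Conversion step: export to the unified ONNX format. This is the part that
# needs extra resources/dependencies and could run inside the model registry.
dummy_input = torch.randn(1, 10)
torch.onnx.export(
    model, dummy_input, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

# Offline batch inference now only depends on ONNX Runtime.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
batch = np.random.rand(128, 10).astype(np.float32)
outputs = session.run(["output"], {"input": batch})
print(outputs[0].shape)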

gaocegege commented 4 years ago

Transform all models into a unified type (e.g. ONNX) and use ONNX Runtime to provide inference. The conversion task needs some resources and dependencies, so it is recommended to run it in the model registry.

I think we are already using this approach in the model registry (Triton Inference Server), but here we want to support offline inference.

judgeeeeee commented 4 years ago

Transform all models into a unified type (e.g. ONNX) and use ONNX Runtime to provide inference. The conversion task needs some resources and dependencies, so it is recommended to run it in the model registry.

I think we are already using this approach in the model registry (Triton Inference Server), but here we want to support offline inference.

We would use the converted model for offline inference, but we need to convert our model first, maybe via the model registry. If we use only one model type, we can provide only one library for #40.

gaocegege commented 4 years ago

Personally, I prefer the latter.

If we can unify the API on top of the models, we can support multiple framework formats. And if we want to support offline inference, we always need an SDK, I think.
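For illustration, a minimal sketch of what such a unified inference API could look like; the class and function names are hypothetical, not part of ormb or any existing library, and the format strings are only examples of what an ormbfile might record.

from abc import ABC, abstractmethod
from typing import Dict

import numpy as np


class BaseRunner(ABC):
    """Framework-agnostic interface that every backend implements."""

    @abstractmethod
    def infer(self, inputs: Dict[str, np.ndarray]) -> Dict[str, np.ndarray]:
        ...


class OnnxRuntimeRunner(BaseRunner):
    """Backend for models stored in the ONNX format."""

    def __init__(self, model_path: str):
        import onnxruntime as ort
        self._session = ort.InferenceSession(
            model_path, providers=["CPUExecutionProvider"])

    def infer(self, inputs):
        names = [o.name for o in self._session.get_outputs()]
        outputs = self._session.run(names, inputs)
        return dict(zip(names, outputs))


# Registry mapping a model format (as recorded in the model's metadata) to a
# backend; more runners (SavedModel, TorchScript, ...) would be added here.
_RUNNERS = {
    "ONNX": OnnxRuntimeRunner,
}


def load_model(model_path: str, model_format: str) -> BaseRunner:
    """Pick the backend that matches the model's native format."""
    return _RUNNERS[model_format](model_path)

With something like this, callers always work against BaseRunner.infer, regardless of which framework the model was trained in.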

judgeeeeee commented 4 years ago

/assign