pytorch / multipy

torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.

Question: are there any examples of inference with Python models written with NumPy (instead of PyTorch models)? #296

Open KexinFeng opened 1 year ago

KexinFeng commented 1 year ago

I notice in the introduction that

torch::deploy (MultiPy for non-PyTorch use cases) is a C++ library that enables you to run eager mode PyTorch models in production without any modifications to your model to support tracing.

Also, most of the examples provided use PyTorch. Are there any examples of inference with Python-written models (instead of PyTorch models)? For example, can I do inference here with XGBoost, LightGBM, or a simple decision tree written in Python with NumPy?
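For concreteness, the kind of pure-NumPy model being asked about might look like the following minimal sketch (a hypothetical example, not code from the MultiPy repo — only NumPy is assumed):

```python
import numpy as np

# A depth-1 decision tree ("decision stump") implemented with plain NumPy:
# it thresholds a single feature and returns one of two constant predictions.
class DecisionStump:
    def __init__(self, feature, threshold, left_value, right_value):
        self.feature = feature          # index of the feature to split on
        self.threshold = threshold      # split point
        self.left_value = left_value    # prediction when feature <= threshold
        self.right_value = right_value  # prediction otherwise

    def predict(self, X):
        # X: (n_samples, n_features) array -> (n_samples,) predictions
        return np.where(X[:, self.feature] <= self.threshold,
                        self.left_value, self.right_value)

model = DecisionStump(feature=0, threshold=0.5, left_value=0, right_value=1)
X = np.array([[0.2, 1.0],
              [0.9, -1.0]])
print(model.predict(X).tolist())  # [0, 1]
```

Such a model has no PyTorch dependency at all, which is why the question of how values cross the C++/interpreter boundary (discussed below) matters.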

PaliC commented 1 year ago

The main limitation to using non-PyTorch models is actually how we send values to and receive values from the interpreter. We use IValue, which is native to PyTorch. If you can get NumPy to convert to and from IValue, you should be able to accomplish this.
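One practical route along these lines, sketched on the Python side only (this assumes `torch` is installed; it is an illustration of the conversion idea, not MultiPy's internal mechanism): NumPy arrays convert cheaply to torch Tensors, and Tensors are natively representable as IValue on the C++ side.

```python
import numpy as np
import torch

# NumPy array -> torch Tensor (zero-copy view over the same buffer),
# and back again. A Tensor crossing into C++ is trivially an IValue.
arr = np.array([[1.0, 2.0],
                [3.0, 4.0]])
tensor = torch.from_numpy(arr)  # shares memory with arr
back = tensor.numpy()           # round-trips without copying

assert np.shares_memory(arr, back)
print(tensor.shape)  # torch.Size([2, 2])
```

The zero-copy round trip means wrapping NumPy inputs/outputs in Tensors at the boundary need not cost a data copy.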

You can also add a converter via our plugin registry https://github.com/pytorch/multipy/blob/main/multipy/runtime/interpreter/plugin_registry.h (and then add in support for the plugin).

Eventually we do hope to add a more generic interface for the interpreters. However, due to staffing issues, this is a ways away :(

KexinFeng commented 1 year ago

Thanks for the answer! There is still one thing I'm a little confused about. Intuitively, it seems that a Python script can still have a NumPy dependency. For example, per https://github.com/pytorch/multipy/blob/main/README.md#packaging-a-model-for-multipyruntime, the NumPy packages installed on the system will be searched.

Does this mean that as long as I have the NumPy package installed, the Python interpreters in MultiPy will load it at runtime and still be able to do multi-threaded inference?

Also, I indeed found that NumPy is somewhat supported by MultiPy. For example, it can be imported with the following code, I.global("numpy", "random"), which is from https://github.com/pytorch/multipy/blob/19617b95252c97547e17d598c0b2cb03d0f4e936/multipy/runtime/test_deploy.cpp#L494-L504 Given this, I'm wondering why we still need to register the NumPy interface and

get numpy to convert to and from IValue

as you mentioned above.
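As a rough Python-side reading of that C++ call (my interpretation, not MultiPy documentation): `I.global("numpy", "random")` fetches the attribute `random` from the module `numpy` inside the embedded interpreter, much like the following plain-Python equivalent:

```python
import importlib

# Roughly what I.global("numpy", "random") resolves to inside the
# interpreter: look up the module "numpy", then its attribute "random".
module = importlib.import_module("numpy")
random = getattr(module, "random")  # same object as `from numpy import random`

# The resolved object is callable/usable like any Python attribute:
sample = random.rand(3)
print(sample.shape)  # (3,)
```

So importing and running NumPy code inside an interpreter is one thing; converting the resulting Python objects across the C++ boundary is the separate concern the reply below addresses.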

PaliC commented 1 year ago

Sorry for closing this; here's the response I made to the re-asked question in #301

Yup, this is exactly it. IValue isn't needed for the internals of the interpreter; we just use the type to interact with the interpreters. For NumPy we haven't done thorough testing, so we can't provide any guarantees. Though you're right that things should generally just work (IValue does cover a lot, haha, just not everything).

For the plugins/converters (the interface I think you're referring to), we currently use IValue as an intermediary to convert a PyObject to something usable in C++. For example, on line 501 you go from PyObject -> IValue -> int. However, eventually we'd like to create a custom converter to get more coverage.

Sorry, to be more clear: if IValue works for your use cases, feel free to use it. However, if there are objects which you can't get out of an IValue, you'd want to write your own converter.