triton-inference-server / pytriton

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
https://triton-inference-server.github.io/pytriton/
Apache License 2.0

PyTriton doesn't natively support PyTorch or TensorFlow dtypes #40

Closed: dahai331 closed this issue 8 months ago

dahai331 commented 8 months ago

I was trying to use PyTorch with PyTriton:

nvidia-pytriton           0.2.5                    pypi_0    pypi

I need to pass a tensor to the server, but Python raises the following error:

TypeError: 'torch.dtype' object is not callable

I checked the source code and found the reason in the file pytriton/model_config/generator.py:

            if spec_.dtype in [np.object_, object, bytes, np.bytes_]:
                dtype = "TYPE_STRING"
            else:
                # pytype: disable=attribute-error
                dtype = spec_.dtype().dtype
                # pytype: enable=attribute-error 
                dtype = f"TYPE_{client_utils.np_to_triton_dtype(dtype)}"

This indicates that only a limited set of data types is supported. Is there a more convenient way to pass a torch tensor to the server than manually converting it to bytes? I hope PyTriton can support more data types in a future version.
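For context, here is a minimal sketch (not PyTriton code) of why that line fails. The generator calls `spec_.dtype()` and reads `.dtype` from the result, which works for a numpy scalar type such as `np.float32` because it is a callable class. A dtype instance, like `torch.float32` or `np.dtype("float32")`, is not callable. The helper `to_numpy_scalar_type` is a hypothetical name used only for illustration.

```python
import numpy as np

# A numpy scalar type is a class, so calling it and reading .dtype works
print(np.float32().dtype)  # float32

# A dtype instance is not callable, which mirrors the torch.dtype error
try:
    np.dtype("float32")()
except TypeError as exc:
    print(f"TypeError: {exc}")


def to_numpy_scalar_type(dtype_like):
    """Hypothetical helper: normalize a dtype-like value to a numpy scalar type."""
    return np.dtype(dtype_like).type


# Both a string name and a scalar type normalize to the same class
print(to_numpy_scalar_type("float32") is np.float32)  # True
```

Under this reading, the `dtype` given in a tensor specification needs to be a numpy scalar type rather than a framework-specific dtype object.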

dahai331 commented 8 months ago

Even converting the tensor to bytes does not make it run as expected.

Traceback (most recent call last):
  File "client.py", line 33, in <module>
    result_batch = client.infer_batch(y_dump,num)
  File "/home/***/mambaforge/envs/Triton/lib/python3.8/site-packages/pytriton/client/client.py", line 512, in infer_batch
    return self._infer(inputs or named_inputs, parameters, headers)
  File "/home/***/mambaforge/envs/Triton/lib/python3.8/site-packages/pytriton/client/client.py", line 539, in _infer
    if input_data.dtype == object and not isinstance(input_data.reshape(-1)[0], bytes):
AttributeError: 'bytes' object has no attribute 'dtype'

What should I do?
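The traceback shows the client reading `.dtype` on the input, so it appears to expect numpy arrays rather than raw `bytes` objects. A minimal sketch, assuming the goal really is to send opaque bytes: wrap them in an object-dtype numpy array, which satisfies the condition quoted in the traceback. The names `payload` and `wrapped` are illustrative, and whether the model accepts such an input still depends on the dtype declared for it on the server.

```python
import numpy as np

payload = b"serialized tensor bytes"  # illustrative payload

# Raw bytes have no .dtype attribute, which is what the client trips over
print(hasattr(payload, "dtype"))  # False

# Wrapping the bytes in an object-dtype array gives the client an ndarray
wrapped = np.array([payload], dtype=object)

# Mirror of the condition from the traceback: object dtype holding bytes items
is_bytes_array = wrapped.dtype == object and isinstance(wrapped.reshape(-1)[0], bytes)
print(is_bytes_array)  # True
```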

piotrm-nvidia commented 8 months ago

You can use the numpy() method of pytorch tensors to convert them into numpy arrays, and then pass them to the PyTriton client. Similarly, you can use the torch.from_numpy() function to convert numpy arrays back to pytorch tensors.

Here is an example of how to do that:

import torch
from pytriton.client import ModelClient
cl = ModelClient("http://localhost", "Test")
torch.from_numpy(cl.infer_sample(torch.zeros((2)).numpy())["tensor"])

This should give you output like this:

Out[23]: tensor([1., 1.])

This code will create a PyTriton client that connects to a server with the model name "Test". Then, it will create a pytorch tensor of zeros with shape (2,) and convert it into a numpy array. Next, it will pass the numpy array to the infer_sample() method of the client, which will send a request to the server and return a dictionary of outputs. Finally, it will extract the output tensor from the dictionary and convert it back to a pytorch tensor.

You can use a simple Triton instance with an infer_func to test your code. Here is an example of how to do that:

import numpy as np
import torch
from pytriton.decorators import batch
from pytriton.model_config import Tensor
from pytriton.triton import Triton

@batch
def _infer_fn(tensor):
    # Convert the batched numpy input to a PyTorch tensor
    tensor = torch.from_numpy(tensor)
    tensor = tensor + 1
    # Return numpy arrays keyed by output name
    return {"tensor": tensor.numpy()}

triton = Triton()
triton.bind(
    model_name="Test",
    infer_func=_infer_fn,
    inputs=[
        Tensor(name="tensor", dtype=np.float32, shape=(-1,)),
    ],
    outputs=[
        Tensor(name="tensor", dtype=np.float32, shape=(-1,)),
    ],
)

This code uses the @batch decorator so that the inference function receives its inputs as batched numpy arrays. Inside the function, torch.from_numpy() converts the input to a PyTorch tensor, some computation is performed on it, and the result is converted back to a numpy array and returned. The input and output tensors are both declared with the np.float32 dtype, which PyTriton supports.

To run this code, you need to start the Triton server with triton.run() and then use the ModelClient class to send requests to the server.

I hope this helps you with your project. If you have any questions, please let me know.

dahai331 commented 8 months ago

Thank you so much for your reply. I solved my problem before seeing your answer. The reason I wanted to pass a PyTorch tensor through PyTriton is that I had tried passing numpy data once before opening this issue, but it didn't work as expected.

tensor = tensor.numpy(force = True)

Above is how I converted the tensor into numpy data.

tensor = torch.tensor(tensor)

Above is how I converted the numpy data back into a tensor.

Then, in the inference step, I ran into problems such as the tensor having no attribute dim().

I then realized that the conversion step makes the tensor lose many attributes, which is why I wished PyTriton could support PyTorch tensors as an input dtype.

The method I found is similar to your reply, except that the tensor should be converted to numpy as follows:

tensor = tensor.cpu().detach().numpy()

This is clearly the more common method.
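The round trip can be sketched as follows, assuming only `torch` and `numpy` are installed. A numpy array has `ndim` but no `dim()` method, so the array must be turned back into a tensor before any code calls tensor methods on it. The variable names are illustrative.

```python
import numpy as np
import torch

# A tensor that tracks gradients, as model outputs often do
tensor = torch.ones(2, 3, requires_grad=True)

# Move to CPU, drop the autograd graph, then convert to numpy
array = tensor.cpu().detach().numpy()
print(hasattr(array, "dim"))  # False: ndarray exposes ndim, not dim()

# from_numpy shares memory with the array and restores tensor methods
restored = torch.from_numpy(array)
print(restored.dim())   # 2
print(restored.dtype)   # torch.float32
```

The restored tensor no longer tracks gradients, which is usually acceptable for inference.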

PyTriton is an absolutely awesome project, and I really enjoy the convenience it brings me. Thanks for making such a great project.