triton-inference-server / dali_backend

The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
MIT License

How to use scalar inputs #240

Open wq9 opened 3 months ago

wq9 commented 3 months ago

I'm trying to use a scalar input to resize a video, but can't figure out how to set the ndim parameter of external_source or the shape of the input in the client.

config.pbtxt

backend: "dali"
max_batch_size: 0

model_transaction_policy {
  decoupled: True
}

1/dali.py

import nvidia.dali as dali
from nvidia.dali.plugin.triton import autoserialize  # required so that dali.plugin.triton is available for the decorator below

@dali.plugin.triton.autoserialize
@dali.pipeline_def(batch_size=32, num_threads=4, device_id=0, output_dtype=dali.types.FLOAT, output_ndim=3)
def pipeline():
    vid = dali.fn.experimental.inputs.video(name="INPUT", sequence_length=1, device='mixed')

    height = dali.fn.external_source(name="HEIGHT", ndim=1, dtype=dali.types.INT16, repeat_last=True)
    width = dali.fn.external_source(name="WIDTH", ndim=1, dtype=dali.types.INT16, repeat_last=True)

    vid = dali.fn.resize(vid, resize_x=width, resize_y=height, mode="not_larger")  # resize
    vid = dali.fn.crop(vid, crop_w=width, crop_h=height, out_of_bounds_policy="pad")  # pad
    vid = dali.fn.squeeze(vid, axes=0)  # remove the sequence dimension
    vid = dali.fn.transpose(vid, perm=[2, 0, 1])  # HWC to CHW
    vid = dali.fn.cast(vid, dtype=dali.types.FLOAT, name="OUTPUT")  # UINT8 to FP32
    return vid

client.py (adapted from the video_decode_remap example)

...
        width = np.ones((1), dtype=np.int16)*640
        height = np.ones((1), dtype=np.int16)*360

        inputs = [
            tritonclient.grpc.InferInput("INPUT", video_raw.shape, "UINT8"),
            tritonclient.grpc.InferInput("WIDTH", width.shape, "INT16"),
            tritonclient.grpc.InferInput("HEIGHT", height.shape, "INT16"),
        ]
        inputs[0].set_data_from_numpy(video_raw)
        inputs[1].set_data_from_numpy(width)
        inputs[2].set_data_from_numpy(height)
...

If I run that, I get: unexpected shape for input 'HEIGHT' for model 'resize_224'. Expected [-1,-1], got [1]. How do I properly set the scalar values in client.py and consume them in dali.py?

banasraf commented 3 months ago

Hey @wq9

I think this should work when you add the batch dimension to the height and width inputs. So, assuming the batch size is 32 in your pipeline, the client code would look like:

width = np.ones((32, 1), dtype=np.int16)*640
height = np.ones((32, 1), dtype=np.int16)*360
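
For reference, a minimal sketch of the adjusted client snippet under that assumption; the BATCH_SIZE constant and the np.full calls are illustrative, while the input names and the video_raw tensor are the ones from the snippet above:

import numpy as np
import tritonclient.grpc

BATCH_SIZE = 32  # must match batch_size in the @dali.pipeline_def above

# One scalar per sample in the batch, hence shape (32, 1).
width = np.full((BATCH_SIZE, 1), 640, dtype=np.int16)
height = np.full((BATCH_SIZE, 1), 360, dtype=np.int16)

inputs = [
    tritonclient.grpc.InferInput("INPUT", video_raw.shape, "UINT8"),  # video_raw as defined in the original client
    tritonclient.grpc.InferInput("WIDTH", width.shape, "INT16"),
    tritonclient.grpc.InferInput("HEIGHT", height.shape, "INT16"),
]
inputs[0].set_data_from_numpy(video_raw)
inputs[1].set_data_from_numpy(width)
inputs[2].set_data_from_numpy(height)
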
wq9 commented 3 months ago

@banasraf Adding the batch dimension worked. Thanks!

However, when the input is a whole video file (video_raw = np.expand_dims(np.fromfile(FLAGS.video, dtype=np.uint8), axis=0)), the last batch contains fewer than 32 samples, so I get the error:

[/opt/dali/dali/pipeline/operator/operator.cc:43] Assert on "curr_batch_size == static_cast<decltype(curr_batch_size)>(arg.second.tvec->num_samples())" failed: 
ArgumentInput has to have the same batch size as an input.

Is there a way to pad the batch dimension?

banasraf commented 3 months ago

@wq9

Unfortunately, this operator does not allow padding the last batch, and I don't see a workaround that would make your case work properly. The only options I see are hardcoding the width and height in the pipeline, or, if you know the number of frames in the sample, predicting when to send partial width and height tensors.
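
For illustration, a minimal sketch of the first option (hardcoding the target size in the pipeline instead of passing WIDTH/HEIGHT inputs); the 640x360 values are just the example sizes from the client snippet above, nothing prescribed by DALI:

import nvidia.dali as dali
from nvidia.dali.plugin.triton import autoserialize  # required so that dali.plugin.triton is available

TARGET_W, TARGET_H = 640, 360  # hardcoded instead of the WIDTH/HEIGHT external_source inputs

@dali.plugin.triton.autoserialize
@dali.pipeline_def(batch_size=32, num_threads=4, device_id=0, output_dtype=dali.types.FLOAT, output_ndim=3)
def pipeline():
    vid = dali.fn.experimental.inputs.video(name="INPUT", sequence_length=1, device='mixed')
    vid = dali.fn.resize(vid, resize_x=TARGET_W, resize_y=TARGET_H, mode="not_larger")
    vid = dali.fn.crop(vid, crop_w=TARGET_W, crop_h=TARGET_H, out_of_bounds_policy="pad")
    vid = dali.fn.squeeze(vid, axes=0)  # remove the sequence dimension
    vid = dali.fn.transpose(vid, perm=[2, 0, 1])  # HWC to CHW
    vid = dali.fn.cast(vid, dtype=dali.types.FLOAT, name="OUTPUT")  # UINT8 to FP32
    return vid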

I'll add a task to our backlog to extend the video input operator with the option to pad the last batch.