triton-inference-server / dali_backend

The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's python API.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
MIT License

Unexpected large memory needed for gpu resize #188

Closed SingL3 closed 1 year ago

SingL3 commented 1 year ago

Description: Unexpectedly large memory needed for GPU resize

Triton Information: 2.32.0

Are you using the Triton container or did you build it yourself? yes

To Reproduce: I am using the DALI backend to process decoded images, and I want to resize and crop the images on the GPU. I tried this:

#  -*- coding:utf-8 -*-
import nvidia.dali as dali
from nvidia.dali.plugin.triton import autoserialize

@autoserialize
@dali.pipeline_def(batch_size=512, num_threads=8, device_id=0)
def pipe():
    images = dali.fn.external_source(device="cpu", name="DALI_INPUT_1")
    images = images.gpu() 
    images = dali.fn.resize(images, device="gpu", size=[
        224, 224], mode="not_smaller", interp_type=dali.types.INTERP_CUBIC, antialias=True)
    images = dali.fn.crop_mirror_normalize(images, device='gpu', mirror=False,
                                           dtype=dali.types.FLOAT, output_layout="CHW", crop=(224, 224),
                                           mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711])
    return images

My input is an FP32 tensor with shape (256, 3, 512, 512), and I got this error during inference:

InferenceServerException: Runtime error: Critical error in pipeline:
Error when executing GPU operator Resize encountered:
Can't allocate 4489633333248 bytes on device 0.
Current pipeline object is no longer valid.

4489633333248 B (4181.3 GB) is far too large; the input itself takes only 0.75 GB.

Expected behavior: Only a small amount of GPU memory is used.

kthui commented 1 year ago

Hi @SingL3, thanks for reporting the issue. I think it is related to the DALI Triton backend, so I will transfer this issue there, as the DALI backend engineers can better answer related questions.

szalpal commented 1 year ago

Hello @SingL3 !

For the fn.resize operator, data layout matters. Your input data has CHW layout, while by default Resize works on HWC layout. To use it properly, you should set the layout of the input data. Changing the external_source line to:

images = dali.fn.external_source(device="cpu", name="DALI_INPUT_1", layout='CHW')

should do the trick.

Cheers!

JanuszL commented 1 year ago

Without @szalpal's correction, each (3, 512, 512) sample is treated as a 3x512 image with 512 channels, and you are attempting to enlarge it to at least 224x224 with 512 channels.
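
As a rough check of this explanation, the arithmetic can be sketched in plain Python, without DALI. The helper names below are hypothetical. The sketch assumes that mode="not_smaller" preserves the aspect ratio and scales until both extents reach the target, that extents are rounded to the nearest integer, and that the resize output stays FP32 (4 bytes per value). Under those assumptions, the misread layout reproduces the byte count from the error message exactly.

```python
def resized_extent(height, width, target=224):
    """Output (height, width) of a 'not_smaller' resize to target x target.

    Assumption: aspect ratio is preserved and extents are rounded to nearest.
    """
    scale = max(target / height, target / width)
    return round(height * scale), round(width * scale)


def resize_output_bytes(batch, height, width, channels, bytes_per_value=4):
    """Bytes needed for the resized batch, assuming FP32 output."""
    out_h, out_w = resized_extent(height, width)
    return batch * out_h * out_w * channels * bytes_per_value


# No layout given: (3, 512, 512) is read as height=3, width=512, channels=512.
misread = resize_output_bytes(batch=256, height=3, width=512, channels=512)

# layout="CHW": the same sample is read as channels=3, height=512, width=512.
intended = resize_output_bytes(batch=256, height=512, width=512, channels=3)

print(resized_extent(3, 512))  # (224, 38229)
print(misread)                 # 4489633333248, the size in the error message
print(intended)                # 154140672, roughly 147 MiB
```

In the misread case, stretching the 3-pixel side to 224 scales the 512-pixel side to 38229, and each of those pixels carries 512 channels, which accounts for the multi-terabyte request.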

SingL3 commented 1 year ago

Thank you for your replies. That really works.