Closed dattv closed 1 month ago
Because DLA takes chw16 inputs, which means we need to pad the input to NC/16HW16 first.
@zerollzeng Thank you for your feedback! However, there is some additional information I'd like to ask about.
No, the 16 means the C dimension is padded to 16, so you actually have a 1x16x672x672 tensor, while only the first 3 channels represent the image.
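A minimal sketch of the padding described above, using NumPy (this is an illustration of the layout, not the actual reformat kernel in `matx_reformat.cu`): the 3-channel image is copied into a 16-channel buffer whose remaining 13 channels are zero, matching DLA's chw16 requirement that the channel count be a multiple of 16.

```python
import numpy as np

# Illustrative dimensions from the issue: a 3x672x672 YOLOv5 input,
# padded along C to the next multiple of 16 for DLA's chw16 format.
N, C, H, W = 1, 3, 672, 672
PAD_C = 16  # channel count rounded up to a multiple of 16

image = np.random.rand(N, C, H, W).astype(np.float16)

# Allocate the padded buffer and copy the real channels in;
# channels 3..15 stay zero and carry no image data.
padded = np.zeros((N, PAD_C, H, W), dtype=np.float16)
padded[:, :C, :, :] = image

print(padded.shape)  # (1, 16, 672, 672)
```

This is why the allocation in the sample is sized for 1x16x672x672 in FP16 even though the network only consumes 3 channels of actual image data.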
Closing since there has been no activity for several months. Thanks, all!
Hi all.
I'm wondering: the input resolution of YOLOv5 is 3x672x672, so why, in https://github.com/NVIDIA-AI-IOT/cuDLA-samples/blob/a2d645b61920fead0cf70c79506518b0a159463c/src/matx_reformat/matx_reformat.cu#L146, is the mInput1 tensor allocated with shape 1x16x672x672 (in the FP16 case)? Does anybody know why? @zerollzeng @mchi-zg