NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

Information about fn.hue and fn.hsv methods #4611

Closed: pietroorlandi closed this issue 7 months ago

pietroorlandi commented 1 year ago

Hi, I'd like to ask about something that is not clear to me. When I use fn.hue or fn.hsv, the result differs from the corresponding transformation in OpenCV. My pipeline is:

from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn

@pipeline_def
def my_pipeline():
    images, labels = fn.readers.file(file_root=r"../Images")
    images = fn.decoders.image(images)  # decoded on the CPU, RGB by default
    images = images.gpu()               # move the batch to the GPU
    labels = labels.gpu()
    augmented = fn.copy(images)
    augmented = fn.hsv(augmented, hue=-18., saturation=1., value=1.)  # or fn.hue(augmented, hue=-18.)
    return images, augmented

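and the OpenCV code is:

import cv2

img_opencv = cv2.imread("../Images/0/img.png")  # OpenCV loads images as BGR
im = cv2.cvtColor(img_opencv, cv2.COLOR_BGR2HSV)
im[:, :, 0] = (im[:, :, 0] - 18.) % 180  # shift the hue channel, wrapping at 180
im = cv2.cvtColor(im, cv2.COLOR_HSV2RGB)  # back to RGB to match the DALI output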

If I plot the two images (one obtained with the DALI pipeline and the other with OpenCV), they are a little different. The fn.hsv documentation says "For performance reasons, the operation is approximated by a linear transform in the RGB space." Is this the reason for the difference between the two images? Is it possible to achieve exactly the same transformation as the one obtained with OpenCV (considering that the hue transformation should be random)?
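One thing worth checking in a comparison like this is the unit of the hue shift. The sketch below assumes, as the DALI documentation states, that the hue argument of fn.hsv is a delta in degrees. For 8-bit images OpenCV stores hue as degrees divided by 2 (range 0..179), so the two deltas are not directly interchangeable. The helper names are hypothetical and only illustrate the conversion.

```python
def degrees_to_opencv_hue(delta_degrees: float) -> int:
    """Convert a hue delta in degrees to 8-bit OpenCV hue units (degrees / 2)."""
    return round(delta_degrees / 2.0)


def shift_hue_u8(h: int, delta_degrees: float) -> int:
    """Shift one 8-bit OpenCV hue value (0..179) by a delta given in degrees."""
    return (h + degrees_to_opencv_hue(delta_degrees)) % 180


# A shift of -18 degrees corresponds to -9 in 8-bit OpenCV hue units.
print(degrees_to_opencv_hue(-18.0))
print(shift_hue_u8(5, -18.0))
```

Under that assumption, subtracting 18 from the 8-bit OpenCV hue channel corresponds to a shift of 36 degrees, which would add to any difference caused by the approximation.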

mzient commented 1 year ago

Yes, the approximation is exactly the reason why the results are different. Another reason is that the hue, saturation and brightness arguments are kept compatible with the color_twist operator, which implements a linear transformation of color. There are no readily available operations in DALI that would perform an exact RGB <-> HSV conversion. They could be implemented with elementary math operations, although implementing HSV fully in terms of tensor math isn't straightforward (there are a lot of ifs in there). In any case, if this is meant as a data augmentation, I don't think this rather small difference would matter for your model's performance, unless you train it to do something specifically related to the HSV color model.
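To illustrate why a linear transform only approximates an exact hue shift, here is a minimal sketch in plain Python. It rotates RGB vectors about the gray axis (1, 1, 1), which is one common way to express a hue change as a 3x3 matrix; this is an assumption made for illustration and not necessarily the matrix DALI uses internally. The exact reference uses the standard library colorsys module, and all function names are hypothetical.

```python
import colorsys
import math


def hue_rotation_matrix(degrees):
    """3x3 matrix that rotates RGB vectors about the gray axis (1, 1, 1)."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    a = c + (1.0 - c) / 3.0                   # diagonal entries
    b = (1.0 - c) / 3.0 - s / math.sqrt(3.0)  # off-diagonal entries
    d = (1.0 - c) / 3.0 + s / math.sqrt(3.0)
    return [[a, b, d],
            [d, a, b],
            [b, d, a]]


def apply_matrix(m, rgb):
    """Multiply a 3x3 matrix by one RGB pixel."""
    return tuple(sum(m[i][j] * rgb[j] for j in range(3)) for i in range(3))


def exact_hue_shift(rgb, degrees):
    """Exact shift: convert to HSV, move the hue, convert back."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb((h + degrees / 360.0) % 1.0, s, v)


pixel = (0.8, 0.2, 0.2)  # channels in [0, 1]
linear = apply_matrix(hue_rotation_matrix(-18.0), pixel)
exact = exact_hue_shift(pixel, -18.0)
print("linear:", linear)
print("exact: ", exact)
```

Both versions leave gray pixels untouched and agree for shifts that are multiples of 120 degrees, but for other angles the linear version does not preserve saturation and value exactly, so the per-channel results drift apart.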

Thanks, Michal

pietroorlandi commented 1 year ago

Okay, so from what I understand you can't even use OpenCV methods within the pipeline, is that right? For example, we can't call cv2.cvtColor(augmented, cv2.COLOR_BGR2HSV) inside the DALI pipeline, since augmented is a DataNode. So the only alternative might be to define a custom HSV operation (however, since it is full of if statements, it could not be executed on the GPU).

JanuszL commented 1 year ago

Hi @pietroorlandi,

You can follow this guide to see how to use external libraries inside DALI operators. You can start with the Python operator (slowest, but easiest to prototype), move on to a Numba-compiled CPU function, and go up to a full custom DALI operator that calls any native library or CUDA kernel directly. The if statements don't preclude the use of the GPU and a custom kernel. We mentioned them only because they would make an implementation built from elementary tensor math operations very inefficient, but writing them as per-pixel operations in a CUDA kernel should be OK, just not as fast as a linear transform.
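The per-pixel logic discussed above can be sketched as follows. This is plain Python for a single pixel with channels in [0, 1] and hue in degrees, written with explicit branches to show what a custom kernel would have to do for every pixel. The function names are hypothetical, and the wrapping into a DALI operator is deliberately not shown.

```python
def rgb_to_hsv_pixel(r, g, b):
    """Exact RGB -> HSV for one pixel; returns hue in degrees, s and v in [0, 1]."""
    v = max(r, g, b)
    delta = v - min(r, g, b)
    s = 0.0 if v == 0.0 else delta / v
    if delta == 0.0:          # gray pixel: hue is undefined, use 0
        h = 0.0
    elif v == r:
        h = 60.0 * (((g - b) / delta) % 6.0)
    elif v == g:
        h = 60.0 * ((b - r) / delta + 2.0)
    else:
        h = 60.0 * ((r - g) / delta + 4.0)
    return h, s, v


def hsv_to_rgb_pixel(h, s, v):
    """Exact HSV -> RGB for one pixel; hue in degrees."""
    c = v * s                                       # chroma
    x = c * (1.0 - abs((h / 60.0) % 2.0 - 1.0))
    m = v - c
    sector = int(h // 60.0) % 6                     # which 60-degree sector
    if sector == 0:
        rgb = (c, x, 0.0)
    elif sector == 1:
        rgb = (x, c, 0.0)
    elif sector == 2:
        rgb = (0.0, c, x)
    elif sector == 3:
        rgb = (0.0, x, c)
    elif sector == 4:
        rgb = (x, 0.0, c)
    else:
        rgb = (c, 0.0, x)
    return tuple(ch + m for ch in rgb)


def shift_hue_pixel(rgb, delta_degrees):
    """Shift the hue of one RGB pixel by delta_degrees, exactly."""
    h, s, v = rgb_to_hsv_pixel(*rgb)
    return hsv_to_rgb_pixel((h + delta_degrees) % 360.0, s, v)


print(shift_hue_pixel((0.8, 0.2, 0.2), -18.0))
```

A function like this could first be prototyped on whole images through the Python operator (for instance fn.python_function) and later ported to a Numba function or a CUDA kernel, where each thread would run the same branches for its own pixel.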