NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

Questions about image decoding #4390

Closed Renzzauw closed 2 years ago

Renzzauw commented 2 years ago

Hi!

I'm using DALI to speed up our image and video data loading and to accelerate our processing on the GPU. I'm specifically looking into speeding up some image loading steps, so I have a couple of questions that I hope can be answered here, to see whether DALI is suitable for my application.

Question 1: When I look at the examples of the fn.decoders.image() image decoder in the docs, the examples appear to first use the fn.readers.file() operation to obtain the file paths and labels. In my use case, we simply want to use DALI for decoding images, so my images do not have any labels. Is there a specific way I can use this operator or pass the file names directly to this decoder (similar to fn.readers.video() where you can pass the filenames argument)?

Question 2: I'm looking for a way to quickly load/decode image assets to the GPU. These will not necessarily be large directories of images; I would like to occasionally load a single image or a few images to the GPU. Will using DALI be 'overkill' here or have too much overhead, or would it still make sense to use DALI for relatively small decoding tasks? If not, what would be an alternative for such tasks?

Question 3: In the documentation I read that decoding PNGs using fn.decoders.image() will not be accelerated by the GPU. Would it still make sense to use DALI for decoding sequences of PNGs on the CPU and then transferring them to the GPU, or would it be wiser to look for alternative techniques? I found some CUDA-accelerated solutions online (e.g. nvjpeg, nvjpeg2000, nvtiff, etc.), but I was not able to find anything that accelerates decoding of PNGs.

Many thanks in advance for answering these questions!

JanuszL commented 2 years ago

Hi @Renzzauw,

Let me answer your questions one by one:

1) You can still use the file reader operator and pass the file names to its files argument. If you don't need the labels, discard them:

 jpegs, _ = fn.readers.file(files=["aaa", "bbb"])

or use the external source operator.
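
As a rough sketch of the file reader suggestion, the reader can be wrapped in a small pipeline that decodes an explicit list of unlabeled images. The helper name `build_decode_pipeline`, its default values, and the choice of `device="mixed"` for the decoder are assumptions made for illustration and are not specified in this thread. DALI is imported lazily so that the argument check works even where DALI is not installed.

```python
def build_decode_pipeline(file_paths, batch_size=1, num_threads=2, device_id=0):
    """Build a DALI pipeline that decodes an explicit list of image files.

    Hypothetical helper: the name and default values are illustrative only.
    """
    file_paths = list(file_paths)
    if not file_paths:
        raise ValueError("file_paths must contain at least one image path")

    # Imported here so the check above does not require DALI to be installed.
    from nvidia.dali import fn, pipeline_def

    @pipeline_def(batch_size=batch_size, num_threads=num_threads, device_id=device_id)
    def decode_pipeline():
        # readers.file returns (encoded bytes, labels); the labels are discarded.
        encoded, _ = fn.readers.file(files=file_paths)
        # Assumption: "mixed" takes CPU input and outputs decoded images on the GPU.
        return fn.decoders.image(encoded, device="mixed")

    pipe = decode_pipeline()
    pipe.build()
    return pipe
```

Assuming a CUDA-capable GPU and a working DALI installation, calling `build_decode_pipeline(["a.jpg", "b.jpg"], batch_size=2).run()` should return a sequence of outputs whose first element is the decoded batch; the file names here are placeholders.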

2) You can still try out DALI.

3) I'm sorry, but we don't have any library that would accelerate PNG decoding on the GPU. For now, DALI's approach should be as fast as OpenCV. If you want to process the data further on the GPU, DALI is the right approach. If it is only about decoding, DALI may not be the tool you are looking for.
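
Because PNG decoding is not GPU-accelerated while JPEG decoding can be, it may help to know up front which encoded files fall into which group. The sketch below checks the leading magic bytes of an encoded buffer. `sniff_image_format` and `split_by_format` are hypothetical helpers, not part of DALI, and they only recognize PNG and JPEG.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
JPEG_PREFIX = b"\xff\xd8\xff"         # JPEG start-of-image marker plus next marker byte


def sniff_image_format(encoded: bytes) -> str:
    """Return "png", "jpeg", or "unknown" based on the leading bytes."""
    if encoded.startswith(PNG_SIGNATURE):
        return "png"
    if encoded.startswith(JPEG_PREFIX):
        return "jpeg"
    return "unknown"


def split_by_format(buffers):
    """Group encoded buffers by detected format, preserving input order."""
    groups = {"png": [], "jpeg": [], "unknown": []}
    for buf in buffers:
        groups[sniff_image_format(buf)].append(buf)
    return groups
```

A split like this only tells you how much of a workload could benefit from the GPU decoder; it does not decode anything itself.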

On top of that, you can try out the experimental eager mode (it is not fully finished, so it is not documented):

import numpy as np
from nvidia.dali import tensors
from nvidia.dali.experimental import eager

# Read the still-encoded file as a flat uint8 buffer; DALI does the decoding.
encoded = np.fromfile("test_image.jpeg", dtype=np.uint8)
sample = tensors.TensorCPU(encoded)
# A batch of one sample in this example, but you can add more to the list.
batch = tensors.TensorListCPU([sample])
# Decode the batch with the GPU backend and print the result.
print(eager.decoders.image(batch, device="gpu"))

Renzzauw commented 2 years ago

Hi @JanuszL,

Thanks for providing these extensive answers; they are very helpful!