NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

Support for a neural compression decoder #3228

Open lbhm opened 3 years ago

lbhm commented 3 years ago

I recently started reading up on neural/learned compression and implemented a simple CNN training pipeline using image data that I had previously encoded/compressed with a neural compression model. Neural compression can achieve better compression ratios than "classic" codecs and can therefore be interesting for DNN training when, for example, main memory is limited and reading data from disk during training is too slow.

In my small test, I extended the PyTorch Dataset class and loaded a pretrained model from CompressAI into my Dataset to return decoded images to the data loader. Unfortunately, in my pipeline the decoding times of the neural codecs I used were quite poor compared to JPEG and the like.

Therefore, I am wondering how neural decompression in DNN training pipelines can be made faster. One of my thoughts was whether it would be possible to add support for a neural compression decoder op to DALI. From a user-facing perspective, I would envision an operator like nvidia.dali.fn.decoders.neural_codec() that takes a path to a trained decoder model checkpoint, maybe some type hints, and other common decoder arguments. The op would then instantiate the model and convert data passed from a reader op into tensors for downstream preprocessing.
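To make the idea concrete, here is a sketch of how I imagine using such an operator. Note that `fn.decoders.neural_codec` and its arguments are entirely hypothetical (they do not exist in DALI today), so this is not runnable code, just the envisioned API shape:

```python
from nvidia.dali import pipeline_def, fn, types

@pipeline_def(batch_size=64, num_threads=4, device_id=0)
def training_pipe(data_dir, checkpoint):
    # Read the neurally compressed bitstreams from disk.
    encoded, labels = fn.readers.file(file_root=data_dir)
    # Envisioned operator: load the trained decoder from a checkpoint
    # and decode each bitstream into an image tensor on the GPU.
    images = fn.decoders.neural_codec(
        encoded,
        checkpoint_path=checkpoint,  # hypothetical argument
        output_dtype=types.UINT8,    # hypothetical argument
        device="gpu",
    )
    return images, labels
```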

I'd be curious to hear how feasible you think it would be to integrate a neural decoder into DALI. From a high-level perspective, I would consider it very helpful for anyone without unlimited main memory if more efficient neural compression methods were available as a drop-in replacement for classic codecs. Potential issues that I see based on my (very limited) knowledge of DALI internals are:

Looking forward to your opinion!

JanuszL commented 3 years ago

Hi @lbhm,

This sounds like a very interesting problem. The first thing that comes to my mind is a POC from 2019, described in the GTC19 talk "Integration of DALI with TensorRT on Xavier" (slides, recording), where a custom operator was created for DALI that runs inference with TRT under the hood. You can learn more about custom operators here.

TRT sounds like the most deep-learning-framework-agnostic approach while still being GPU accelerated; however, you need to export the model to a format it understands. If you want to use a specific FW, this may be much harder, and a Python operator would probably be the most feasible option to implement.

To summarize, I don't think this should be made part of the DALI code base. Still, it is a good idea, and creating a custom, runtime-loaded operator would be the best way to go (as it would not add any additional dependency to DALI).
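As a minimal starting point for the Python-operator route, a per-sample decode callable can be wrapped with DALI's `fn.python_function`. The sketch below uses a dummy decoder as a stand-in for a real model (e.g. a CompressAI model's `decompress()`); the class and function names are illustrative, and the DALI wiring is left in comments since it requires a DALI install and a GPU. Note that `python_function` requires building the pipeline with `exec_async=False, exec_pipelined=False`.

```python
import numpy as np

class DummyNeuralDecoder:
    """Stand-in for a trained neural decoder; a real one would run the
    synthesis network of, e.g., a CompressAI model."""

    def __init__(self, height, width, channels=3):
        self.shape = (height, width, channels)

    def decompress(self, bitstream):
        # Dummy "decode": reinterpret the flat byte buffer as an image.
        return np.asarray(bitstream, dtype=np.uint8).reshape(self.shape)


def make_decode_fn(decoder):
    """Build a per-sample callable suitable for nvidia.dali.fn.python_function."""
    def decode(encoded):
        return decoder.decompress(encoded)
    return decode


# DALI wiring (sketch; python_function needs exec_async=False,
# exec_pipelined=False on the pipeline):
#
# from nvidia.dali import pipeline_def, fn
#
# @pipeline_def(batch_size=32, num_threads=4, device_id=0,
#               exec_async=False, exec_pipelined=False)
# def pipe(data_dir, decoder):
#     encoded, labels = fn.readers.file(file_root=data_dir)
#     images = fn.python_function(encoded, function=make_decode_fn(decoder))
#     return images, labels

if __name__ == "__main__":
    decoder = DummyNeuralDecoder(2, 2, 3)
    decode = make_decode_fn(decoder)
    image = decode(np.arange(12, dtype=np.uint8))
    print(image.shape)  # (2, 2, 3)
```

This keeps the decoder on the Python side, so it will not match the throughput of a native GPU operator, but it lets you drop any framework-specific model into an existing DALI pipeline without touching the DALI code base.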

lbhm commented 3 years ago

Hi @JanuszL,

Sorry for the late response. I have looked at the GTC talk you referenced - integrating TensorRT into a DALI pipeline sounds very interesting! Is there any chance the custom op was open sourced somewhere?

JanuszL commented 3 years ago

Hi @lbhm,

Can you check directly with the speakers, Anurag or Josh (their emails are visible on the first page of the mentioned talk)?

lbhm commented 3 years ago

Ok, will do.