lbhm opened this issue 3 years ago
Hi @lbhm,
This sounds like a very interesting problem. The first thing that comes to mind is a proof of concept from 2019, described in the GTC19 talk "Integration of DALI with TensorRT on Xavier" (slides, recording), where a custom DALI operator was created that ran inference with TensorRT under the hood (you can learn more about it here). TensorRT sounds like the most deep-learning-framework-agnostic approach that is also GPU-accelerated; however, you need to export the model to a format it understands. If you want to use a specific framework instead, this may be much harder, and a Python operator would probably be the most feasible option to implement. To summarize: I don't think this should become part of the DALI code base, but it is a good idea, and creating a custom, runtime-loaded operator would be the best way to go, as it would not add any extra dependency to DALI.
Hi @JanuszL,
Sorry for the late response. I have looked at the GTC talk you referenced - integrating TensorRT into a DALI pipeline sounds very interesting! Is there any chance the custom op was open sourced somewhere?
Hi @lbhm,
Can you check directly with the speakers, Anurag or Josh (their emails are visible on the first page of the mentioned talk)?
Ok, will do.
I recently started reading into neural/learned compression and implemented a simple CNN training pipeline using image data that I previously encoded/compressed with a neural compression model. Neural compression can achieve better compression factors than "classic" codecs and therefore be interesting for DNN training if, for example, main memory is limited and reading data from disk during training is too slow.
In my small test, I extended the PyTorch `Dataset` class and loaded a pretrained model from CompressAI into my `Dataset` to return images to the data loader. Unfortunately, in my pipeline the decoding time of the neural codecs I used was quite poor compared to JPEG and the like. Therefore, I am wondering how neural decompression in DNN training pipelines can be made faster.

One of my thoughts was that it might be possible to add support for a neural-compression decoder op to DALI. From a user-facing perspective, I would envision an operator like `nvidia.dali.fn.decoders.neural_codec()` that takes a path to a trained decoder model checkpoint, maybe some type hints, and other common decoder arguments. The op would then instantiate the model and convert data passed from a reader op into tensors for downstream preprocessing.

I'd be curious to hear how feasible you consider integrating a neural decoder into DALI. From a high-level perspective, I think it would be very helpful for anyone without unlimited main memory if more efficient neural compression methods were available as a drop-in replacement for classic codecs. Potential issues that I see based on my (very limited) knowledge of DALI internals are:
Looking forward to your opinion!