Open moskomule opened 5 years ago
Hi, thanks for opening the issue. I'll have a look at this.
Thank you. Lately I've found that the image preprocessing steps are the bottleneck. I'll try DALI myself and report how much it speeds up processing.
albumentations is also a contender for faster image augmentation.
In my experience, IO is actually a worse bottleneck than a "slow" pre-processing library. SSDs and NVMes(!) help a lot.
Hi @datumbox it's been a while since this PR had any discussions, I'm curious if there are any plans to make this happen?
@msaroufim we are currently working to improve the data loading process using PyTorch Data. We do not have immediate plans for integrating DALI directly at the moment, but we can review this in the future. As we have very limited resources, I think it's more realistic that such an investigation can happen after the release of the new Datasets API.
ccing @NicolasHug and @pmeier who lead the work on datasets.
Oh interesting so the way you'd integrate new backends in the future is to integrate them within `torch.data`? Also where can I learn more about the new Datasets API?
cc @VitalyFedyunin @ejguan @wenleix
> Oh interesting so the way you'd integrate new backends in the future is to integrate them within `torch.data`?
Not sure what you mean by "backends" here. In general you are right though. `torchdata` is the way to go for the new datasets.
> Also where can I learn more about the new Datasets API?
There is no public document yet. However, we already have quite a large collection of datasets ported to the new structure. You can access them with `torchvision.prototype.datasets.load(name)`, where `name` is the name of the dataset you want to load. For example:

```python
from torchvision.prototype import datasets

dataset = datasets.load("voc")
```
The `dataset` object is a regular `IterDataPipe` defined by `torchdata`. To transform it you can use the `.map` method. It takes a callable that will be executed for each sample in the dataset. This sample will be a dictionary with `str` keys. For example, a simple data pipeline could look like this:

```python
from torchvision.prototype import transforms

transform = transforms.Compose(
    transforms.DecodeImage(),
    transforms.Resize(256),
    transforms.CenterCrop(256),
)

for sample in dataset.map(transform):
    ...
```
For everything else, please also have a look at the `torchdata` documentation.
Adding to @pmeier's comment, this tutorial might help you.
@pmeier to clarify, by backend I mean one of these: https://github.com/pytorch/vision#image-backend - i.e. pillow, accimage, pillow-simd, etc.
Overall the new interface for adding datasets looks good, but I'm more curious about adding new backends like DALI. In particular, DALI has accelerated image processing kernels and accelerated image decoding, which I think would be very useful to integrate into vision directly. It feels too domain-specific to be in `torch.data` IMHO, and it is similar enough to other backends like accimage to live in vision. What's the process like for adding a new backend? If it's similar to the one for accimage (https://github.com/pytorch/vision/blob/main/torchvision/transforms/functional.py#L13), I can make a PR for this.
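For context, the accimage integration linked above works by tracking a module-level backend name and dispatching on it at load time. A new backend like DALI could plausibly follow the same pattern. Here is a minimal, self-contained sketch of that registry idea; the function names mirror torchvision's real `set_image_backend`/`get_image_backend`, but this standalone version (and the `"dali"` option) is illustrative only, not torchvision's actual implementation:

```python
# Sketch of torchvision-style image-backend switching (illustrative only).

_image_backend = "PIL"  # default backend name
_KNOWN_BACKENDS = {"PIL", "accimage", "dali"}  # "dali" is hypothetical here


def set_image_backend(backend: str) -> None:
    """Select which library is used for image loading/decoding."""
    global _image_backend
    if backend not in _KNOWN_BACKENDS:
        raise ValueError(
            f"Invalid backend {backend!r}. Options are {sorted(_KNOWN_BACKENDS)}"
        )
    _image_backend = backend


def get_image_backend() -> str:
    """Return the name of the currently selected backend."""
    return _image_backend


def load_image(path: str):
    # Dispatch on the selected backend, importing lazily (as torchvision does)
    # so that optional backends are only required when actually selected.
    if get_image_backend() == "accimage":
        import accimage
        return accimage.Image(path)
    elif get_image_backend() == "dali":
        raise NotImplementedError("a DALI-backed decode would go here")
    else:
        from PIL import Image
        return Image.open(path)
```

The lazy imports inside `load_image` are the key design choice: adding a backend then only requires extending the dispatch, without making the new library a hard dependency.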
The other option is to integrate the DALI data loader as a data pipe in torch.data
Here's a good primer on DALI and its value proposition https://cceyda.github.io/blog/dali/cv/image_processing/2020/11/10/nvidia_dali.html
@VitalyFedyunin @wenleix please chime in on where you think the most natural place for a DALI integration is
> The other option is to integrate the DALI data loader as a data pipe in `torch.data`
Thanks @msaroufim, I had the same feeling about making it a separate DataPipe, because it requires different behavior compared with `datapipe.map`, like making sure this DataPipe only runs in a single process to prevent the CUDA context from being copied around. It definitely needs a deeper look at DALI itself.
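The single-process constraint mentioned above could be enforced as a guard inside the datapipe itself. A rough sketch of the idea in plain Python; a real version would subclass `torchdata`'s `IterDataPipe` and wrap an actual DALI pipeline, and the `ExternalLoaderDataPipe` name and PID-based check here are hypothetical:

```python
import os


class ExternalLoaderDataPipe:
    """Hypothetical wrapper around an external (e.g. DALI-style) iterator.

    Illustrates refusing to iterate from a forked worker process, since a
    CUDA context copied into a child process is invalid there.
    """

    def __init__(self, make_iterator):
        # Store a factory rather than a live iterator so no GPU/CUDA state
        # is created until iteration actually starts.
        self._make_iterator = make_iterator
        self._owner_pid = os.getpid()

    def __iter__(self):
        if os.getpid() != self._owner_pid:
            raise RuntimeError(
                "This datapipe must run in the process that created it; "
                "it cannot be replicated across DataLoader workers."
            )
        return iter(self._make_iterator())
```

Usage would look like `list(ExternalLoaderDataPipe(lambda: my_loader()))` in the main process; iterating from a forked worker raises instead of silently corrupting the CUDA context.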
Seems like there's a good workaround too https://github.com/NVIDIA/DALI/issues/3081#issuecomment-866239816 - I'll take a more thorough look
@msaroufim

> to clarify by backend I mean one of these https://github.com/pytorch/vision#image-backend - i.e. pillow, accimage, pillow-simd, etc.
The new datasets will return a `features.EncodedImage`, which is a 1D `uint8` tensor just storing the raw bytes. You can decode it however you want. Right now, `transforms.DecodeImage()` uses PIL as backend, but you can use arbitrary backends there.
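Since `EncodedImage` is just raw encoded bytes, swapping in a different decoder amounts to routing those bytes to the backend of your choice. A hedged sketch of that dispatch; `decode_with` is not a torchvision API, and the `"dali"` branch is a placeholder for an accelerated decoder:

```python
import io


def decode_with(raw: bytes, backend: str = "PIL"):
    """Illustrative dispatch: hand raw encoded image bytes to a chosen decoder.

    Only the PIL branch is fleshed out; "dali" stands in for an accelerated
    decoder such as DALI's GPU JPEG decoding.
    """
    if backend == "PIL":
        from PIL import Image  # imported lazily, only when this backend is used
        return Image.open(io.BytesIO(raw))
    if backend == "dali":
        raise NotImplementedError("an accelerated decode would go here")
    raise ValueError(f"unknown backend {backend!r}")
```

A transform built on this could then be dropped into the `.map` pipeline shown earlier, decoding each sample's bytes with whichever backend was selected.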
A similar issue is open on the torchdata repo: https://github.com/pytorch/data/issues/761. Might be good to keep an eye on this :)
Hi, any plan to integrate DALI (https://docs.nvidia.com/deeplearning/sdk/dali-developer-guide/docs/index.html) into `torchvision` for faster preprocessing? I found `chainer` tries to integrate it (https://github.com/chainer/chainer/pull/5067).