NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

Add more choices for data augmentation #1610

Closed: kaleidoscopical closed this issue 1 year ago

kaleidoscopical commented 4 years ago

Hi! Thanks for providing such an awesome codebase.

I wonder whether it is on the roadmap to include the augmentations introduced in the paper "AutoAugment: Learning Augmentation Policies from Data" in the augmentation gallery.

These augmentations include equalizing the image histogram, inverting the pixels of the image, adjusting the sharpness of the image, and so on.

Since these augmentations have been widely used in recent SOTA papers, it would be great if all of them could be supported.

jantonguirao commented 4 years ago

Hi @kaleidoscopical. That is a very good request!

At this moment we don't have a schedule to provide an implementation for those augmentations. Could you please provide a list of the augmentations that are not currently supported by DALI and are used in state-of-the-art papers? That would help us give this request a higher priority.

JanuszL commented 4 years ago

I think inverting the pixels is possible now using the arithmetic operators - just:

```python
def define_graph(self):
    (...)
    # uint8 pixel values span 0-255, so invert by subtracting from 255
    image = 255 - image
    return image
```

bonlime commented 4 years ago

@kaleidoscopical if you look closely at the paper, the main improvement comes from the simple augmentations, not the extravagant ones like Histogram Equalization. So I would question whether it's worth the time spent implementing it.

But I would suggest adding a Shear augmentation. It is possible to implement it now using an affine transform, but it's messy.

twmht commented 2 years ago

Any update on this? It would be great if DALI could support AutoAugment and RandAugment.

jantonguirao commented 2 years ago

@bonlime

> But I would suggest adding a Shear augmentation. It is possible to implement it now using an affine transform, but it's messy.

We do have https://docs.nvidia.com/deeplearning/dali/user-guide/docs/operations/nvidia.dali.fn.transforms.shear.html, which produces an affine matrix that can be used with fn.warp_affine. This should be easy to use.
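
For illustration, a minimal sketch combining the two operators (the `file_root` path and the shear angle are placeholders): `fn.transforms.shear` produces a 2x3 affine matrix that is consumed by `fn.warp_affine`.

```python
from nvidia.dali import pipeline_def, fn

@pipeline_def
def shear_pipeline():
    jpegs, labels = fn.readers.file(file_root="/path/to/images")  # placeholder path
    images = fn.decoders.image(jpegs)
    # Shear by 30 degrees along the horizontal axis.
    mt = fn.transforms.shear(angles=[30.0, 0.0])
    # The matrix maps source to destination, hence inverse_map=False.
    images = fn.warp_affine(images, mt, fill_value=0, inverse_map=False)
    return images, labels
```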

jantonguirao commented 2 years ago

@twmht We are currently working on some groundwork that will enable AutoAugment-like capabilities in the mid-term. However, we are not there yet.

rjbruin commented 2 years ago

I very much second the request for RandAugment support. It has become a cornerstone of Vision Transformer data augmentations. Support for RandAugment would make DALI a significant competitor to the data augmentations in Timm.

jramapuram commented 2 years ago

+1 for RandAug / TrivialWideAug :)

klecki commented 1 year ago

Hi @kaleidoscopical, @twmht, @rjbruin, @jramapuram, starting with DALI 1.24 we support AutoAugment, RandAugment, and TrivialAugment. DALI 1.25, released just a few days ago, includes further improvements and additional policies.

You can read more in the nvidia.dali.auto_aug module documentation available here: https://docs.nvidia.com/deeplearning/dali/user-guide/docs/auto_aug/auto_aug.html

There is also an example, adapted from the Deep Learning Examples repository, showcasing the usage of AutoAugment with DALI for EfficientNet training: https://docs.nvidia.com/deeplearning/dali/user-guide/docs/examples/use_cases/pytorch/efficientnet/readme.html
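
A minimal sketch of how the module can be used (the path and sizes are placeholders; `enable_conditionals=True` is needed because the policies rely on conditional execution):

```python
from nvidia.dali import pipeline_def, fn
from nvidia.dali.auto_aug import auto_augment

@pipeline_def(enable_conditionals=True)  # required by the auto_aug module
def training_pipeline(data_dir):
    jpegs, labels = fn.readers.file(file_root=data_dir, random_shuffle=True)
    images = fn.decoders.image(jpegs, device="mixed")
    images = fn.resize(images, size=[224, 224])
    # Applies the default ImageNet AutoAugment policy; rand_augment and
    # trivial_augment are invoked the same way.
    images = auto_augment.auto_augment(images)
    return images, labels

pipe = training_pipeline(data_dir="/path/to/images",  # placeholder path
                         batch_size=64, num_threads=4, device_id=0)
pipe.build()
```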

jramapuram commented 1 year ago

Amazing work @klecki 🙏

Quick question: does setting enable_conditionals=True have an impact on the performance of the computation graph? Or should I just interpret this as doing the equivalent of what we had with muxing?

stiepan commented 1 year ago

Hi @jramapuram,

The conditional execution works by splitting the batch into mini-batches. The execution graph diverges into two subgraphs for the if/else branches, and the two mini-batches are formed according to the predicate in the if condition. Then, the differently processed mini-batches are merged back into a single batch. This way we avoid the unnecessary computations that muxing performs, while still benefiting from batched execution, just on the mini-batches.
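
As a toy illustration (the reader path is a placeholder), in a pipeline like the one below each sample takes exactly one branch: the batch is split on the predicate, both branches run as smaller batched subgraphs, and the results are merged back in order.

```python
from nvidia.dali import pipeline_def, fn, types

@pipeline_def(enable_conditionals=True)
def conditional_pipeline():
    jpegs, labels = fn.readers.file(file_root="/path/to/images")  # placeholder path
    images = fn.decoders.image(jpegs)
    # Per-sample boolean predicate deciding which branch a sample takes.
    do_flip = fn.random.coin_flip(probability=0.5, dtype=types.BOOL)
    if do_flip:
        images = fn.flip(images, horizontal=1)
    else:
        images = fn.rotate(images, angle=10.0)
    return images, labels
```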

We try to keep the benefits of batching wherever possible. For example, in the case of AutoAugment with the ImageNet policy for EfficientNet, even though the policy has 25 sub-policies, we can split the computation into roughly 9 branches, because only 8 distinct operators (plus the skip case) appear as the first and second operations across the sub-policies.

jramapuram commented 1 year ago

Thanks for the detailed explanation @stiepan! I'll give the augmentations a bench.

Is it possible to use RandAugment with the previous object-based Pipeline solution? I have quite a bit of legacy code for which I would like to minimize changes. I see this code -- is that the suggested solution? An exposed helper here would be appreciated if possible 🙏

stiepan commented 1 year ago

Hi @jramapuram,

As to the benchmark, absolutely, please give it a go!

Conditional execution, which is the backbone of the automatic augmentations, is supported only in the newer, functional API. The code you linked is a decorator dedicated to defining pipelines with the functional API. Unfortunately, combining it with the legacy definition of a pipeline likely won't work.

A more detailed explanation requires diving into the conditional execution implementation. In terms of the graph and execution, conditionals work by splitting the processing into mini-batches. That is half of the story. The other half is defining the pipeline and the corresponding execution graph in Python. DALI uses a fork of TensorFlow's AutoGraph to track and translate if/else statements into sub-functions, so that it can intercept the branches, translate them into sub-graphs, and add the necessary splitting and merging. This step does not work well with the legacy class-oriented API, and for that reason we do not have a utility that supports conditional execution for legacy pipelines.
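
So the way forward is to move the body of a legacy define_graph into a function decorated with @pipeline_def. A rough sketch of what that migration could look like (the path and the n/m hyper-parameters are placeholders):

```python
from nvidia.dali import pipeline_def, fn
from nvidia.dali.auto_aug import rand_augment

# The decorated function replaces the legacy Pipeline subclass;
# this is what enables conditionals, and thus rand_augment.
@pipeline_def(enable_conditionals=True)
def randaug_pipeline(data_dir):
    jpegs, labels = fn.readers.file(file_root=data_dir)
    images = fn.decoders.image(jpegs, device="mixed")
    images = fn.resize(images, size=[224, 224])
    images = rand_augment.rand_augment(images, n=2, m=9)  # placeholder n/m
    return images, labels
```

Once instantiated (with batch_size, num_threads, and device_id passed at call time), the resulting pipeline object can be built and run like a legacy one.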

You can read more on conditional execution here: https://docs.nvidia.com/deeplearning/dali/user-guide/docs/examples/general/conditionals.html