libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0
2.82k stars · 178 forks

[WIP] Add RandAugment #154

Open ashertrockman opened 2 years ago

ashertrockman commented 2 years ago

Here's a draft PR to make visible our efforts to add a fast implementation of RandAugment [1] to ffcv.

So far, we have committed the following transforms:

Ideally, we would have tests ensuring that our transforms closely match some baseline (I've currently chosen PyTorch's torchvision.transforms.functional as this baseline).
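For concreteness, a minimal sketch of such a test, assuming a hypothetical kernel with a (src, dst, angle) signature (the actual signatures in this PR may differ) and torchvision.transforms.functional as the baseline:

    import numpy as np
    import torch
    import torchvision.transforms.functional as TF

    def test_rotate_matches_torchvision(ffcv_rotate):
        # Random uint8 HWC image, since ffcv transforms operate on numpy arrays.
        src = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
        dst = np.zeros_like(src)
        angle = 15.0

        ffcv_rotate(src, dst, angle)  # hypothetical (src, dst, angle) signature

        # torchvision expects CHW tensors, so convert back and forth.
        ref = TF.rotate(torch.from_numpy(src).permute(2, 0, 1), angle)
        ref = ref.permute(1, 2, 0).numpy()

        # Interpolation details differ between implementations,
        # so compare approximately rather than exactly.
        assert np.abs(dst.astype(int) - ref.astype(int)).mean() < 10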

Now that the transforms have been implemented, there are a few more things to do:

[1] https://arxiv.org/abs/1909.13719

GuillaumeLeclerc commented 2 years ago

I see that you are using numba.njit. I think it would be better to use Compiler.compile; that way we have a central location for disabling compilation, which is quite convenient when debugging issues.
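For reference, the pattern for custom transforms would look roughly like this (a sketch; the loop body is just a placeholder for the actual augmentation logic):

    from ffcv.pipeline.compiler import Compiler

    def generate_code(self):
        parallel_range = Compiler.get_iterator()

        def randaug(images, dst):
            for i in parallel_range(images.shape[0]):
                dst[i] = images[i]  # placeholder for the actual augmentation
            return dst

        randaug.is_parallel = True
        # Compiler.compile applies the numba JIT when compilation is enabled
        # and is a plain pass-through when it is globally disabled.
        return Compiler.compile(randaug)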

GuillaumeLeclerc commented 2 years ago

This is moving very fast :bullettrain_front: I was hoping to release v1.0.0 by the end of the week. Would you like this to be part of the release? If so, could you change the target branch to v1.0.0?

ashertrockman commented 2 years ago

Sure, that sounds good. I'll try to finish a working demo soon.

ashertrockman commented 2 years ago

This still needs some testing, but it looks promising. In a brief experiment on CIFAR-10, this RandAugment implementation added +1% test accuracy at a cost of about 0.1s/epoch. The first epoch takes substantially longer (presumably due to the extra memory allocation), adding about 20s.

tfriedel commented 2 years ago

I installed this and added it to a training pipeline I'm currently using, and got this error:

Exception in thread Thread-12:
ValueError: cannot assign slice from input of different size

The above exception was the direct cause of the following exception:

SystemError: _PyEval_EvalFrameDefault returned a result with an error set

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/thomas/conda/envs/datapurchase3/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/home/thomas/git/ffcv/ffcv/loader/epoch_iterator.py", line 84, in run
    result = self.run_pipeline(b_ix, ixes, slot, events[slot])
  File "/home/thomas/git/ffcv/ffcv/loader/epoch_iterator.py", line 143, in run_pipeline
    results = stage_code(**args)
  File "", line 2, in stage_code_0
SystemError: CPUDispatcher(<function RandAugment.generate_code.<locals>.randaug at 0x7fe5f4fe7670>) returned a result with an error set

I then disabled JIT compilation by setting the environment variable NUMBA_DISABLE_JIT=1, and got this more helpful error:

  File "/home/thomas/conda/envs/datapurchase3/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/home/thomas/git/ffcv/ffcv/loader/epoch_iterator.py", line 84, in run
    result = self.run_pipeline(b_ix, ixes, slot, events[slot])
  File "/home/thomas/git/ffcv/ffcv/loader/epoch_iterator.py", line 143, in run_pipeline
    results = stage_code(**args)
  File "", line 2, in stage_code_0
  File "/home/thomas/git/ffcv/ffcv/transforms/randaugment.py", line 78, in randaug
    translate(src[i], dst[i], 0, int(mag))
  File "/home/thomas/git/ffcv/ffcv/transforms/utils/fast_crop.py", line 171, in translate
    destination[ty:, :] = source[:-ty, :]
ValueError: could not broadcast input array from shape (156,160,3) into shape (28,32,3)

Seems like this is because I didn't call it with size set to the image size; I'll try that now. However, in the ImageNet training example the image size is scaled over the epochs. How would this work then? In any case, it's likely that users of this function will get the size wrong, and the resulting error will not be helpful. I suggest adding a check for the correct size with a meaningful error message.
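For example, the check could look something like this (hypothetical, not part of the PR):

    def check_buffer_shape(src, dst):
        # Fail loudly if the preallocated destination buffer does not match
        # the decoded images, e.g. when the transform's size argument
        # disagrees with the decoder's output size.
        if src.shape != dst.shape:
            raise ValueError(
                f"RandAugment destination buffer has shape {dst.shape}, but "
                f"the decoded images have shape {src.shape}; make sure the "
                f"size passed to the transform matches the pipeline's image size."
            )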

ashertrockman commented 2 years ago

Thanks for pointing this out. I'll look into automatically adapting the image size.

Did it work after fixing the image size in your example?

tfriedel commented 2 years ago

Yes, it did work and added only a negligible slowdown! Good work! However, it didn't improve the accuracy in the example I tried. I'm going to run some more experiments.

ashertrockman commented 2 years ago

Great, thanks! Hopefully your experiments work out.

It looks like the size argument was unnecessary, and it should now work even when changing the image size mid-training.
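For anyone following along, usage should now look roughly like this (a sketch: the constructor arguments and the .beton path are illustrative, so check the transform's docstring for the exact signature):

    from ffcv.fields.decoders import SimpleRGBImageDecoder
    from ffcv.loader import Loader, OrderOption
    from ffcv.transforms import ToTensor
    from ffcv.transforms.randaugment import RandAugment  # from this PR

    loader = Loader('cifar_train.beton',  # illustrative path
                    batch_size=512,
                    num_workers=8,
                    order=OrderOption.RANDOM,
                    pipelines={
                        'image': [SimpleRGBImageDecoder(),
                                  # illustrative arguments
                                  RandAugment(num_ops=2, magnitude=9),
                                  ToTensor()],
                    })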

vchiley commented 2 years ago

This is awesome! When will it be merged into master / a release?

ashertrockman commented 2 years ago

> This is awesome! When will it be merged into master / a release?

Thanks! I'm not sure -- the plan was to merge it with release v1.0.0 (#160), but as far as I know, development on that release has slowed down for the time being.

andrewilyas commented 1 year ago

Hi @ashertrockman! It seems I lost track of this PR a while ago - do you think it's feasible to merge into v1.1.0?

ashertrockman commented 1 year ago

> Hi @ashertrockman! It seems I lost track of this PR a while ago - do you think it's feasible to merge into v1.1.0?

Yeah, I think it should be fine to merge.

Abhinav95 commented 2 months ago

Great work on this @ashertrockman! I have had a successful training run with this fork (RandAugment) with ImageNet-1k val acc > 80% with a ViT-B/16 model.

I think it would be valuable to merge this (and other similar augmentations like ColorJitter, Grayscale, 3Aug, etc.) because these are essential for any ViT runs.

ashertrockman commented 2 months ago

> Great work on this @ashertrockman! I have had a successful training run with this fork (RandAugment) with ImageNet-1k val acc > 80% with a ViT-B/16 model.
>
> I think it would be valuable to merge this (and other similar augmentations like ColorJitter, Grayscale, 3Aug, etc.) because these are essential for any ViT runs.

Glad to hear!

ashertrockman commented 2 months ago

> Great work on this @ashertrockman! I have had a successful training run with this fork (RandAugment) with ImageNet-1k val acc > 80% with a ViT-B/16 model.
>
> I think it would be valuable to merge this (and other similar augmentations like ColorJitter, Grayscale, 3Aug, etc.) because these are essential for any ViT runs.

By the way, if you're training ViTs, allow me to shamelessly promote my research: https://arxiv.org/abs/2305.09828