It should be working. I'll take a look!
PS: Torch augmentations are really slow, especially if you run them before copying the data to the GPU. We recommend using them only for experimentation purposes. If an augmentation is useful, you should consider implementing a NumPy version (and optionally sharing it with the community).
@chengxuz can you post what your full pipeline was? From the output you posted, it looks like you might be trying to apply a torchvision transform on a NumPy array (i.e., before calling ToTensor).
I did this to the ImageNet training pipeline (https://github.com/libffcv/ffcv-imagenet/blob/main/train_imagenet.py#L223):
image_pipeline: List[Operation] = [
    self.decoder,
    RandomHorizontalFlip(),
    torchvision.transforms.ColorJitter(.4, .4, .4),
    ToTensor(),
    ToDevice(ch.device(this_device), non_blocking=True),
    ToTorchImage(),
    NormalizeImage(IMAGENET_MEAN, IMAGENET_STD, np.float16)
]
I was following this doc: https://github.com/libffcv/ffcv/blob/main/docs/making_dataloaders.rst#transforms, which I guess is outdated now?
After moving torchvision.transforms.ColorJitter(.4,.4,.4) to after ToTorchImage, it works! Is this the best place for this transform?
Anyway, thanks for your help! It would be great if you could also update the documentation.
Yeah, all of the torchvision transforms operate on PyTorch tensors, so they have to be put after ToTorchImage---we will add this to the documentation and make the error message a bit more descriptive :) Thanks!
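For reference, the reordered pipeline that ended up working would look roughly like this (untested sketch, reusing the names from the snippet posted above):

image_pipeline: List[Operation] = [
    self.decoder,
    RandomHorizontalFlip(),
    ToTensor(),
    ToDevice(ch.device(this_device), non_blocking=True),
    ToTorchImage(),
    # torchvision transforms only see a proper torch image tensor from here on
    torchvision.transforms.ColorJitter(.4, .4, .4),
    NormalizeImage(IMAGENET_MEAN, IMAGENET_STD, np.float16)
]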
To complement this, I'm also experimenting with a mixed FFCV/torchvision pipeline before rewriting the augmentations as FFCV ones, and I noticed some weird behaviour. My pipeline is defined like this:
image_pipeline = [
    RandomResizedCropRGBImageDecoder((crop_size, crop_size), scale=(min_scale, max_scale)),
    RandomHorizontalFlip(flip_prob=horizontal_flip_prob),
    ToTensor(),
    ToDevice(device, non_blocking=True),
    ToTorchImage(),
    transforms.RandomApply(
        [transforms.ColorJitter(brightness, contrast, saturation, hue)],
        p=color_jitter_prob,
    ),
    NormalizeImage(mean=mean, std=std, type=np.float16),
]
GPU memory oscillates between 10 GB and 5-6 GB. If I comment out the RandomApply operation, it stays stable at around 6 GB. Are there any new allocations being made?
Memory is only pre-allocated for FFCV transforms, so the torchvision transforms there are probably allocating memory at each iteration. Rewriting the torchvision transform as an FFCV one will fix this!
Just another follow-up on this: if the augmentation (like ColorJitter) is applied in this way, all images in the same batch will share the same augmentation parameters, like the randomly determined brightness, contrast, saturation, and hue factors. This would be terrible for typical contrastive learning algorithms, so it seems that rewriting the transform as an FFCV one is indeed needed.
Can you clarify what you mean with an example, @chengxuz? This might not be expected behavior.
For example, if I add the RandomGrayscale augmentation, I would expect the images within the same batch to be turned into grayscale independently with the probability I specified, meaning one image might be grayscale while another is not. This is what happens in typical usage of this module, where it is applied to each PIL image independently and therefore behaves differently for each image even within the same batch. But when the module processes a batch of images, it treats the whole batch as one unit when deciding whether to apply grayscale, so either the entire batch is converted or none of it is. Since FFCV applies the pipeline to a batch of images, this leads to the current behavior.
Haven't dived into FFCV's augmentations, but what NVIDIA DALI does (it also applies the pipeline to the whole batch) is use a multiplexing operation to decide which images to apply augmentations to (https://docs.nvidia.com/deeplearning/dali/user-guide/docs/examples/general/expressions/expr_conditional_and_masking.html).
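The same multiplexing idea can be sketched directly on a batched torch tensor; a hypothetical helper (my own illustration, not part of FFCV, torchvision, or DALI) for a per-image random grayscale could look like:

import torch
import torchvision.transforms.functional as TF

def batched_random_grayscale(x: torch.Tensor, p: float = 0.2) -> torch.Tensor:
    """Per-image random grayscale for a (B, C, H, W) batch."""
    # One coin flip per image instead of one for the whole batch
    to_gray = torch.rand(x.shape[0], device=x.device) < p
    gray = TF.rgb_to_grayscale(x, num_output_channels=3)
    # Multiplex: pick the grayscale version only where the coin came up heads
    return torch.where(to_gray.view(-1, 1, 1, 1), gray, x)

This still pays the cost of grayscaling the whole batch, but it keeps the per-image randomness.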
@chengxuz Do you have a reproduction script? Reading the documentation from torchvision, it doesn't seem like that is what would happen. FFCV passes the batch as-is to the augmentation; whether it flips the coin on a per-image basis or once for the whole batch is unfortunately beyond our control (but from what I understand, that's not what they are doing).
I personally checked and it seems that you are right, @chengxuz. This sounds terribly unintuitive and imo should be reported to torchvision. FFCV's own transforms handle images one at a time, though, so this should never be a problem for them.
@vturrisi FFCV also has per-image randomness in its augmentations (so I think the only augmentations that don't support this are the torchvision ones).
Since it looks like all the FFCV-related problems here are solved, I'll close this issue for now---feel free to re-open if there's anything we missed!
@andrewilyas thanks for the comment above, but how can we rewrite the transforms below from torchvision to FFCV, given that the latter doesn't have counterparts for them (namely RandomApply, ColorJitter, RandomGrayscale, and GaussianBlur)?
And if we mix FFCV with torchvision transforms as below, will that slow FFCV down a lot, per the previous discussion?
image_pipeline_q = [
    img_decoder,
    RandomResizedCrop(),
    RandomHorizontalFlip(),
    ToTensor(),
    ToDevice(device, non_blocking=True),
    ToTorchImage(),
    *custom_img_transforms,
    transforms.RandomApply(
        [transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)],  # not strengthened
        p=0.8,
    ),
    transforms.RandomGrayscale(p=0.2),
    transforms.RandomApply([GaussianBlur([.1, 2.])], p=0.5),
    NormalizeImage(dataset_mean * 255, dataset_std * 255, np.float16),
]
# image_pipeline_k = ... the same as image_pipeline_q
pipelines = {'image_q': image_pipeline_q, 'image_k': image_pipeline_k}
where
import random
import PIL.ImageFilter

class GaussianBlur(object):
    """Gaussian blur augmentation in SimCLR https://arxiv.org/abs/2002.05709"""
    def __init__(self, sigma=[.1, 2.]):
        self.sigma = sigma
    def __call__(self, x):
        # Expects a PIL image; samples a blur radius per call
        sigma = random.uniform(self.sigma[0], self.sigma[1])
        x = x.filter(PIL.ImageFilter.GaussianBlur(radius=sigma))
        return x
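On the RandomGrayscale part of the question: below is a rough, untested sketch of what a per-image random grayscale could look like as a custom FFCV Operation, following the custom-transform pattern from the FFCV docs. The class name and the luminance weights are my own choices, and it assumes uint8 HWC batches, i.e. it would sit before ToTensor in the pipeline rather than where the torchvision version is placed above.

from dataclasses import replace

import numpy as np
from ffcv.pipeline.allocation_query import AllocationQuery
from ffcv.pipeline.compiler import Compiler
from ffcv.pipeline.operation import Operation

class RandomGrayscaleFFCV(Operation):
    """Per-image random grayscale on uint8 (B, H, W, C) batches (sketch)."""

    def __init__(self, p: float = 0.2):
        super().__init__()
        self.p = p

    def generate_code(self):
        my_range = Compiler.get_iterator()
        p = self.p

        def random_grayscale(images, dst):
            # One coin flip per image, so images in the same batch are augmented independently
            coins = np.random.rand(images.shape[0])
            for i in my_range(images.shape[0]):
                if coins[i] < p:
                    gray = (0.299 * images[i, :, :, 0]
                            + 0.587 * images[i, :, :, 1]
                            + 0.114 * images[i, :, :, 2]).astype(np.uint8)
                    for c in range(images.shape[3]):
                        dst[i, :, :, c] = gray
                else:
                    dst[i] = images[i]
            return dst

        random_grayscale.is_parallel = True
        return random_grayscale

    def declare_state_and_memory(self, previous_state):
        # Same output shape/dtype as the input; the destination buffer is allocated
        # once up front, which is what keeps FFCV's memory use flat.
        return (replace(previous_state, jit_mode=True),
                AllocationQuery(previous_state.shape, previous_state.dtype))

ColorJitter and GaussianBlur could be handled the same way (drawing per-image random parameters inside the compiled loop), and the RandomApply wrapper then effectively becomes the per-image coin flip above.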
I tried to add a color jittering augmentation to the ImageNet training pipeline by inserting the line
    torchvision.transforms.ColorJitter(.4,.4,.4)
right after RandomHorizontalFlip, but met this error:
Any idea on what's happening here and how to fix this?