The mean computed at https://github.com/google-research/big_vision/blob/01edb81a4716f93a48be43b3a4af14e29cdb3a7f/big_vision/pp/autoaugment.py#L209-L213 is supposed to be the mean pixel value, but as written it just sums over the histogram (which is therefore equal to height * width) and divides by 256. For the standard decode_jpeg_and_inception_crop(224), I have verified that mean is always 224 * 224 / 256 = 196. I have also created the following calibration grid to double-check the transform's behavior, with RGB values (192, 64, 64) for the reddish squares and (64, 192, 192) for the bluish squares:
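Since the grid image is not reproduced here, the following is a minimal sketch of a comparable grid and of the two ways of computing the mean. The 8x8 checkerboard layout, the 28-pixel tile size, the helper names, and rounding the grayscale value to the nearest integer are all assumptions for illustration, not taken from the repository.

```python
RED, BLUE = (192, 64, 64), (64, 192, 192)
SIZE, TILE = 224, 28  # assumed: 8x8 checkerboard of 28-pixel tiles


def make_grid():
    """Build a SIZE x SIZE checkerboard as rows of (R, G, B) tuples."""
    return [
        [RED if (y // TILE + x // TILE) % 2 == 0 else BLUE for x in range(SIZE)]
        for y in range(SIZE)
    ]


def to_gray(rgb):
    """Convert one RGB pixel to an integer gray level."""
    # ITU-R 601 luma weights; rounding to nearest is an assumption.
    r, g, b = rgb
    return round(0.2989 * r + 0.5870 * g + 0.1140 * b)


def gray_histogram(grid):
    """Count pixels per gray level in 256 bins."""
    hist = [0] * 256
    for row in grid:
        for px in row:
            hist[to_gray(px)] += 1
    return hist


hist = gray_histogram(make_grid())

# Behavior described above: summing the bin counts gives height * width.
buggy_mean = sum(hist) / 256.0
# Intended behavior: intensity-weighted average over all pixels.
true_mean = sum(i * count for i, count in enumerate(hist)) / sum(hist)

print(buggy_mean)  # 196.0, independent of the image content
print(true_mean)   # 128.0 for this particular grid
```

The first value depends only on the image size, which is why it comes out as 196 for every 224 x 224 crop.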
As it is, contrast(tf_color_tile, 1.9) returns the following:
with RGB values (188, 0, 0) and (0, 188, 188). After the fix, contrast(tf_color_tile, 1.9) returns the following:
with RGB values (249, 6, 6) and (6, 249, 249), which is more in line with other implementations. E.g. the approximate torchvision equivalent
from torchvision.transforms.v2 import functional as F
F.adjust_contrast(torch_color_tile, contrast_factor=1.9)
returns RGB values (250, 6, 6) and (6, 250, 250).

I have created https://github.com/google-research/big_vision/pull/108 for demonstration purposes.
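The values reported before and after the fix can be checked by hand. This is a sketch that assumes the contrast op blends each channel value away from the mean as mean + factor * (value - mean), clips to [0, 255], and truncates to an integer. The helper name contrast_pixel is hypothetical, and the post-fix mean of 128 is an assumed value for the calibration grid.

```python
def contrast_pixel(value, mean, factor):
    """Blend a channel value away from `mean` by `factor`, clip, truncate."""
    blended = mean + factor * (value - mean)
    return int(min(max(blended, 0.0), 255.0))


FACTOR = 1.9

# mean = 196 is the size-dependent value; mean = 128 is the assumed true mean.
for mean in (196.0, 128.0):
    high = contrast_pixel(192, mean, FACTOR)
    low = contrast_pixel(64, mean, FACTOR)
    print(f"mean={mean}: 192 -> {high}, 64 -> {low}")

# mean=196.0: 192 -> 188, 64 -> 0
# mean=128.0: 192 -> 249, 64 -> 6
```

With the inflated mean of 196, both grid colors sit below the pivot, so the bright channel is darkened and the dark channel is clipped to 0 instead of the two being pushed apart symmetrically.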