keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Vectorize AdjustSaturation #1378

Open LukeWood opened 1 year ago

LukeWood commented 1 year ago

Based on PR https://github.com/keras-team/keras-cv/pull/1373 we would like to vectorize AdjustSaturation.

One issue with this is that tf.image.adjust_saturation() does not accept a batched saturation factor, so we'll have to roll a pure TF version.

Contributions are encouraged on this issue.

sebastian-sz commented 1 year ago

@LukeWood

so we'll have to roll a pure TF version

Could you clarify? Do you mean this should be a custom C++ op, similar to PairwiseIoUOp?

bhack commented 1 year ago

@LukeWood

so we'll have to roll a pure TF version

Could you clarify? Do you mean this should be a custom C++ op, similar to PairwiseIoUOp?

The issue is that in the original tf.image API, and in the corresponding kernel, there is a constraint that the scale parameter be a scalar:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/image/adjust_saturation_op.cc#L62

So we would be applying the same scale to the whole batch instead of randomizing the scale within the batch.

Also, even though we would not break the tf.image public API, since the public TF Python API could accept both scalars and tensors at the same time, I think nobody has the resources, will, or any other kind of intrinsic/extrinsic motivation to PR a tf.image refactoring of these kernels to handle tensor params (/cc @cantonios).

cantonios commented 1 year ago

does not accept a batched saturation factor

I'm not certain I understand what this means. It is assumed that the last dimension contains the three RGB channels, and the (constant) saturation scale is applied across all pixels and batches.

So we would be applying the same scale to the whole batch instead of randomizing the scale within the batch.

So you want to apply a separate scale factor per image? It should be easy enough to allow the scale factor to be a tensor, as long as it is broadcastable to the size of the input (excluding the channel dimension). This should allow you to specify a scale per batch with a scale of size [nbatch, 1, 1, 1]. It's not high priority for us, but I'd be happy to accept a PR along those lines.
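For illustration, the broadcasting described above could look like this in TF (a minimal sketch with hypothetical shapes, not part of any existing API):

```python
import tensorflow as tf

# Hypothetical batch: 4 images of size 8x8 with 3 channels.
images = tf.random.uniform((4, 8, 8, 3))

# One saturation scale per image, reshaped to [nbatch, 1, 1, 1] so it
# broadcasts over height, width, and channels.
scale = tf.reshape(tf.constant([0.5, 1.0, 1.5, 2.0]), (4, 1, 1, 1))

scaled = images * scale  # broadcasts: each image gets its own scale
```

Here the second image (scale 1.0) is left unchanged, which is the per-image behavior the batched scale factor would enable.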

bhack commented 1 year ago

Thanks @cantonios. This is also true for other ops in the tf.image namespace. I suppose the problem is more related to the "barriers" between projects: when an issue emerges downstream but you need to invest resources to contribute upstream, it is generally hard.

So, as in this case with some tf.image preprocessing ops, it gets solved with workarounds at the downstream level.

bhack commented 1 year ago

P.s. @cantonios Just in case you are interested, we have a similar issue with the MLIR bridge path on tf.ImageProjectiveTransformV3 (https://github.com/tensorflow/tensorflow/pull/55335/files#r832749400). What is not clear is whether, instead of rewriting the kernels and modifying the API, we could achieve a fusion by using the same operator in a loop when compiling with the new bridge.

LukeWood commented 1 year ago

@LukeWood

so we'll have to roll a pure TF version

Could you clarify? Do you mean this should be a custom C++ op, similar to PairwiseIoUOp?

Good question! No, I mean implementing saturation adjustment via multiply/add/subtract/divide operations. There are examples of this logic implemented in NumPy around the internet, so I think this is pretty feasible in TF.
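One such arithmetic-only formulation is the PIL-style grayscale blend (a sketch under that assumption, not an exact reimplementation of tf.image's HSV-based kernel):

```python
import tensorflow as tf

def adjust_saturation_arith(images, scale):
    # images: float RGB in [0, 1], shape (batch, h, w, 3)
    # scale: scalar or tensor broadcastable to (batch, 1, 1, 1)
    # Blend each image with its grayscale version: scale=0 gives grayscale,
    # scale=1 leaves the image unchanged, scale>1 oversaturates.
    luma_weights = tf.constant([0.299, 0.587, 0.114], dtype=images.dtype)
    gray = tf.reduce_sum(images * luma_weights, axis=-1, keepdims=True)
    return tf.clip_by_value(gray + scale * (images - gray), 0.0, 1.0)
```

Because this uses only elementwise multiply/add ops, it accepts a per-image scale tensor for free, unlike the scalar-only C++ kernel.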

LukeWood commented 1 year ago

So you want to apply a separate scale factor per image? It should be easy enough to allow the scale factor to be a tensor, as long as it is broadcastable to the size of the input (excluding the channel dimension). This should allow you to specify a scale per batch with a scale of size [nbatch, 1, 1, 1]. It's not high priority for us, but I'd be happy to accept a PR along those lines.

Precisely.

srikesh-07 commented 1 year ago

Hello @LukeWood, I would like to contribute to this issue. Could you please let me know if the algorithm below aligns with expectations?

Input:
images: Tensor batch of images (batch_size, width, height, 3)
scale_factor: Tensor of size (batch_size, 1, 1, 1)  

Algorithm:
1. Convert input RGB tensor to float.
2. Convert float RGB tensor to HSV.
3. Multiply each channel of the HSV tensor by the scale factor.
4. Clip output to (0-255).
5. Convert HSV to RGB format and return Tensor.

Returns:
Tensor batch of images (batch_size, width, height, 3)
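A rough pure-TF sketch of the steps above (assuming float inputs in [0, 1]; note that to match tf.image.adjust_saturation only the S channel is scaled, and the clip is to [0, 1] rather than 0-255):

```python
import tensorflow as tf

def adjust_saturation_batched(images, scale):
    # images: float RGB in [0, 1], shape (batch, h, w, 3)
    # scale: scalar or tensor broadcastable to (batch, 1, 1, 1)
    hsv = tf.image.rgb_to_hsv(images)             # step 2: RGB -> HSV
    h, s, v = tf.split(hsv, 3, axis=-1)           # each (batch, h, w, 1)
    s = tf.clip_by_value(s * scale, 0.0, 1.0)     # steps 3-4: scale S, clip
    return tf.image.hsv_to_rgb(tf.concat([h, s, v], axis=-1))  # step 5
```

Since tf.image.rgb_to_hsv and tf.image.hsv_to_rgb are fully batched, this version accepts a per-image scale of shape (batch, 1, 1, 1), which the scalar-only adjust_saturation kernel cannot.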
LukeWood commented 1 year ago

That looks correct to me! We will need to numerically test this against the original before merging, but feel free to give it a go!

srikesh-07 commented 1 year ago

Working on it !

srikesh-07 commented 1 year ago

Hello @LukeWood,

I have a couple of questions I would like you to clarify.

  1. I have attached a Colab that demonstrates a numerical test between the existing implementation and the new draft implementation. When the scale value is set between 0.9 and 1.0, the tests pass at an rtol of 1e-05 and an atol of 1e-05, even with larger image sizes. Could you please advise whether we should add further tests covering different cases?

  2. I have another question about the implementation: should a scale factor of shape (batch_size, 1) be passed once at layer initialization and then applied to every batch of images passing through the layer? I have attached a rough draft implementation supporting that interpretation:

    
    class AdjustSaturation(VectorizedBaseImageAugmentationLayer):
        def __init__(self, scale_factor, seed=None, **kwargs):
            super().__init__(**kwargs)
            self.scale_factor = scale_factor
            self.seed = seed

        def get_random_transformation_batch(self, batch_size, **kwargs):
            if self.scale_factor.shape[0] < batch_size:
                raise ValueError(
                    "Input batch of images is greater than the batch size "
                    "of the scale factor"
                )
            elif self.scale_factor.shape[0] > batch_size:  # applies to a remainder batch
                return self.scale_factor[:batch_size]
            else:
                return self.scale_factor

        def augment_images(self, images, transformations, **kwargs):
            pass  # block for the new implementation

Or, alternatively, should each batch of images be passed as input together with a specific `scale_factor` for that batch?
LukeWood commented 1 year ago


Thanks for the colab! I see where the confusion comes from now! scale_factor is a keras_cv.core.FactorSampler. It is called as a function, so your get_random_transformation_batch() is not quite right!
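To illustrate the idea, here is a minimal stand-in mimicking a keras_cv.core.FactorSampler (the class below is hypothetical, written for this sketch; see keras_cv.core for the real samplers): calling the sampler with a shape draws a fresh batch of factors, which is what get_random_transformation_batch is expected to do.

```python
import tensorflow as tf

# Hypothetical minimal sampler for illustration only; the real
# keras_cv.core samplers follow the same call-as-a-function pattern.
class UniformFactorSampler:
    def __init__(self, lower, upper, seed=None):
        self.lower = lower
        self.upper = upper
        self.seed = seed

    def __call__(self, shape=()):
        # Each call draws new random factors of the requested shape.
        return tf.random.uniform(
            shape, minval=self.lower, maxval=self.upper, seed=self.seed
        )

scale_factor = UniformFactorSampler(0.5, 1.5)
# One saturation factor per image, broadcastable against (batch, h, w, 3):
transformations = scale_factor(shape=(8, 1, 1, 1))
```

So rather than storing a fixed (batch_size, 1) tensor at init time, the layer stores the sampler and calls it per batch.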

Does this help?

srikesh-07 commented 1 year ago

Does this help?

As far as I understand, scale_factor is a float tensor of shape (batch_size, 1), produced by keras_cv.core.FactorSampler() based on the input scale_limit, a scalar float value.

If that is right, could you please tell me how this differs in functionality from the existing RandomSaturation?

Sorry if I didn't correctly understand what you were trying to say. Could you please point me in the right direction if I misunderstood? @LukeWood

Thank You !

srikesh-07 commented 1 year ago

Got the point, Working on it...

srikesh-07 commented 1 year ago

Hello, I have implemented a draft of AdjustSaturation, vectorized with VectorizedBaseImageAugmentationLayer. It can be viewed in this Colab Notebook.

  1. I have also created a test comparing against the old implementation, but the test cases are failing even with a low relative tolerance and an absolute tolerance that is inversely proportional to the batch size.
  2. I have commented out the code that returns the tensor as tf.int32. If you want a test workaround, just uncomment those lines. Please also give your suggestions on implementing tf.float32 vs. tf.int32.

Could you please share your suggestions and help me at this point?

Thank you !

@LukeWood @ianstenbit

james77777778 commented 1 year ago

Hi @srikesh-07 @LukeWood

Since we already have a vectorized RandomSaturation layer at https://github.com/keras-team/keras-cv/blob/master/keras_cv/layers/preprocessing/random_saturation.py, I think the purpose of AdjustSaturation should be to adjust saturation without randomness, right?

If that statement is true, we can always sample the same value for augmentation, like this:

from keras_cv.utils import preprocessing as preprocessing_utils

batch_size = 8
factor = 0.9
constant_factor_for_sampler = (factor, factor)  # <-- makes it constant
sampler = preprocessing_utils.parse_factor(
    param=constant_factor_for_sampler, min_value=0.0, max_value=1.0
)
transformations = sampler(shape=(batch_size,))
print(transformations)
# $ tf.Tensor([0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9], shape=(8,), dtype=float32)

This should be helpful for implementing __init__ in the AdjustSaturation layer.

But it sounds weird...

We can simply set (same_float, same_float) on the RandomSaturation layer to get the same behavior as AdjustSaturation. Why do we need another layer that implements almost the same functionality?

github-actions[bot] commented 8 months ago

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.