tensorflow / addons

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons
Apache License 2.0
1.69k stars 612 forks source link

Inconsistent APIs within tfa.image and between tf.image and tfa.image #1825

Closed WindQAQ closed 3 years ago

WindQAQ commented 4 years ago

Describe the feature and the current behavior/state. Currently, some functions in tfa.image that support data_format args like equalize, cutout, and random_cutout, while most of them do not. Moreover, all functions in tf.image support only channel-last images.

Therefore, I wonder if we can drop the support for data_format argument and channel-first image in tfa.image for some reasons:

Relevant information

Which API type would this fall under (layer, metric, optimizer, etc.) image

Who will benefit with this feature? developers who use tfa.image

Any other info.

bhack commented 4 years ago

I prefer to be uniform with tf.image and also with keras_image in the future if it will follow this same "standard" format.

bhack commented 4 years ago

Be consistent with tf.image: Usually, we compose multiple functions in tf.image and tfa.image. So if one of the function does not support channel-first image, there must exist at least one tf.transpose operation to change the data format. In this case, the support of channel-first image seems vague and useless as we have to explicitly change it into channel-last format in order for input pipeline.

I am not sure if we are uniform also about the casting operations.

seanpmorgan commented 4 years ago

Thanks for bringing this up! This much very much needs to be decided upon and standardized.

I also agree we should be consistant with what tf.image supports, but just want to confirm that NHWC is all that tf.image plans to support in the future. Seeing as keras has data_format args, I want to verify that support isn't on the roadmap for tf.image ops given the new Keras preprocessing pipelines.

@tanzhenyu Would you be able to confirm that our understanding is correct or point us to someone who could?

tanzhenyu commented 4 years ago

For conv ops: As of today, NHWC is what CPU only supports. In Cuda, it will mostly convert to NCHW first and call an efficient cudnn, except if it's float16 and > Volta V100, but this is done by default.

For tf.image ops: They're all based on NHWC format, in the kernel level. This is not something we can change at python level at all.

So yeah I think we should be consistent with tf.image

facaiy commented 4 years ago

+1, it sounds great to keep tfa.image simple enough.

WindQAQ commented 4 years ago

Thanks for the comments! I also recommend that tfa.image should be compatible with 3-D image because the common use case is to use .map(lambda x: tfa.image.foo(x)) to process a batch of images.

bhack commented 4 years ago

Shape casting: we do to_4D_image and from_4D_image to make most operations compatible with tf.data.Dataset, while they operate only on channel last images. We can indeed extend those functions for channel first images, but I am also confused if it's worthwhile.

@WindQAQ I don't understand in the current status if we need to do this or not cause in https://github.com/tensorflow/addons/pull/1840 we was going to set the shape bypassing the 4d utils.

WindQAQ commented 4 years ago

Shape casting: we do to_4D_image and from_4D_image to make most operations compatible with tf.data.Dataset, while they operate only on channel last images. We can indeed extend those functions for channel first images, but I am also confused if it's worthwhile.

@WindQAQ I don't understand in the current status if we need to do this or not cause in #1840 we was going to set the shape bypassing the 4d utils.

@bhack I suppose we should at least support 3D image as most of functions in tf.image do that. So we can have another PR to fix it.

bhack commented 4 years ago

@WindQAQ What Is supporting the official Vision Model garden refactoring https://github.com/tensorflow/models/blob/master/official/vision/image_classification/augment.py#L37? Do you think that users could mix that with tfa.image or tf.image?

WindQAQ commented 4 years ago

@WindQAQ What Is supporting the official Vision Model garden refactoring https://github.com/tensorflow/models/blob/master/official/vision/image_classification/augment.py#L37? Do you think that users could mix that with tfa.image or tf.image?

When migrating everything to addons (or others repo), it should stick to the rule of addons IMO. So the implementation should be either refactored or discarded.

facaiy commented 4 years ago

Thanks for the comments! I also recommend that tfa.image should be compatible with 3-D image because the common use case is to use .map(lambda x: tfa.image.foo(x)) to process a batch of images.

In most cases, it might be better to use data.batch(xx).map(xxx) for performance reason :-)

bhack commented 4 years ago

You can check Gsoc progress at https://github.com/tanzhenyu/image_augmentation

WindQAQ commented 4 years ago

Reopen this as the data_format will be dropped in Addons 0.12.