keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Adding AdaptivePool #1880

Open Rocketknight1 opened 1 year ago

Rocketknight1 commented 1 year ago

See keras-team/keras#512 and the main Keras issue for some previous discussion on this topic.

AdaptivePool is a pooling layer in PyTorch that takes input of any size (usually 4D image tensors) and pools down to a fixed output size with non-overlapping pooling windows. It can handle any input shape, even when the output shape does not evenly divide the input shape. It does this by varying the width of the pooling windows.

As a worked example, if you ask AdaptivePool to resize a dimension of 100 down to 40, it will need to tile 40 non-overlapping windows across those 100 positions, with an average width of 100/40 = 2.5. Since 2.5 is not a valid window size, it will instead tile windows of width (2, 3, 2, 3, ...) across the input, and pool each of those with the pooling function (Average or Max).
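A minimal NumPy sketch of the tiling described above (not KerasCV or PyTorch code; the function name and the 1D setup are just for illustration). Window edges are placed at `floor(i * input_size / output_size)`, which reproduces the alternating (2, 3, 2, 3, ...) widths from the worked example:

```python
import numpy as np

def adaptive_avg_pool_1d(x, output_size):
    # Tile non-overlapping windows whose edges are floor(i * n / output_size).
    # When output_size does not evenly divide n, the window widths alternate
    # (e.g. 2, 3, 2, 3, ... for n=100, output_size=40).
    n = x.shape[-1]
    edges = [(i * n) // output_size for i in range(output_size + 1)]
    return np.stack(
        [x[..., edges[i]:edges[i + 1]].mean(axis=-1) for i in range(output_size)],
        axis=-1,
    )

x = np.arange(100, dtype=np.float32)
y = adaptive_avg_pool_1d(x, 40)  # shape (40,)

# Window widths tile as 2, 3, 2, 3, ... and sum back to 100.
widths = np.diff([(i * 100) // 40 for i in range(41)])
```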

Because of the variable widths and the non-overlapping windows, it is impossible to replicate AdaptivePool in other frameworks just by carefully choosing the parameters of a normal pooling layer. There is an implementation in tfa, but it just uses tf.split on the input, which means it fails unless the output size evenly divides the input size.
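For illustration, the same limitation can be reproduced with NumPy's `np.split`, which (like `tf.split` with a scalar count) requires the split count to evenly divide the axis length:

```python
import numpy as np

x = np.arange(100, dtype=np.float32)

# An even split works when the output size divides the input size: 100 / 50 = 2.
chunks = np.split(x, 50)

# With output_size = 40, 100 / 40 = 2.5, so an even split is impossible and
# np.split raises a ValueError instead of producing variable-width windows.
try:
    np.split(x, 40)
    even_split_ok = True
except ValueError:
    even_split_ok = False
```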

I wrote a correct reimplementation of AdaptivePool here. This matches PyTorch's behaviour up to numerical error, although it cannot upsample input (which PyTorch's function can, rather unusually for a pooling function lol). The reason we need to mimic PyTorch here is that if a network is trained with the PyTorch function then we can't use their pretrained weights unless our layers behave in the same way. Several large vision networks (usually ones using a pyramid pooling module) use AdaptivePool2D in their code.

My intuition is that AdaptivePool is primarily useful for these kinds of PyTorch ports, and if you wanted to do the same thing while training a network from scratch in TF you would probably just use tf.image.resize or something, as long as you can differentiate through the resize kernel. Still, I think it would be nice to have in KerasCV to make it possible to port those nets over!

cc @awsaf49 @ianstenbit

jbischof commented 1 year ago

Thanks @Rocketknight1! Is there a specific model you have in mind to port?

awsaf49 commented 1 year ago

@jbischof I can give you an example: GCViT uses AdaptivePooling, https://github.com/NVlabs/GCVit/blob/667c8854f26a0e4bdf39cab8a9dee398a2c4f4ae/models/gc_vit.py#L186

I ported it here, and I am using @mattdangerw's AdaptivePooling gist: https://github.com/awsaf49/gcvit-tf/blob/720adf7e43e875daa880ab5588d16612f0f4dd6a/gcvit/layers/feature.py#L67

Rocketknight1 commented 1 year ago

@jbischof The layer is generally found inside pyramid pooling modules. In :hugs: transformers we have several models that use it in the PyTorch implementation. All of these need a TF port of the layer for their weights to be usable in TensorFlow:

Rocketknight1 commented 1 year ago

We also have lots of other uses in the codebase that only pool down to (1, 1). In those cases, though, we can just port the op with GlobalAveragePooling2D, or plain tf.reduce_mean(), lol. The three models above are the main ones that actually use the full feature set of AdaptivePool and require a complete port of the layer to function.
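For the (1, 1) case, a quick NumPy sketch of the equivalence (assuming NHWC layout, as in TensorFlow): adaptive average pooling to (1, 1) is just a global mean over the spatial axes, so `GlobalAveragePooling2D` or `tf.reduce_mean(x, axis=[1, 2])` gives the same result.

```python
import numpy as np

# Fake NHWC image batch: (batch, height, width, channels).
x = np.random.rand(2, 7, 9, 16).astype(np.float32)

# Adaptive average pooling to (1, 1) reduces each channel's spatial grid to
# its mean, i.e. a global average pool with the spatial dims kept as 1s.
pooled = x.mean(axis=(1, 2), keepdims=True)  # shape (2, 1, 1, 16)
```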

jbischof commented 1 year ago

@Rocketknight1 would HuggingFace be interested in using this functionality if we offered it? For example, could you take KerasCV as a dependency (as is already the case for KerasNLP)?

Knowing if there is a big customer interested will help us prioritize the work.

Rocketknight1 commented 1 year ago

@jbischof Likely yes! Right now we've just had to implement TF AdaptivePool in our own codebase, so if there was a proper reference version in KerasCV we'd probably switch to that.

Also, to be clear, I'm happy to make the PR to KerasCV for this!

jbischof commented 1 year ago

Awesome, we're excited to work with you!

awsaf49 commented 1 year ago

@Rocketknight1 any update on this? I actually need this layer to add an example on keras.io for Keras 3.0.

Rocketknight1 commented 1 year ago

@awsaf49 It totally fell off the radar, unfortunately, because of a pile of other stuff to do in transformers! Right now our TF efforts are focused on Keras 3.0 compatibility preparation, so I'm not sure when I'll get a chance!