Open Rocketknight1 opened 1 year ago
Thanks @Rocketknight1! Is there a specific model you have in mind to port?
@jbischof I can give you an example: GCViT uses AdaptivePooling,
https://github.com/NVlabs/GCVit/blob/667c8854f26a0e4bdf39cab8a9dee398a2c4f4ae/models/gc_vit.py#L186
I ported it here, and I am using @mattdangerw's AdaptivePooling gist:
https://github.com/awsaf49/gcvit-tf/blob/720adf7e43e875daa880ab5588d16612f0f4dd6a/gcvit/layers/feature.py#L67
@jbischof The layer is generally found inside pyramid pooling modules. In :hugs: transformers we have several models that use it in the PyTorch implementation. All of these need a TF port of the layer for their weights to be usable in TensorFlow:

We also have lots of other uses of it in the codebase that just use it to resize down to (1, 1). In those cases, though, we can just port the op with `GlobalAveragePooling2D` or just `tf.reduce_mean()`, lol (sketch below). The three models above are the main ones that actually use the full feature set of `AdaptivePool` and that require a full port of the layer to function.
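For concreteness, here is a minimal sketch of that (1, 1) equivalence; it is just an illustration, not code from the transformers codebase. Adaptive pooling down to (1, 1) is a global mean over the spatial axes, so either of these reproduces it:

```python
import tensorflow as tf

x = tf.random.normal((2, 7, 5, 3))  # (batch, height, width, channels)

# Adaptive average pooling to (1, 1) is just a global mean over H and W,
# so either of these gives the same result:
pooled = tf.keras.layers.GlobalAveragePooling2D(keepdims=True)(x)
reduced = tf.reduce_mean(x, axis=[1, 2], keepdims=True)

print(tf.reduce_max(tf.abs(pooled - reduced)).numpy())  # ~0.0
```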
@Rocketknight1 would HuggingFace be interested in using this functionality if we offered it? For example, could you take KerasCV as a dependency (as is already the case for KerasNLP)?
Knowing whether there is a big customer interested will help us prioritize the work.
@jbischof Likely yes! Right now we've just had to implement TF AdaptivePool in our own codebase, so if there was a proper reference version in KerasCV we'd probably switch to that.
Also, to be clear, I'm happy to make the PR to KerasCV for this!
Awesome, we're excited to work with you!
@Rocketknight1 any update on this? I actually need this layer for adding an example on keras.io in Keras 3.0.
@awsaf49 It totally fell off the radar, unfortunately, because of a pile of other stuff to do in transformers! Right now our TF efforts are focused on preparing for Keras 3.0 compatibility, so I'm not sure when I'll get a chance!
See keras-team/keras#512 and the main Keras issue for some previous discussion on this topic.
`AdaptivePool` is a pooling layer in PyTorch that takes input of any size (usually 4D image tensors) and pools down to a fixed output size with non-overlapping pooling windows. It can handle any input shape, even when the output shape does not evenly divide the input shape. It does this by varying the width of the pooling windows.

As a worked example, if you ask `AdaptivePool` to resize a dimension of 100 down to 40, it will need to tile 40 non-overlapping windows across those 100 positions, with an average width of 100/40 = 2.5. Since 2.5 is not a valid window size, it will instead tile windows of width (2, 3, 2, 3, ...) across the input, and pool each of those with the pooling function (`Average` or `Max`).
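To make the windowing concrete, here is a minimal sketch of one boundary scheme that reproduces the (2, 3, 2, 3, ...) pattern described above, placing the edge of window i at floor(i * input_size / output_size). PyTorch's exact indexing may differ in edge cases, so treat this as illustrative rather than a spec:

```python
import math

def window_edges(input_size, output_size):
    # Edge of window i sits at floor(i * input_size / output_size), so the
    # windows are non-overlapping and their widths average input/output.
    return [math.floor(i * input_size / output_size) for i in range(output_size + 1)]

edges = window_edges(100, 40)
widths = [end - start for start, end in zip(edges, edges[1:])]
print(widths[:8])  # [2, 3, 2, 3, 2, 3, 2, 3] - widths straddle the average of 2.5
```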
Because of the variable width and the non-overlapping windows, it is impossible to replicate `AdaptivePool` in other frameworks by just carefully choosing the parameters for a normal pooling function. There is an implementation in `tfa` (TensorFlow Addons), but it just uses `tf.split` on the input, which means it fails unless the output size evenly divides the input size.
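You can see that limitation directly; this snippet is just my own illustration of the failure mode, not code taken from tfa:

```python
import tensorflow as tf

x = tf.random.normal((1, 100, 100, 3))

tf.split(x, 50, axis=1)  # fine: 100 is evenly divisible by 50
tf.split(x, 40, axis=1)  # raises an error: 100 is not evenly divisible by 40
```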
I wrote a correct reimplementation of `AdaptivePool` here. This matches PyTorch's behaviour up to numerical error, although it cannot upsample input (which PyTorch's function can, rather unusually for a pooling function lol). The reason we need to mimic PyTorch here is that if a network is trained with the PyTorch function, then we can't use its pretrained weights unless our layers behave in the same way. Several large vision networks (usually ones using a pyramid pooling module) use `AdaptivePool2D` in their code.
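For readers who don't want to click through, here is a minimal sketch of the same idea (average pooling only). It is my own illustration rather than the gist's actual code; the `adaptive_average_pool_2d` helper and the matmul-based formulation are assumptions of this sketch, assuming the floor-edge windowing described above:

```python
import math

import numpy as np
import tensorflow as tf

def pooling_matrix(input_size, output_size):
    # (input_size, output_size) matrix whose column i averages the
    # non-overlapping window with edges at floor(i * input / output).
    edges = [math.floor(i * input_size / output_size) for i in range(output_size + 1)]
    matrix = np.zeros((input_size, output_size), dtype=np.float32)
    for i, (start, end) in enumerate(zip(edges, edges[1:])):
        matrix[start:end, i] = 1.0 / (end - start)
    return tf.constant(matrix)

def adaptive_average_pool_2d(inputs, output_size):
    # inputs: (batch, height, width, channels); output_size: (out_h, out_w).
    h_mat = pooling_matrix(inputs.shape[1], output_size[0])
    w_mat = pooling_matrix(inputs.shape[2], output_size[1])
    # Pool the height axis first, then the width axis.
    pooled = tf.einsum("bhwc,ho->bowc", inputs, h_mat)
    return tf.einsum("bowc,wp->bopc", pooled, w_mat)

x = tf.random.normal((2, 100, 100, 3))
print(adaptive_average_pool_2d(x, (40, 40)).shape)  # (2, 40, 40, 3)
```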
My intuition is that `AdaptivePool` is primarily useful for these kinds of PyTorch ports, and if you wanted to do the same thing while training a network from scratch in TF you would probably just use `tf.image.resize` or something, as long as you can differentiate through the resize kernel. Still, I think it would be nice to have in KerasCV to make it possible to port those nets over!

cc @awsaf49 @ianstenbit
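As a tangent on that last point, here is a minimal sketch of the from-scratch alternative: `tf.image.resize` with a kernel that has a registered gradient. Bilinear is my choice for illustration, not something discussed above:

```python
import tensorflow as tf

x = tf.Variable(tf.random.normal((2, 100, 100, 3)))

with tf.GradientTape() as tape:
    # Bilinear resize plays the same "squash to a fixed size" role and has a
    # registered gradient, so it is safe to train through.
    y = tf.image.resize(x, (40, 40), method="bilinear")
    loss = tf.reduce_sum(y)

print(tape.gradient(loss, x).shape)  # (2, 100, 100, 3)
```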