For the SigLIP model (#26522), I'd like to get equivalent results between tf.image.resize and the resize method available in Transformers.
Here's what I tried:
from PIL import Image
import requests
import tensorflow as tf
import numpy as np
from transformers.image_transforms import resize as hf_resize


def resize(image, size, method="bilinear", antialias=False):
    """Resizes image to a given size."""
    # Note: use TF-2 version of tf.image.resize as the version in TF-1 is
    # buggy: https://github.com/tensorflow/tensorflow/issues/6720.
    # In particular it was not equivariant with rotation and led the network
    # to learn a shortcut in the self-supervised rotation task, if rotation was
    # applied after resize.
    dtype = image.dtype
    tf_dtype = tf.type_spec_from_value(image).dtype
    image = tf.image.resize(image, size, method=method, antialias=antialias)
    return tf.cast(tf.clip_by_value(image, tf_dtype.min, tf_dtype.max), dtype)


# load image
url = 'https://cdn.openai.com/multimodal-neurons/assets/apple/apple-ipod.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# get original pixel values (TF implementation)
original_pixel_values = resize(np.array(image), size=(224, 224))

# get our pixel values (Transformers implementation; aliased so it does not shadow the TF helper)
pixel_values = hf_resize(np.array(image), size=(224, 224), resample=Image.Resampling.BILINEAR)

# verify results
np.testing.assert_array_equal(original_pixel_values, pixel_values)
This currently fails with:
AssertionError:
Arrays are not equal
Mismatched elements: 87370 / 150528 (58%)
Max absolute difference: 255
Max relative difference: 255.
x: array([[[127, 101, 59],
[136, 112, 72],
[129, 109, 72],...
y: array([[[131, 105, 63],
[138, 114, 74],
[126, 108, 70],...
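To help quantify the mismatch, here is a minimal sketch of a comparison helper. `summarize_mismatch` is a hypothetical name, not part of Transformers or TensorFlow. It casts both arrays to a signed integer type before subtracting, on the assumption that both hold `uint8` pixel values: subtracting `uint8` arrays directly wraps around, so the reported maximum difference of 255 may overstate the real per-pixel gap.

```python
import numpy as np


def summarize_mismatch(expected, actual):
    """Return mismatch statistics computed in signed integer space.

    Hypothetical helper for illustration; assumes integer pixel arrays
    of the same shape.
    """
    # Cast to int32 first so that subtracting uint8 values cannot wrap around.
    expected = np.asarray(expected).astype(np.int32)
    actual = np.asarray(actual).astype(np.int32)
    abs_diff = np.abs(expected - actual)
    return {
        "mismatched": int(np.count_nonzero(abs_diff)),
        "total": int(abs_diff.size),
        "max_abs_diff": int(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
    }


# First pixel of each array, copied from the error output above.
x = np.array([[[127, 101, 59]]], dtype=np.uint8)
y = np.array([[[131, 105, 63]]], dtype=np.uint8)

stats = summarize_mismatch(x, y)
print(stats)
```

Calling `summarize_mismatch(original_pixel_values, pixel_values)` on the full arrays should show whether the two resize implementations differ by a few intensity levels everywhere or by large amounts in a few places.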
Motivation
It would be great to have equivalent results so that the logits match those of the original implementation.
Your contribution
I provide a notebook here for testing.