huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Getting equivalent results between Transformers' resize and tf.image.resize #27601

Open NielsRogge opened 1 year ago

NielsRogge commented 1 year ago

Feature request

For the SigLIP model (#26522), I'd like to get equivalent results between tf.image.resize and the resize method available in Transformers.

Here's what I tried:

from PIL import Image
import requests

import tensorflow as tf
import numpy as np

def resize(image, size, method="bilinear", antialias=False):
    """Resizes image to a given size."""
    # Note: use TF-2 version of tf.image.resize as the version in TF-1 is
    # buggy: https://github.com/tensorflow/tensorflow/issues/6720.
    # In particular, it was not equivariant with rotation and led the network
    # to learn a shortcut in the self-supervised rotation task if rotation was
    # applied after resize.
    dtype = image.dtype
    tf_dtype = tf.type_spec_from_value(image).dtype
    image = tf.image.resize(image, size, method=method, antialias=antialias)
    return tf.cast(tf.clip_by_value(image, tf_dtype.min, tf_dtype.max), dtype)

# load image
url = 'https://cdn.openai.com/multimodal-neurons/assets/apple/apple-ipod.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# get original pixel values
original_pixel_values = resize(np.array(image), size=(224, 224))

# get our pixel values (imported under an alias so it does not shadow the TF helper above)
from transformers.image_transforms import resize as hf_resize

pixel_values = hf_resize(np.array(image), size=(224, 224), resample=Image.Resampling.BILINEAR)

# verify results
np.testing.assert_array_equal(original_pixel_values, pixel_values)

This currently fails with:

AssertionError: 
Arrays are not equal

Mismatched elements: 87370 / 150528 (58%)
Max absolute difference: 255
Max relative difference: 255.
 x: array([[[127, 101,  59],
        [136, 112,  72],
        [129, 109,  72],...
 y: array([[[131, 105,  63],
        [138, 114,  74],
        [126, 108,  70],...
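
To judge how far apart the two results really are, it may help to compute the difference in a signed dtype. Both arrays are uint8, and unsigned subtraction wraps around modulo 256, so a reported maximum difference of 255 can be an artifact of the arithmetic rather than a real gap. Below is a minimal sketch using only NumPy. The helper name `summarize_uint8_mismatch` is hypothetical, and the toy arrays are the first few pixels printed in the error above, not the full images.

```python
import numpy as np


def summarize_uint8_mismatch(expected, actual):
    """Return (max absolute difference, fraction of mismatched elements)."""
    # Cast to a signed type first: subtracting uint8 arrays directly wraps
    # around modulo 256, so a true difference of 1 can show up as 255.
    expected = np.asarray(expected).astype(np.int16)
    actual = np.asarray(actual).astype(np.int16)
    diff = np.abs(expected - actual)
    return int(diff.max()), float((diff != 0).mean())


# Toy values copied from the first pixels shown in the assertion message.
x = np.array([[127, 101, 59], [136, 112, 72], [129, 109, 72]], dtype=np.uint8)
y = np.array([[131, 105, 63], [138, 114, 74], [126, 108, 70]], dtype=np.uint8)

print(summarize_uint8_mismatch(x, y))  # (4, 1.0)
print(np.abs(x - y).max())             # 254: naive uint8 subtraction wraps
```

On these sample pixels the true gap is at most 4 intensity levels. If the same holds for the full arrays, a tolerance-based check such as `np.testing.assert_allclose` with an `atol` of a few levels would be a more informative test than exact equality.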

Motivation

It would be great to have equivalent results so that the logits match the original implementation.

Your contribution

I provide a notebook here for testing.

NielsRogge commented 1 year ago

cc @amyeroberts