jonasrauber / eagerpy

PyTorch, TensorFlow, JAX and NumPy — all of them natively using the same code
https://eagerpy.jonasrauber.de
MIT License

Can a universal function be compiled in TensorFlow? #36

Open · eserie opened this issue 3 years ago

eserie commented 3 years ago

Let's consider a simple compiled function in tensorflow.

import tensorflow as tf
a = tf.random.normal(shape=(2, 10))
b = tf.random.normal(shape=(10, 3))

@tf.function
def tf_compiled_func(a, b):
    c = tf.matmul(a, b)
    return c

tf_compiled_func(a, b)

This code works.

However, its "universal" version :

@tf.function
def compiled_universal_func(a, b):
    a, b = ep.astensors(a, b)
    c = a.matmul(b)
    return c.raw

a = tf.random.normal(shape=(2, 10))
b = tf.random.normal(shape=(10, 3))
compiled_universal_func(a, b)

does not work and raises the error:

.../lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    975           except Exception as e:  # pylint:disable=broad-except
    976             if hasattr(e, "ag_error_metadata"):
--> 977               raise e.ag_error_metadata.to_exception(e)
    978             else:
    979               raise

AttributeError: in user code:

    <ipython-input-8-6edbe80953ee>:4 compiled_universal_func  *
        c = a.matmul(b)
    .../lib/python3.7/site-packages/eagerpy/tensor/tensorflow.py:499 matmul  *
        if self.ndim != 2 or other.ndim != 2:
    .../lib/python3.7/site-packages/eagerpy/tensor/base.py:115 ndim
        return cast(int, self.raw.ndim)

    AttributeError: 'Tensor' object has no attribute 'ndim'

(but it works if we comment out the @tf.function decorator)

Note that the equivalent code with JAX seems to work:

import jax
from jax import jit
import eagerpy as ep

@jit
def compiled_universal_func(a, b):
    a, b = ep.astensors(a, b)
    c = a.matmul(b)
    return c.raw

seed = 1701
key = jax.random.PRNGKey(seed)
a = jax.random.normal(shape=(2, 10), key=key)
b = jax.random.normal(shape=(10, 3), key=key)
compiled_universal_func(a, b)

Is this a problem with the integration of eagerpy with TensorFlow?

jonasrauber commented 3 years ago

Well, I would say it's mostly a bug in TensorFlow, because it doesn't support ndim in compiled functions:

Calling this fails:

@tf.function
def f(a):
    return a.ndim

Calling this works:

@tf.function
def f(a):
    return len(a.shape)

We run into this problem because our matmul implementation performs an additional dimensionality check using ndim.

jonasrauber commented 3 years ago

@eserie I filed a bug in the TensorFlow repository. Let's see what they think. If you need a temporary workaround, you can comment out the shape checks that use ndim or replace them with calls to shape.
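For illustration, here is a rough sketch of what that workaround could look like in a local copy of eagerpy/tensor/tensorflow.py. Only the ndim check is taken from the traceback above; the rest of the method body (names, error message, wrapping of the result) is an assumption:

# Hypothetical patch sketch: replace the ndim-based check in
# TensorFlowTensor.matmul with a shape-based one. len(tensor.shape)
# works inside tf.function, while .ndim raises AttributeError on
# symbolic Tensors (see the traceback above).
def matmul(self, other):
    if len(self.raw.shape) != 2 or len(other.raw.shape) != 2:
        raise ValueError("matmul requires both tensors to be 2D")
    return type(self)(tf.matmul(self.raw, other.raw))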

eserie commented 3 years ago

Thank you very much for posting the issue in the TensorFlow repository! Do you think it would be worth using the shape-based implementation in eagerpy in order to be compatible with more versions of TensorFlow? We could come back to the canonical ndim implementation later, once it's fixed in TF.

Another remark: if compilation makes sense in eagerpy, we could make it available in a universal way through a compile=True argument in the eager_function proposed in #34. What do you think about that?
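To make the idea concrete, here is a rough, untested sketch of how such an option could be wired up. eager_function and the compile argument are just the proposal from #34, not an existing eagerpy API, and the backend dispatch below is an assumption:

# Hypothetical sketch only: a universal decorator that converts inputs
# with ep.astensors, unwraps the result, and optionally compiles the
# wrapped function with the backend-specific compiler.
import functools

import eagerpy as ep


def eager_function(func=None, *, compile=False):
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args):
            # Assumes f returns a single eagerpy tensor.
            return f(*ep.astensors(*args)).raw

        if not compile:
            return wrapper

        @functools.wraps(f)
        def compiled(*args):
            # Dispatch on the backend of the first argument; for
            # simplicity this recompiles on every call.
            x = ep.astensor(args[0])
            if isinstance(x, ep.TensorFlowTensor):
                import tensorflow as tf
                return tf.function(wrapper)(*args)
            if isinstance(x, ep.JAXTensor):
                import jax
                return jax.jit(wrapper)(*args)
            return wrapper(*args)  # e.g. NumPy / PyTorch: run eagerly

        return compiled

    return decorator(func) if func is not None else decorator

With something like this, the example above could be written as a plain function doing a.matmul(b) and decorated with @eager_function(compile=True).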

jonasrauber commented 3 years ago

I have to say, I haven't really thought enough about compilation and I am not sure it can be abstracted away enough to unify it between TensorFlow, PyTorch, and JAX. I think it could be interesting, but it requires careful testing of all the special cases and limitations.

jonasrauber commented 3 years ago

Thanks to #40 this is resolved, but I'll leave this issue open for now while the TensorFlow project discusses what to do about it.