Closed solidDoWant closed 1 month ago
Happens in this minimal example (no keras):

```python
import tensorflow as tf

x = tf.Variable(3, dtype=tf.int32)
y = tf.multiply(x, 2)
with tf.GradientTape() as g:
    g.watch(x)
dy_dx = g.gradient(y, x)
print(dy_dx)
```

```
WARNING:tensorflow:The dtype of the watched tensor must be floating (e.g. tf.float32), got tf.int32
None
```
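For contrast, the same computation with a float dtype does produce a gradient (a minimal sketch; note a `tf.Variable` with a floating dtype is watched automatically, so `watch` isn't even needed):

```python
import tensorflow as tf

# Same computation, but with a floating dtype: the gradient is defined.
x = tf.Variable(3.0, dtype=tf.float32)
with tf.GradientTape() as g:
    y = tf.multiply(x, 2.0)  # y = 2x, recorded on the tape
dy_dx = g.gradient(y, x)     # dy/dx = 2
print(dy_dx)  # tf.Tensor(2.0, shape=(), dtype=float32)
```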
TensorFlow needs floats to compute a gradient, so it's not a bug; weights have to be floats.
Hi @solidDoWant -
Thanks for raising this issue. Integers and strings are not differentiable, so no gradients are produced for those data types. Gradients are computed using floating-point arithmetic only, and TensorFlow does not automatically cast between types, which is why you are getting this error. More details on gradients with different data types can be found here.
Thanks for looking into this ghsanti and mehtamansi29.
> Actually Integer and strings are not differentiable. So there will be no gradients if using those integer or string data type.
I thought this might come up. Strictly speaking, floats are not differentiable either under the same reasoning: both integers and floats are stored as limited-precision numbers, and are therefore discrete values rather than continuous ones. If integers are not differentiable then floats are not either, and if floats are differentiable then integers are as well. In both cases there is just a large loss of precision for numbers close to zero; it's only a question of what "close" means.
I do get that strings are not differentiable, though; I don't know how you would mathematically represent the derivative of a string.
Even if gradients for integers aren't implemented (I still think this is a bug, but putting that aside), I really think the error message should say something more helpful about where the problem actually is.
> Happens in this minimal example (no keras):
I think this is enough to show that my issue is with TensorFlow rather than Keras. I'll file an issue there unless there is any reason to believe the issue is with Keras.
One last thing: could the docs be updated to mention this limitation? I got stuck on this for quite a while, and it might be nice for future users to be aware of it beforehand.
Thinking about the weights' update formula for a simple linear model using MSE (the standard gradient-descent step):

w ← w − lr · ∂MSE/∂w

We want the lr to be small (i.e. a float) for optimisation to succeed, and the gradient term is a fraction. So we would need to round that to an integer for the weights to be updated correctly (using the same types).
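A toy sketch of that rounding problem (plain Python, not TensorFlow's actual update code; the `lr` and `grad` values are made up for illustration): with an integer weight, a small update truncates to zero and the weight never moves.

```python
lr = 0.01     # learning rate: a small float
grad = 3.0    # some gradient value (made up for illustration)

w_float = 5.0
w_float -= lr * grad       # 5.0 - 0.03: a real, if tiny, update

w_int = 5
w_int -= int(lr * grad)    # int(0.03) == 0: the integer weight never moves

print(w_float, w_int)      # ~4.97 vs. an unchanged 5
```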
On the PyTorch side (just their forum), they say:
> (...) optimizer wants to change one of your network parameters by a small amount
It'd be good if you linked the TF issue once you file it, so we can check what they reply anyway.
Very interesting issue! @solidDoWant
> I thought this came up. Strictly speaking floats are not differentiable either under the same reasoning. Both integers and floats are stored as limited-precision numbers and therefore are discrete values instead of continuous. If integers are not differentiable then floats are not either, and if floats are differentiable then integers are as well. In both cases there is just a large loss of precision for numbers close to zero - it's just a question of what "close" is.
Hi @solidDoWant
Integers have a limited range: the result of a 32-bit integer computation has to fit in 32 bits. Most of the calculations here are over real numbers, which produce fractional quantities. Floating point can represent a much wider range of numbers, at the cost of some rounding error. While both floats and integers are discrete, floats offer significantly higher precision in representing fractional values, which is essential for calculating gradients. GPU and TPU computing also relies on floating-point computation, since floating-point precision is what enables quantization and memory optimization. Here you can find more detail about the mathematics behind floating-point precision.
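The earlier point that floats are "discrete too" can be made concrete: IEEE 754 doubles have a machine epsilon, the smallest representable step next to 1.0, but it is tiny compared to the integer step of 1.

```python
import sys

eps = sys.float_info.epsilon   # ~2.22e-16 for IEEE 754 doubles
print(1.0 + eps > 1.0)         # True: the next representable float above 1.0
print(1.0 + eps / 2 == 1.0)    # True: anything smaller rounds away - floats are discrete too
```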
Busy couple of weeks... just filed an issue with the TensorFlow repo here.
I also discovered that even with a custom gradient function, this still fails on int32, without even calling the function.
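For reference, this is roughly what attaching a custom gradient looks like (a sketch; `double` is a hypothetical op). It works for a float input; per the report above, an `int32` input still yields `None` without the `grad` function ever being invoked:

```python
import tensorflow as tf

@tf.custom_gradient
def double(x):
    # Hypothetical example op: 2*x with a hand-written gradient.
    def grad(upstream):
        return upstream * 2.0  # d(2x)/dx = 2
    return x * 2.0, grad

x = tf.constant(3.0)
with tf.GradientTape() as g:
    g.watch(x)
    y = double(x)
dy_dx = g.gradient(y, x)
print(dy_dx)  # tf.Tensor(2.0, shape=(), dtype=float32)

# With an int32 input the tape still returns None,
# without ever calling grad() (per the report above).
```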
Hi @solidDoWant -
Closing this issue since you created a new issue in the TensorFlow repo. Feel free to reopen the issue if required. Thanks!
When using an integer data type for a trainable variable, training will always throw a "No gradients provided for any variable." `ValueError`. Here is a very simple example to reproduce the issue:

Unfortunately, this error message is vague enough that the root of the issue is unclear. The full error message is:
If it helps, here is what the model looks like:
Full disclosure: I'm not certain if this is a Keras bug or a Tensorflow bug. If this is suspected to be a Tensorflow bug, let me know and I'll file an issue there instead.
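The original repro isn't shown above, but a minimal sketch along these lines (my own code, not the author's) hits the same wall: the tape returns `None` for an `int32` variable, and an optimizer given only `None` gradients then raises the "No gradients provided for any variable." `ValueError`.

```python
import tensorflow as tf

w = tf.Variable(3, dtype=tf.int32)  # integer "weight"

with tf.GradientTape() as tape:
    loss = tf.cast(w, tf.float32) * 2.0  # no gradient flows back into the int variable

grads = tape.gradient(loss, [w])
print(grads)  # [None] -- an optimizer's apply_gradients then raises
              # "No gradients provided for any variable."
```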