Closed: innat closed this issue 1 year ago
Can we use float32 instead of bfloat16 in keras.mixed_precision.set_global_policy("mixed_bfloat16")?
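For context (a minimal sketch, not from the thread): the policy string controls both the compute dtype and the variable dtype, which is why passing "float32" would simply disable mixed precision rather than enable it.

```python
from tensorflow import keras

# "mixed_bfloat16" is a mixed policy: computations run in bfloat16 while
# variables (weights) are kept in float32 for numeric stability.
keras.mixed_precision.set_global_policy("mixed_bfloat16")
policy = keras.mixed_precision.global_policy()
compute_dtype, variable_dtype = policy.compute_dtype, policy.variable_dtype

# Passing "float32" instead is a valid policy, but it turns mixed
# precision off again: both dtypes become float32.
keras.mixed_precision.set_global_policy("float32")
```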
Why would I do that if I want to enable mixed precision on TPU?
You're right. In that case, can we try converting the data before and after passing it to the data augmentation layers?
IMO, that's not a good design; this should be supported by the API.
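A rough sketch of the cast-before/after idea discussed above (an assumption, not a fix confirmed in the thread). Note that under a global mixed policy the layers themselves would still cast inputs down to bfloat16, so for the casts to matter the augmentation layers are also pinned to float32 via their dtype argument here:

```python
import tensorflow as tf
from tensorflow import keras

# Global mixed precision, as in the repro.
keras.mixed_precision.set_global_policy("mixed_bfloat16")

# Assumption: per-layer dtype="float32" makes these layers actually
# compute in float32 despite the global mixed_bfloat16 policy.
flip = keras.layers.RandomFlip("horizontal", dtype="float32")
rotate = keras.layers.RandomRotation(0.1, dtype="float32")

def augment_in_float32(images):
    # Cast up, augment in full precision, then cast back down to the
    # policy's compute dtype (bfloat16) for the rest of the model.
    x = tf.cast(images, tf.float32)
    x = flip(x, training=True)
    x = rotate(x, training=True)
    return tf.cast(x, keras.mixed_precision.global_policy().compute_dtype)

y = augment_in_float32(tf.ones((2, 32, 32, 3), dtype=tf.bfloat16))
```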
Hi @innat ,
Since you mentioned TF version 2.4.1, can you confirm whether the error persists in the latest versions? I tried the code snippet with TF 2.11 and no such error arises. Please refer to the attached gist-2.11v. Can you confirm whether the code snippet you provided is enough to replicate the mentioned behaviour, or do we need some more data?
I tried with TF 2.4.1, but it seems the path is not correct as per gist-2.4.1, and hence it did not exactly replicate the reported behaviour.
However, since version 2.4.1 is not actively supported, and there seems to be no problem with TF 2.11, there is very little chance (almost nil) that a fix will be cherry-picked for 2.4.1. So I request you to please confirm whether the reported issue persists in the latest versions.
@SuryanarayanaY Thanks for the response. I tried to start a TPU on Colab, but each time it shows me No backend with TPU available. However, I could test the code on a Kaggle TPU, but it still provides TF 2.4.1. Here is the gist that I ran on TPU (TF 2.4.1).
Hi @innat ,
Thanks for the confirmation and the repro code. I tested the code with TF 2.4.1 and couldn't find any such error in my gist.
I have also tested the code with TF 2.11 and it executes fine without error. Please refer to the attached gist-tf2.11v.
Is this code snippet enough to replicate the error? If so, then it is probably resolved already, as my Colab gist succeeds in running the code without error. Not sure of the Kaggle environment though. Could you please check and confirm?
Thank you!
@SuryanarayanaY Thanks for checking. In your gist the code is incomplete, which is why you didn't face the reported error. I'm placing the full code here.
from packaging.version import parse
import tensorflow as tf
from tensorflow import keras

resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)
print("All devices: ", tf.config.list_logical_devices('TPU'))

tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")
tf.__version__

if parse(tf.__version__) < parse('2.6.2'):
    from tensorflow.keras.layers.experimental.preprocessing import Resizing
    from tensorflow.keras.layers.experimental.preprocessing import RandomFlip
    from tensorflow.keras.layers.experimental.preprocessing import RandomZoom
    from tensorflow.keras.layers.experimental.preprocessing import RandomRotation
else:
    from tensorflow.keras.layers import Resizing
    from tensorflow.keras.layers import RandomFlip
    from tensorflow.keras.layers import RandomZoom
    from tensorflow.keras.layers import RandomRotation

INP_SIZE = 224

# Preprocessing
data_preprocessing = keras.Sequential(
    [
        Resizing(
            *[INP_SIZE] * 2,
            interpolation="bilinear"
        ),
    ],
    name='PreprocessingLayers'
)

# Augmentation
data_augmentations = keras.Sequential(
    [
        RandomFlip("horizontal"),
        # Doesn't work on TPU with mixed precision (TF 2.4.1)
        # ticket: https://github.com/keras-team/tf-keras/issues/278
        RandomZoom(0.2, fill_mode='reflect'),
        RandomRotation(0.1, fill_mode='reflect'),
    ],
    name='AugmentationLayers'
)

# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        keras.layers.InputLayer(input_shape=tuple([INP_SIZE] * 2) + (3,)),
        data_preprocessing,
        data_augmentations,
        keras.applications.EfficientNetB0(include_top=False, pooling='avg')
    ]
)

# Call model on a test input
x = tf.ones((1, INP_SIZE, INP_SIZE, 3))
y = model(x)
y.shape
Hi @innat ,
I am getting a different error, as below; also refer to the attached gist. Could you please provide the exact code snippet to replicate the reported behaviour?
NotImplementedError: Cannot convert a symbolic Tensor (AugmentationLayers/random_zoom/zoom_matrix/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported
@SuryanarayanaY
Sorry for the inconvenience. The model build should be as follows. Please re-check.
with strategy.scope():
model = ...
@SuryanarayanaY I've run the gist on Colab (with TF 2.11 on TPU). The error is reproduced. Please check.
Hi @innat, I am able to replicate the issue with TF 2.11. Please refer to the attached gist.
Hi @innat ,
It seems tf.random_zoom is not supported on TPU, which is the reason for the error. Please refer to the source for the list of TensorFlow ops that are supported on TPU; tf.random_zoom is not in the list.
Thanks for the hint. I didn't check all the random_* ops.
(Shallow comment) I think there is no API like tf.random_zoom; rather, it's constructed. WDYT?
Hi @innat ,
Sorry for not being specific. Most of the image augmentation layers use ops from the tf.image module or other TF ops internally. If those ops are not supported on TPU, then the layer is also likely to fail on TPU. I didn't specify the complete path earlier, but in any case there is no random_zoom op even in the tf.image module; it is a constructed layer only, and it is not supported on TPU.
As per the source attached in the comment above (comment-1446771123), none of the tf.keras.layers.Random* layers are included in the TPU-supported ops.
@SuryanarayanaY Actually, these augmentation layers work on TPU without any issue until mixed precision (bfloat16) is enabled. So these ops are supported on TPU, just not with mixed precision. WDYT?
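One hedged workaround consistent with this observation (an assumption on my part, not a fix confirmed in the thread): override the dtype of just the augmentation layer so its ops run in float32, while the rest of the model keeps the global mixed_bfloat16 policy.

```python
import tensorflow as tf
from tensorflow import keras

# Global mixed precision, as in the repro.
keras.mixed_precision.set_global_policy("mixed_bfloat16")

# Assumption: an explicit per-layer dtype overrides the global policy
# for this layer only, so its compute happens in float32.
flip = keras.layers.RandomFlip("horizontal", dtype="float32")

x = tf.ones((1, 8, 8, 3))
y = flip(x, training=True)
```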
Cc'ing @reedwm - Reed do you have a suggestion how to move forward with this?
System information.
Describe the problem.
With the mixed precision technique on TPU, the following random augmentation layers from Keras raised an error.
Contributing.
Standalone code to reproduce the issue.
Source code / logs.