keras-team / tf-keras

The TensorFlow-specific implementation of the Keras API, which was the default Keras from 2019 to 2023.
Apache License 2.0

Functional model computes wrong output signature in mixed_fp16 #303

Closed shkarupa-alex closed 1 year ago

shkarupa-alex commented 1 year ago

Describe the problem.

Following https://www.tensorflow.org/guide/mixed_precision, I set the dtype of the last (activation) layer to float32 under a mixed precision policy, but the model computes the wrong output signature.

Describe the current behavior.

Even though the last layer runs in float32, the model's output signature is still computed as float16.

Describe the expected behavior.

The model should compute its output signature the same way it computes output_shape (layer by layer).
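To make the expected behavior concrete, here is a minimal pure-Python sketch of the difference between taking the dtype from a single model-level value and propagating it layer by layer. It does not use Keras: `Spec`, `ToyDense`, `model_level_signature`, and `layerwise_signature` are hypothetical names invented for illustration, and the first function only models the symptom reported here, not the actual Keras implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for a tensor spec: a shape plus a dtype name."""
    shape: Tuple[Optional[int], ...]
    dtype: str


@dataclass(frozen=True)
class ToyDense:
    """Hypothetical layer: maps the last axis to `units` and emits `compute_dtype`."""
    units: int
    compute_dtype: str

    def compute_output_spec(self, spec: Spec) -> Spec:
        return Spec(spec.shape[:-1] + (self.units,), self.compute_dtype)


def model_level_signature(layers: Sequence[ToyDense], spec: Spec, model_dtype: str) -> Spec:
    """Models the reported symptom: the shape is propagated, the dtype is not."""
    for layer in layers:
        spec = layer.compute_output_spec(spec)
    # The per-layer dtype is discarded in favor of one model-wide value.
    return Spec(spec.shape, model_dtype)


def layerwise_signature(layers: Sequence[ToyDense], spec: Spec) -> Spec:
    """Expected behavior: shape and dtype both come from the last layer."""
    for layer in layers:
        spec = layer.compute_output_spec(spec)
    return spec


# A float16 hidden layer followed by a float32 output layer.
layers = [ToyDense(64, "float16"), ToyDense(10, "float32")]
inputs = Spec((None, 32), "float32")

print(model_level_signature(layers, inputs, model_dtype="float16"))
print(layerwise_signature(layers, inputs))
```

Both functions agree on the shape; only the second one reports the float32 dtype that the last layer actually produces.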

Standalone code to reproduce the issue.

https://colab.research.google.com/drive/1tTpjnOPsamotExpM814o69l58CFjMkcS?usp=sharing

sushreebarsa commented 1 year ago

@SuryanarayanaY I was able to replicate the issue on colab, please find the gist here. Thank you!

SuryanarayanaY commented 1 year ago

I tried set_global_policy('X') without an explicit dtype on the last layer, and both the model predictions and the output signature have dtype 'X'. But when I set an explicit dtype 'Y' on the last layer while keeping set_global_policy('X'), the model predictions have dtype 'Y', which is expected, while model.compute_output_signature still reports dtype 'X'. Please refer to the attached gist. It seems there is a bug in the model.compute_output_signature code.

@shkarupa-alex ,

Thanks for your observation. If you are willing to contribute, please feel free to raise a PR.

Thanks!

ianstenbit commented 1 year ago

This issue also exists for Sequential models -- it seems that it's not limited to Functional models.

It seems likely that keras.Model should override compute_output_signature to correctly return the dtype of the last layer's output.

Repro with a Sequential model:

import keras
import tensorflow as tf
from keras import layers

model = keras.Sequential([
    layers.Rescaling(scale=1.0 / 255),
    layers.Dense(10, activation='softmax', dtype='float32'),
])
# dtype reported by the computed signature for a uint8 input
print(model.compute_output_signature(tf.TensorSpec(dtype='uint8', shape=[2, 16, 16, 3])))
# dtype actually produced by calling the model
print(model(tf.zeros(shape=(2, 16, 16, 3))).dtype)

yamanoko commented 1 year ago

@SuryanarayanaY I think I found the cause of this issue, so I have opened a pull request for it. Please take a look when you have a chance. Thanks!

SuryanarayanaY commented 1 year ago

@yamanoko, thanks for the PR. Our team will take a look and let you know if anything more is needed.

yamanoko commented 1 year ago

@SuryanarayanaY Thank you!

shkarupa-alex commented 1 year ago

The PR https://github.com/keras-team/keras/pull/17703 fixes the issue for a single output, but not for multiple outputs. Here is updated code to reproduce it: https://colab.research.google.com/drive/1tTpjnOPsamotExpM814o69l58CFjMkcS?usp=sharing
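As a rough illustration of why multiple outputs need separate handling, the pure-Python sketch below maps over a nested structure of outputs and keeps each output's own dtype. It is not the code from either PR and does not use Keras: `ToyOutput`, `map_structure`, `single_dtype_signature`, and `per_output_signature` are hypothetical names, and `map_structure` is only a tiny stand-in for a nest-mapping utility.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


@dataclass(frozen=True)
class ToyOutput:
    """Hypothetical stand-in for one symbolic model output."""
    shape: Tuple[Optional[int], ...]
    dtype: str


def map_structure(fn: Callable[[Any], Any], structure: Any) -> Any:
    """Apply fn to every leaf of a nested dict, list, or tuple."""
    if isinstance(structure, dict):
        return {key: map_structure(fn, value) for key, value in structure.items()}
    if isinstance(structure, (list, tuple)):
        return type(structure)(map_structure(fn, value) for value in structure)
    return fn(structure)


def single_dtype_signature(outputs: Any, dtype: str) -> Any:
    """Assigns one dtype to every output, which is wrong when the heads differ."""
    return map_structure(lambda out: (out.shape, dtype), outputs)


def per_output_signature(outputs: Any) -> Any:
    """Keeps the shape and dtype of each output individually."""
    return map_structure(lambda out: (out.shape, out.dtype), outputs)


# Two heads with different dtypes, as can happen under a mixed precision policy.
outputs = {
    "features": ToyOutput((None, 64), "float16"),
    "probabilities": ToyOutput((None, 10), "float32"),
}

print(single_dtype_signature(outputs, "float32"))
print(per_output_signature(outputs))
```

With a single dtype applied to the whole structure, at least one of the two heads is misreported no matter which dtype is chosen; mapping per output avoids that.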

shkarupa-alex commented 1 year ago

Here is a fix for multiple outputs: https://github.com/keras-team/keras/pull/18123