Model Accuracy Degradation by 6x when Switching TF_USE_LEGACY_KERAS from "1" (Keras 2) to "0" (Keras 3)

Lw-Cui commented 2 weeks ago

Summary

There is a significant degradation in model performance when switching the TF_USE_LEGACY_KERAS environment variable from Keras 2 to Keras 3 in an Encoder-Decoder network for neural machine translation. With os.environ["TF_USE_LEGACY_KERAS"] = "1" (Keras 2), the validation accuracy is much higher than with os.environ["TF_USE_LEGACY_KERAS"] = "0" (Keras 3): about 60% vs. 10%, despite no changes in the model architecture or training procedure.

System Information:

Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
TensorFlow version: 2.17.0
Keras version: 3.4.1
Environment: Google Colab

Steps to Reproduce:

Set os.environ["TF_USE_LEGACY_KERAS"] = "1" to use Keras 2 and run the Encoder-Decoder model.
Set os.environ["TF_USE_LEGACY_KERAS"] = "0" to use Keras 3 and run the same model.
Compare the validation accuracy between the two setups.
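
A minimal sketch of how the switch in the steps above can be made, assuming the variable is set before TensorFlow is first imported in the process (in Colab this usually means restarting the runtime between the two runs). The helper name select_keras is hypothetical and not part of TensorFlow or Keras; the legacy path also assumes the tf_keras package is installed.

```python
import os
import sys


def select_keras(legacy: bool) -> str:
    """Set TF_USE_LEGACY_KERAS for this process (hypothetical helper).

    Assumption: the variable should be in place before TensorFlow is
    imported, so we refuse to change it once the module is loaded.
    """
    if "tensorflow" in sys.modules:
        raise RuntimeError(
            "TensorFlow is already imported; restart the runtime first."
        )
    value = "1" if legacy else "0"  # "1" -> Keras 2, "0" -> Keras 3
    os.environ["TF_USE_LEGACY_KERAS"] = value
    return value


# Run once per setting, each in a fresh process.
select_keras(legacy=True)
# import tensorflow as tf  # import only after the variable is set
```

Running each setting in its own process keeps one run from inheriting the Keras version chosen by the other.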

Expected Results:

The validation accuracy should remain consistent between both runs, or at least be comparable.

Looking for guidance on the cause of this discrepancy and possible ways to resolve this performance issue.

mehtamansi29 commented 1 week ago

Hi @Lw-Cui -

Thanks for reporting the issue. I am able to reproduce it. Setting os.environ["TF_USE_LEGACY_KERAS"] = "1" means you are using Keras 2 with TensorFlow 2.16+, while os.environ["TF_USE_LEGACY_KERAS"] = "0" means you are using Keras 3 with the TensorFlow backend. Your model has some custom layers written against the TensorFlow API; it is usually easy to convert such code to be backend-agnostic.

There are some changes to legacy features when migrating from Keras 2 (tf.keras) to Keras 3 (multi-backend Keras on top of the TensorFlow, JAX, and PyTorch backends).

Here you can find more details about legacy features and Transitioning to backend-agnostic Keras 3.

Because of these legacy-feature changes and the transition to backend-agnostic code, accuracy can be affected under Keras 3 (os.environ["TF_USE_LEGACY_KERAS"] = "0"). With Keras 3, training the model for more epochs should increase the accuracy.

Let me know if you require more details. Thanks!

Lw-Cui commented 1 week ago

Thank you for the explanation. I now understand that converting the code to backend-agnostic Keras 3 is the best approach. However, for learning purposes, could you explain why, despite the code remaining the same, there is such a significant drop in accuracy when transitioning from Keras 2 to Keras 3? What are the specific reasons behind this?

mehtamansi29 commented 1 week ago

Hi @Lw-Cui -

Here you can find more details about legacy features and Transitioning to backend-agnostic Keras 3.

Because of the removal of features and the changes in architecture, layers, and metrics in Keras 3 (a high-level API with three backends: TensorFlow, JAX, and PyTorch), accuracy can drop. You can retrain the model with more epochs or change the hyperparameters.
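
One possible way to narrow down which layer settings differ between the two versions is to dump every layer's configuration under each setting and compare the dumps. The sketch below assumes only that the model exposes a layers list and that each layer has a name and a get_config() method, which Keras models do; the helper names layer_configs and diff_configs are hypothetical. It does not identify the cause of the accuracy gap on its own.

```python
import json


def layer_configs(model):
    """Map each layer name to its config, serialized as sorted JSON text."""
    return {
        layer.name: json.dumps(layer.get_config(), sort_keys=True, default=str)
        for layer in model.layers
    }


def diff_configs(a, b):
    """Return names of layers whose configs differ or exist on one side only."""
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Because the two Keras versions cannot be loaded in the same process, each run would save the result of layer_configs(model) to a file (for example with json.dump), and a third script would load both files and call diff_configs on them.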

keras-team / keras