Open Lw-Cui opened 2 weeks ago
Hi @Lw-Cui -
Thanks for reporting the issue. I am able to reproduce it. Setting os.environ["TF_USE_LEGACY_KERAS"] = "1" means you are using Keras 2 with TensorFlow 2.16+, while os.environ["TF_USE_LEGACY_KERAS"] = "0" means you are using Keras 3 with the TensorFlow backend. Your code has some custom layers written with the TensorFlow API; it is usually easy to convert such code to be backend-agnostic.
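As a minimal sketch of the switch described above: the usual advice is to set the variable before TensorFlow is imported, so this example guards against a late change. The helper name select_keras is hypothetical, the guard is a deliberately conservative choice, and Keras 2 mode additionally assumes the tf_keras package is installed.

```python
import os
import sys


def select_keras(use_legacy: bool) -> str:
    """Set TF_USE_LEGACY_KERAS and return the value that was written.

    Hypothetical helper: refuses to run if TensorFlow is already imported,
    because changing the variable afterwards may have no effect.
    """
    if "tensorflow" in sys.modules:
        raise RuntimeError(
            "Set TF_USE_LEGACY_KERAS before importing tensorflow"
        )
    value = "1" if use_legacy else "0"
    os.environ["TF_USE_LEGACY_KERAS"] = value
    return value


# "1" selects Keras 2 (tf_keras), "0" selects Keras 3.
select_keras(use_legacy=True)
# import tensorflow as tf  # import only after the variable is set
```

Calling select_keras(use_legacy=False) instead selects Keras 3 with the TensorFlow backend.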
Some legacy features changed in the migration from Keras 2 (tf.keras) to Keras 3 (multi-backend Keras on top of the TensorFlow, JAX, and PyTorch backends).
Here you can find more details about legacy features and Transitioning to backend-agnostic Keras 3.
Because of these legacy-feature changes and the transition to backend-agnostic code, accuracy is affected under Keras 3 (os.environ["TF_USE_LEGACY_KERAS"] = "0"). With Keras 3, you can train the model for more epochs to increase the accuracy.
Let me know if you require more details. Thanks!
Thank you for the explanation. I now understand that converting the code to backend-agnostic Keras 3 is the best approach. However, for learning purposes, could you explain why, despite the code remaining the same, there is such a significant drop in accuracy when transitioning from Keras 2 to Keras 3? What are the specific reasons behind this?
Hi @Lw-Cui -
Here you can find more details about legacy features and Transitioning to backend-agnostic Keras 3.
Because of these removed features and the changes to architecture, layers, and metrics in Keras 3 (a high-level API with three backends: TensorFlow, JAX, and PyTorch), accuracy can drop. You can re-train the model with more epochs or different hyperparameters.
Summary
There is a significant degradation in model performance when changing the TF_USE_LEGACY_KERAS environment variable between Keras 2 and Keras 3 in an Encoder-Decoder Network for Neural Machine Translation. With os.environ["TF_USE_LEGACY_KERAS"] = "1" (Keras 2), the validation set accuracy is much higher (60% vs. 10%) than with os.environ["TF_USE_LEGACY_KERAS"] = "0" (Keras 3), despite no changes in the model architecture or training procedure.
System Information:
Steps to Reproduce:
1. Set os.environ["TF_USE_LEGACY_KERAS"] = "1" to use Keras 2 and run the Encoder-Decoder model.
2. Set os.environ["TF_USE_LEGACY_KERAS"] = "0" to use Keras 3 and run the same model.
Expected Results:
The validation accuracy should remain consistent between both runs, or at least be comparable.
Looking for guidance on the cause of this discrepancy and possible ways to resolve this performance issue.
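The reproduction steps above could be scripted roughly as follows, so that each run starts in a fresh interpreter with only the environment variable changed. This is a sketch under assumptions: env_for and run_both are hypothetical helpers, and the training script path passed to run_both stands in for whatever file holds the Encoder-Decoder model.

```python
import os
import subprocess
import sys


def env_for(use_legacy: bool) -> dict:
    """Return a copy of the current environment with the Keras flag set."""
    env = dict(os.environ)
    env["TF_USE_LEGACY_KERAS"] = "1" if use_legacy else "0"
    return env


def run_both(script_path: str) -> dict:
    """Run the same script once per Keras version and collect its output.

    Each run is a separate process, so the flag is in place before
    TensorFlow is imported inside the script.
    """
    outputs = {}
    for label, use_legacy in (("keras2", True), ("keras3", False)):
        proc = subprocess.run(
            [sys.executable, script_path],
            env=env_for(use_legacy),
            capture_output=True,
            text=True,
        )
        outputs[label] = proc.stdout
    return outputs


# Example (hypothetical script name):
# outputs = run_both("train_nmt.py")
# print(outputs["keras2"], outputs["keras3"])
```

Running both configurations from one driver keeps everything except the flag identical, which makes the two validation accuracies easier to compare.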