Open jhoydis opened 1 day ago
Hi @jhoydis -
Thanks for reporting the issue. I am not able to reproduce any kernel crash when using complex-valued gradients with Keras 3.7 and TF 2.18. A gist is attached for your reference.
Hi @mehtamansi29,
Thanks for looking into this so quickly.
When I run this code on GPU (using Colab and a T4 GPU instance) the kernel crashes.
Hi @jhoydis -
I can also reproduce this issue on GPU (using Colab and a T4 GPU instance). According to the logs from the crashed runtime, TensorFlow is overriding a memory allocation setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set.
The TensorFlow build might also be missing some CPU optimization flags.
I0000 00:00:1733326291.211008 13282 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13949 MB memory: -> device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5
2024-12-04 15:31:31.210156: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
We will dig into the issue and update here.
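For anyone comparing runtimes, here is a minimal sketch for checking whether TF_FORCE_GPU_ALLOW_GROWTH is forced on in the current environment. The helper name allow_growth_forced is hypothetical, and the sketch assumes that TensorFlow only treats the exact string "true" as enabling the override. It only reads the environment and does not import TensorFlow.

```python
import os


def allow_growth_forced(env=os.environ) -> bool:
    """Return True if TF_FORCE_GPU_ALLOW_GROWTH is set to "true".

    Hypothetical helper for illustration. Assumption: only the exact
    string "true" turns the override on; anything else (unset, "false",
    or another value) is reported here as not forced.
    """
    return env.get("TF_FORCE_GPU_ALLOW_GROWTH") == "true"


if __name__ == "__main__":
    # Print the state for the current process environment.
    print("allow_growth forced:", allow_growth_forced())
```

Running this in both a crashing and a non-crashing runtime shows whether the two environments differ in this setting.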
TensorFlow is able to correctly compute gradients for complex-valued variables. However, the Keras 3 optimizers do not seem to be able to correctly apply complex-valued gradients. This worked with Keras 2.
Here is a code snippet that works in TF 2.15 but leads to a kernel crash with Keras 3.7 and TF 2.18. The crash is caused by the function optimizer.apply_gradients.
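To make concrete what "applying a complex-valued gradient" means, here is a minimal pure-Python sketch of a gradient-descent step on a complex parameter. It is not the reproduction snippet from this issue and does not use TensorFlow or Keras. The function names (complex_grad, sgd_step, fit) and the loss |w - target|^2 are assumptions chosen for illustration, as is the gradient convention dL/dRe(w) + 1j * dL/dIm(w).

```python
def complex_grad(w: complex, target: complex) -> complex:
    """Gradient of L(w) = |w - target|^2 for w = x + 1j*y.

    Assumed convention: dL/dx + 1j * dL/dy, which for this loss
    equals 2 * (w - target).
    """
    return 2 * (w - target)


def sgd_step(w: complex, grad: complex, lr: float) -> complex:
    """Plain SGD update: move against the gradient in the complex plane."""
    return w - lr * grad


def fit(w0: complex, target: complex, lr: float = 0.1, steps: int = 200) -> complex:
    """Repeat the SGD update; the error shrinks by (1 - 2*lr) per step."""
    w = w0
    for _ in range(steps):
        w = sgd_step(w, complex_grad(w, target), lr)
    return w


if __name__ == "__main__":
    target = 0.5 - 2j
    w = fit(1 + 1j, target)
    print("final w:", w, "error:", abs(w - target))
```

An optimizer that handles complex variables correctly should behave like sgd_step here, updating the real and imaginary parts together.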