Reisa14 opened this issue 2 years ago
Hi @Reisa14, thanks for your feedback. Sadly, I can't reproduce this issue: when I run the notebook locally or on Colab, I don't get this warning at all. Could you please run the following code and copy/paste the output here?
```python
from sklearn import show_versions
import tensorflow as tf

show_versions()
print(tf.__version__)
print(tf.config.list_physical_devices())
```
Hi @ageron, this is the output that I get (I should also add that I am using Kaggle with the GPU enabled):
```
System:
python: 3.7.10 | packaged by conda-forge | (default, Sep 13 2021, 19:43:44) [GCC 9.4.0]
executable: /opt/conda/bin/python
machine: Linux-5.10.68+-x86_64-with-debian-buster-sid

Python dependencies:
pip: 21.2.4
setuptools: 58.0.4
sklearn: 0.23.2
numpy: 1.19.5
scipy: 1.7.1
Cython: 0.29.24
pandas: 1.3.4
matplotlib: 3.4.3
joblib: 1.0.1
threadpoolctl: 2.2.0

Built with OpenMP: True

2.6.0
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2021-12-11 11:46:23.368204: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-11 11:46:23.498033: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-11 11:46:23.498810: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
```
Thanks @Reisa14. Mmmh, this might be a TF bug; I see nothing wrong with your code. Could you please file a bug with TensorFlow?
Hi, I'm having the same issue. Did you find a solution?
In my case, the problem was that the batch size was 1 and a batch normalization layer was used. Removing that layer solved the problem. Increasing the batch size also solved it, but at the cost of higher GPU memory consumption.
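For context, both workarounds can be sketched in a few lines. This is a generic illustration, not the code from the affected notebook: `make_generator` and the layer sizes are made up for the example. The point is that `BatchNormalization` computes per-batch statistics, which degenerate when a batch contains a single sample, so you either drop the layer or make sure no size-1 batch reaches it.

```python
import tensorflow as tf

# Hypothetical minimal model; with batch size 1, BatchNormalization
# normalizes over a single sample, which can destabilize training.
def make_generator(use_batch_norm=True):
    layers = [
        tf.keras.Input(shape=(100,)),
        tf.keras.layers.Dense(64, activation="relu"),
    ]
    if use_batch_norm:
        layers.append(tf.keras.layers.BatchNormalization())
    layers.append(tf.keras.layers.Dense(784, activation="tanh"))
    return tf.keras.Sequential(layers)

# Workaround 1: build the model without the batch normalization layer.
gen = make_generator(use_batch_norm=False)

# Workaround 2: keep the layer but use a larger batch size, with
# drop_remainder=True so no undersized final batch slips through.
dataset = tf.data.Dataset.from_tensor_slices(tf.random.normal((96, 100)))
dataset = dataset.batch(32, drop_remainder=True)
```

`drop_remainder=True` matters here: without it, a dataset whose size is not a multiple of the batch size produces a smaller final batch, which can reintroduce the problem.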
Running the code from the notebook for the DCGAN produces the reconstructed images after each epoch, but only after reporting that the optimization loop has repeatedly failed. This happens on every epoch and is slowing it down a lot. What am I missing? Thanks!
**To Reproduce**
**Exception**
**Expected behavior** The run produces the reconstructed images, but only after reporting that the optimization loop failed; this happens on every epoch, so the run takes a long time.
**Versions**
**Additional context** This is the code for the generator and discriminator:
And here's the code for the train_gan function: