tensorflow / compression

Data compression in TensorFlow
Apache License 2.0

TFC disables Google Colab's GPU #95

Closed. alicekore closed this issue 2 years ago

alicekore commented 3 years ago

Describe the bug: Every time I install the tensorflow-compression package in Colab, I lose the connection to the GPU, but it works perfectly without tfc.

To Reproduce

  1. Set the hardware accelerator to GPU in Settings
  2. Run
    !pip install tensorflow-compression
    import tensorflow as tf
    device_name = tf.test.gpu_device_name()
    if device_name != '/device:GPU:0':
        raise SystemError('GPU device not found')
    print('Found GPU at: {}'.format(device_name))

    Output:

    SystemError: GPU device not found

Expected behavior: /device:GPU:0 should be found.

Additional context: I know that GPUs on Colab are not always available to users. I waited for several hours and still have the same issue. I ran the code multiple times, and every time the GPU is available without tfc and no device is found with tfc. Also, I noticed that tfc and tensorflow conflict on pip: you can't install them both (no need though, tfc already has tf inside). Do you have any ideas how to fix it? Without a GPU, training is very slow. I tried !python ms2020.py train --epochs=100 --steps_per_epoch=10 and it took me almost 10 hours to train the model. Maybe I am doing something wrong or there are additional settings I'm missing? Thanks!
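Since the report mentions that tfc and tensorflow conflict on pip, one way to narrow this down is to record which versions are actually installed before and after the `pip install`, and which GPUs TensorFlow can see. The following is only a diagnostic sketch, not something from the thread: the helper names (`installed_version`, `report_versions`, `visible_gpus`) are made up for illustration, `importlib.metadata` assumes Python 3.8 or later, and `tf.config.list_physical_devices` assumes TensorFlow 2.1 or later.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a pip package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(packages=("tensorflow", "tensorflow-compression")):
    """Map each package name to its installed version (None if not installed)."""
    return {name: installed_version(name) for name in packages}


def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TF is not importable.

    Hypothetical helper: the import is done lazily so the version report
    above still works in an environment without TensorFlow.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]
```

Calling `report_versions()` in a fresh Colab runtime, then again after installing tensorflow-compression (and restarting the runtime), would show whether pip replaced the preinstalled TensorFlow with a different version. If it did, a mismatch between that version and Colab's GPU libraries is a plausible, though unconfirmed, explanation for the missing device.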

alicekore commented 3 years ago

According to this discussion, the issue should be fixed in v2.5. Switching to v2.5 didn't help, but the GPU works with v1.3:

%tensorflow_version 1.x
!pip install tensorflow-compression==1.3

Still, it's not going to help me, because entropy coding is not available in the old version.

yifeipet commented 3 years ago

I just filed another request asking them to update the code to the latest TensorFlow version. I hope they change it soon.

jonarchist commented 3 years ago

Hi, TensorFlow has not released a custom-op Docker image for TF 2.6 as they used to. This prevents us from building a more recent pip package for TFC. We are working with the TF team to try to resolve this. Will keep you posted. In the meantime, unfortunately it seems there is no good way to use TFC with GPUs in a Colab environment. Note you can still use GPUs if you install TF 2.5 with the latest version of TFC on your own machine.

jonarchist commented 2 years ago

Update: Sorry this has not been fixed yet. I will probably have some time around the end of the year to overhaul our build system. Will keep you posted.

jonarchist commented 2 years ago

Ok, we are making progress on the new build system. It should be coming within the next few days. We will also transition to a new versioning scheme: We'll adopt the same versioning as TensorFlow to make it easier to identify which TFC release works with which TF release.
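Under the versioning scheme described above, checking whether a TFC release goes with a given TF release could come down to comparing version numbers. The sketch below assumes that a match on major and minor version is the relevant criterion. The comment does not spell that out, and it would only apply to releases made under the new scheme. The function names `major_minor` and `releases_match` are hypothetical.

```python
import re


def major_minor(version):
    """Extract (major, minor) as integers from a version string such as '2.7.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    return int(match.group(1)), int(match.group(2))


def releases_match(tf_version, tfc_version):
    """Return True if both version strings share the same major.minor numbers.

    Assumption: major.minor agreement is what 'same versioning' implies;
    this is an illustration, not a rule stated by the maintainers.
    """
    return major_minor(tf_version) == major_minor(tfc_version)
```

For example, `releases_match("2.7.0", "2.7.1")` is true under this assumption, while `releases_match("2.5.0", "2.7.0")` is false.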

yifeipet commented 2 years ago

Hello Dr. Balle, I installed tensorflow-compression on Colab. Although the unit test works fine, Colab still does not support tensorflow-compression. I tested GDN and got this error message: (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

I will try installing TensorFlow 2.5 and tensorflow-compression on a lab workstation, which currently has TensorFlow 2.4 installed. I still hope Colab can support tensorflow_compression, since installing TensorFlow is not easy. Thank you!

Best Regards, Yifei

jonarchist commented 2 years ago

Hi Yifei, once we finally have the new build system in place, we can make pip packages that correspond to the current TF version (or whichever is used in Colab). Then GPU colabs will be supported again as well.

jonarchist commented 2 years ago

We just released TFC 2.7.0. This should solve the problem. Thanks for waiting!