keras-team / keras-nlp

Modular Natural Language Processing workflows with Keras
Apache License 2.0

keras-nlp insists I use the (buggy) Tensorflow 2.16.1 which does not work with my GPU #1519

Closed nas-mouti closed 4 months ago

nas-mouti commented 5 months ago

Describe the bug

The latest TensorFlow release, 2.16.1, has a bug: it doesn't seem to detect GPUs (see for example: http://127.0.0.1:8888/lab?token=59ba515252bb7306a955696efe83ad0b816e730b847fac69).

To get around that, I ran pip install tensorflow[and-cuda]==2.15.1

It worked and my GPU was detected.

The problem is when I pip install keras-nlp, it tries to uninstall tensorboard, tensorflow etc. to install their latest versions. I suspect keras does the same too.

I tried pip install keras-nlp --no-deps and got errors during imports, such as ModuleNotFoundError: No module named 'keras_core'. (The preinstalled keras version was 2.15.0.)

I tried pip uninstall keras keras-nlp then pip install keras==3.0.0 keras-nlp==0.6.3 --no-deps and I got import errors again, such as ModuleNotFoundError: No module named 'rich'
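
One way to see what a --no-deps install left out is to probe for the modules before importing the package. The following is a minimal sketch and not part of keras-nlp: `find_missing_modules` is a hypothetical helper name, and the candidate list is simply the two modules named in the errors above plus the packages themselves.

```python
import importlib.util
from typing import Iterable, List


def find_missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found by this interpreter."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names taken from the import errors reported above (illustrative list).
    candidates = ["keras_core", "rich", "keras", "keras_nlp", "tensorflow"]
    print("Missing modules:", find_missing_modules(candidates))
```

Anything reported as missing would then have to be installed by hand, which is the main cost of skipping dependency resolution with --no-deps.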

To Reproduce

Here's a Colab link. It does not reproduce the part where TensorFlow fails to recognize the GPU; that issue seems to be restricted to local PCs.

Expected behavior

I would like to be able to install working keras and keras-nlp packages alongside TensorFlow 2.15.1.

Additional context

PC: Windows 11 with WSL2 Ubuntu

nvidia-smi

Thu Mar 21 16:03:26 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.120                Driver Version: 537.58       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        On  | 00000000:01:00.0  On |                  Off |
|  0%   33C    P8              16W / 450W |   2895MiB / 24564MiB |      3%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
sachinprasadhs commented 5 months ago

I was able to detect GPU in the colab GPU vm.

I ran the commands below. Create a fresh environment and try them.

!pip install -U keras-nlp
!pip install -U tensorflow

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

If you are still unable to detect GPU, you can close this issue and create a new issue in the TensorFlow repo since it is related to TensorFlow.

nas-mouti commented 5 months ago

Yes, it works on Colab; I reproduced it as well (see my link). But it doesn't work on my PC.

sachinprasadhs commented 5 months ago

Your link points to a localhost runtime, so we can't access it.

nas-mouti commented 5 months ago

My apologies, I'm not very familiar with colab.

I'll update the post.

Here's the updated link: https://colab.research.google.com/drive/1BIQpgHH_Ri7HzXIykt34cRDEQ9-Yodv1?usp=sharing

sachinprasadhs commented 5 months ago

I can see Num GPUs Available: 1 in your Colab. What is the issue again?

nas-mouti commented 5 months ago

That's just a colab I made because the guidelines asked me to. I'm trying to run this on my own PC, not the colab.

sachinprasadhs commented 5 months ago

Got it. This looks like a TensorFlow issue specific to your OS. You can create a new issue in the TensorFlow repo and link this issue for context.

arsenstonelab commented 5 months ago

I think it would be nice if keras-nlp were usable with older TensorFlow versions: 2.16.1 has several bugs, and keras-nlp seems to be compatible with TF 2.15.

sachinprasadhs commented 5 months ago

We always try to match the latest TensorFlow version at the time of release; we follow the same practice for keras-cv. Moreover, TensorFlow 2.16.1 uses Keras 3 by default, unlike 2.15, which uses Keras 2.
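
To check which combination is actually installed in an environment, something like the following could be used. This is only a sketch under stated assumptions: `installed_version` and `default_keras_major` are hypothetical helper names, and the 2.16 cutoff simply encodes the statement above for TensorFlow 2.x releases.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def default_keras_major(tf_version: str) -> int:
    """Keras major version expected by default for a TensorFlow 2.x version string.

    Assumption: 2.16 and later default to Keras 3, earlier 2.x releases to Keras 2.
    """
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return 3 if (major, minor) >= (2, 16) else 2


if __name__ == "__main__":
    for dist in ("tensorflow", "keras", "keras-nlp", "tensorflow-text"):
        print(dist, installed_version(dist))
    tf_version = installed_version("tensorflow")
    if tf_version is not None:
        print("Expected default Keras major:", default_keras_major(tf_version))
```

A mismatch between the reported keras version and the expected major version is a hint that a manual or --no-deps install has left the environment inconsistent.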

mattdangerw commented 5 months ago

@arsenstonelab thanks for the issue. This is indeed a bit of a rough edge. The issue is most likely with tensorflow-text. keras-nlp is unopinionated about TensorFlow versions in our package setup, but installing keras-nlp will try to install tensorflow-text (the latest version if none is installed), which in turn will try to install the latest TensorFlow version. That can lead to a big upgrade of TensorFlow.
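
The dependency chain described above can be inspected from an existing environment by reading each package's declared requirements. This is a rough sketch: `declared_requirements` and `requirements_mentioning` are hypothetical helper names, and the output depends entirely on which versions happen to be installed.

```python
from importlib import metadata
from typing import List


def declared_requirements(dist_name: str) -> List[str]:
    """Return the requirement strings a distribution declares, or [] if not installed."""
    try:
        # metadata.requires returns None when the package declares no requirements.
        return list(metadata.requires(dist_name) or [])
    except metadata.PackageNotFoundError:
        return []


def requirements_mentioning(dist_name: str, needle: str) -> List[str]:
    """Filter a distribution's requirements to those whose text contains `needle`."""
    needle = needle.lower()
    return [req for req in declared_requirements(dist_name) if needle in req.lower()]


if __name__ == "__main__":
    # Which TensorFlow-related requirements do these packages declare?
    print("keras-nlp:", requirements_mentioning("keras-nlp", "tensorflow"))
    print("tensorflow-text:", requirements_mentioning("tensorflow-text", "tensorflow"))
```

If tensorflow-text lists a tight TensorFlow requirement while keras-nlp does not, that would be consistent with the upgrade being pulled in through tensorflow-text.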

One option is to pin the TF version you want during install for both tensorflow-text and tensorflow. For example, this works for installing keras-nlp with TF 2.15:

pip install keras-nlp tensorflow-text~=2.15.0 tensorflow~=2.15.0

Does that work for you?
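
For readers unfamiliar with the `~=` pins in the command above: `~=2.15.0` is the "compatible release" operator, roughly equivalent to `>=2.15.0, ==2.15.*`. The sketch below illustrates that rule for plain numeric versions only; it is a simplification (no pre-releases, epochs, or local versions), not pip's actual implementation, and `matches_compatible_release` is a hypothetical name.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '2.15.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def matches_compatible_release(candidate: str, spec: str) -> bool:
    """Simplified check of whether `candidate` satisfies `~=spec`."""
    cand = parse_release(candidate)
    base = parse_release(spec)
    if len(base) < 2:
        raise ValueError("~= needs at least two release segments, e.g. 2.15")
    # Every segment except the last one in the spec must match exactly.
    prefix = base[:-1]
    # Pad with zeros so that '2.15' and '2.15.0' compare as equal.
    width = max(len(cand), len(base))
    cand_padded = cand + (0,) * (width - len(cand))
    base_padded = base + (0,) * (width - len(base))
    return cand_padded >= base_padded and cand_padded[: len(prefix)] == prefix


if __name__ == "__main__":
    for version in ("2.14.0", "2.15.0", "2.15.1", "2.16.1"):
        print(version, matches_compatible_release(version, "2.15.0"))
```

Under this rule the pin accepts 2.15.x patch releases but rejects 2.16.1, which is why pinning both packages keeps pip from upgrading TensorFlow.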

github-actions[bot] commented 4 months ago

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 4 months ago

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.