keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Transfer learning tutorial doesn't work with pytorch backend #20287

Closed off6atomic closed 1 month ago

off6atomic commented 1 month ago

If you run the code from https://keras.io/guides/transfer_learning/ with the backend set to "torch":

import os
os.environ["KERAS_BACKEND"] = "torch"

you will get an error in the following cell:

for images, labels in train_ds.take(1):
    plt.figure(figsize=(10, 10))
    first_image = images[0]
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        augmented_image = data_augmentation(np.expand_dims(first_image, 0))
        plt.imshow(np.array(augmented_image[0]).astype("int32")) # error occurs here
        plt.title(int(labels[0]))
        plt.axis("off")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-3402074d1a3a> in <cell line: 1>()
      5         ax = plt.subplot(3, 3, i + 1)
      6         augmented_image = data_augmentation(np.expand_dims(first_image, 0))
----> 7         plt.imshow(np.array(augmented_image[0]).astype("int32"))
      8         plt.title(int(labels[0]))
      9         plt.axis("off")

/usr/local/lib/python3.10/dist-packages/torch/_tensor.py in __array__(self, dtype)
   1081             return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
   1082         if dtype is None:
-> 1083             return self.numpy()
   1084         else:
   1085             return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

How do we fix this? This error happens on Colab and also on a local machine.

ghsanti commented 1 month ago

Is the augmented image (or the first image) a tensor on the GPU? @off6atomic

If it is, can you call .cpu() on it?

mehtamansi29 commented 1 month ago

Hi @off6atomic -

Can you please let me know which Keras version gives you the error? I ran the same code with the backend set to "torch" and it runs fine on Keras 3.5.0. Also, are you running this on GPU or CPU?

Gist attached for reference.

off6atomic commented 1 month ago

With GPU, Keras 3.4.1.

But there is a problem with your gist. You set the backend variable after importing keras. You need to set it before importing keras.

After you do that, you will face the same error even with Keras 3.5.0.
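
For anyone reproducing this, the ordering can be sketched roughly as below. It is only an illustration: `select_keras_backend` is a hypothetical helper, not part of Keras, and it assumes Keras reads `KERAS_BACKEND` when it is first imported, as described above.

```python
import os
import sys


def select_keras_backend(name: str) -> bool:
    """Set KERAS_BACKEND and report whether the setting can still take effect.

    Hypothetical helper (not a Keras API). Returns False when `keras` is
    already in sys.modules, meaning the variable was set too late.
    """
    already_imported = "keras" in sys.modules
    os.environ["KERAS_BACKEND"] = name
    return not already_imported


if select_keras_backend("torch"):
    print("KERAS_BACKEND=torch set; importing keras now will pick it up")
else:
    print("keras was already imported; restart the runtime and set the variable first")
```

In a notebook, this would go in the very first cell, before any `import keras`.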

mehtamansi29 commented 1 month ago

Hi @off6atomic -

As shown in your code snippet, the PyTorch backend raises TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

As the error says, the PyTorch tensor lives on cuda:0, while NumPy arrays reside in CPU memory. So, as mentioned here, you can use augmented_image[0].cpu() to copy the PyTorch tensor from the GPU to the CPU.

for images, labels in train_ds.take(1):
    plt.figure(figsize=(10, 10))
    first_image = images[0]
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        augmented_image = data_augmentation(np.expand_dims(first_image, 0))
        plt.imshow(np.array(augmented_image[0].cpu()).astype("int32"))
        plt.title(int(labels[0]))
        plt.axis("off")

Attached is a gist that runs the entire code, for reference.
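
If the goal is a cell that runs unchanged on any backend, one option is a small conversion helper. This is a minimal sketch that assumes only NumPy is installed: `to_numpy` is a hypothetical name, and it detects torch-like tensors by duck typing (the presence of `detach()` and `cpu()`) instead of importing torch.

```python
import numpy as np


def to_numpy(x):
    """Return a NumPy array for x, copying to host memory first when needed.

    Hypothetical helper: objects exposing detach() and cpu() (as torch
    tensors do) are detached and moved to the CPU before conversion.
    Anything else is handed to np.asarray directly.
    """
    if hasattr(x, "detach") and hasattr(x, "cpu"):
        x = x.detach().cpu()
    return np.asarray(x)


# Plain Python data and NumPy arrays pass straight through.
print(to_numpy([[1, 2], [3, 4]]).shape)
```

In the loop above, the plotting line would then read `plt.imshow(to_numpy(augmented_image[0]).astype("int32"))`. Keras 3 also provides `keras.ops.convert_to_numpy`, which may serve the same purpose without a custom helper.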

off6atomic commented 1 month ago

@mehtamansi29 It works! But as a user I expect that the same code would work with all the backends. Is this expectation invalid here?

mehtamansi29 commented 1 month ago

Hi @off6atomic -

But as a user I expect that the same code would work with all the backends. Is this expectation invalid here?

The code and description at https://keras.io/guides/transfer_learning/ are written for the Keras API, which covers all backends.

off6atomic commented 1 month ago

OK. I mean that if I change the backend to torch, I expect that I don't need to change anything else in the code for it to work.

In this case, we have to move the tensor to the CPU before it works. It's not a big hurdle though. Just wanted to know whether users are expected to know that they have to move the tensor to the CPU in the PyTorch case but don't have to in the TensorFlow case.

mehtamansi29 commented 1 month ago

Hi @off6atomic -

Just wanted to know whether users are expected to know that they have to move the tensor to the CPU in the PyTorch case but don't have to in the TensorFlow case.

Yes. We have to move the tensor to the CPU in the PyTorch case, and don't have to in the TensorFlow case.

off6atomic commented 1 month ago

Is this difference in behavior documented somewhere?

mehtamansi29 commented 1 month ago

Hi @off6atomic -

That is not documented anywhere, but from here you can see that if CUDA is available, the torch tensor is placed on the GPU. The error TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first says the same thing: a cuda:0 tensor can't be converted to NumPy directly, so it has to be copied to the CPU first.

off6atomic commented 1 month ago

@mehtamansi29 Then there's probably no issue with the tutorial I guess. We can close it. Thank you!
