mrdbourke / tensorflow-deep-learning

All course materials for the Zero to Mastery Deep Learning with TensorFlow course.
https://dbourke.link/ZTMTFcourse
MIT License

Food 101 training problem library missing #504

Closed msolalh closed 1 year ago

msolalh commented 1 year ago

I am having trouble training the model for the milestone project. Colab complains about a missing DNN library (see below). From what I found online, it looks like something changed in newer versions of TensorFlow. It seems the libraries change faster than I can follow this course... Any idea how to fix this problem? Thanks, Marc

Node: 'model_1/efficientnetb0/stem_conv/Conv2D' DNN library is not found. [[{{node model_1/efficientnetb0/stem_conv/Conv2D}}]] [Op:__inference_train_function_24154]
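The thread does not identify the root cause of this message. One common cause, offered here as an assumption rather than a confirmed diagnosis, is a mismatch between the cuDNN version TensorFlow was built against and the one available in the runtime. A minimal diagnostic sketch follows; `summarize_build` is a hypothetical helper, and the keys reported by `tf.sysconfig.get_build_info()` can vary by build, so missing ones are shown as "unknown".

```python
def summarize_build(build_info: dict, gpu_count: int) -> str:
    """Format the CUDA/cuDNN versions a TensorFlow build reports (hypothetical helper)."""
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return f"built for CUDA {cuda}, cuDNN {cudnn}; visible GPUs: {gpu_count}"


try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    # Build-time library versions, as reported by TensorFlow itself.
    info = dict(tf.sysconfig.get_build_info())
    # GPUs that TensorFlow can actually see at runtime.
    gpus = tf.config.list_physical_devices("GPU")
    print(f"TensorFlow {tf.__version__}: {summarize_build(info, len(gpus))}")
```

If the reported versions differ from what the runtime provides, installing a TensorFlow release that matches the runtime is one thing to try.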

msolalh commented 1 year ago

I think the issue is that TensorFlow is changing. Is there a way to choose which version to use? Which one is compatible with the code in the course? Thanks.

SaketMunda commented 1 year ago

Are you following the same code as in this repo for the Food101 milestone project? If so, you can downgrade to tensorflow==2.7. I tried it and got no error.

If your code is different then kindly post your steps to reproduce your issue.

msolalh commented 1 year ago

The code for food101 milestone project does not work. There is an issue loading EfficientNetB0. I had to change to:

base_model = tf.keras.applications.efficientnet.EfficientNetB0(include_top=False)

But I have a different issue now. Is the code working for you? Thanks, Marc

TypeError Traceback (most recent call last) in
      4 input_shape = (224, 224, 3)
      5
----> 6 base_model = tf.keras.applications.efficientnet.EfficientNetB0(include_top=False)
      7 base_model.trainable = False # freeze base model layers
      8

4 frames
/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e: # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

TypeError: Exception encountered when calling layer "tf.math.truediv_5" (type TFOpLambda).

x and y must have the same dtype, got tf.float16 != tf.float32.

Call arguments received by layer "tf.math.truediv_5" (type TFOpLambda):
• x=tf.Tensor(shape=(None, None, None, 3), dtype=float16)
• y=tf.Tensor(shape=(3,), dtype=float32)
• name=None

SaketMunda commented 1 year ago

@msolalh You need to downgrade TensorFlow to version 2.8 and then restart your runtime. Before compiling your model, you can check which version you are running with tf.__version__. It should output 2.8.0 if you have successfully downgraded and restarted the runtime.

msolalh commented 1 year ago

How do I downgrade the version? Thanks, Marc



SaketMunda commented 1 year ago

Marc, you can try running the command below in a new Colab cell:

!pip install tensorflow==2.8

It should start downloading TensorFlow version 2.8.

When it's done, restart your runtime from the menu bar.

Once you have successfully restarted the runtime, run:

import tensorflow as tf
print(tf.__version__)

It should print 2.8.0.
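The version check can also be written as a small guard at the top of a notebook, so a wrong version is caught before any training starts. This is a minimal sketch, not part of the course code: `version_matches` is a hypothetical helper, and the 2.8 prefix is simply the version suggested in this thread.

```python
def version_matches(installed: str, required_prefix: str = "2.8") -> bool:
    """Return True if `installed` equals `required_prefix` or is a patch release of it."""
    return installed == required_prefix or installed.startswith(required_prefix + ".")


try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    if version_matches(tf.__version__):
        print(f"OK: running TensorFlow {tf.__version__}")
    else:
        # A stale version after `pip install` usually means the runtime
        # was not restarted, so the old module is still loaded.
        print(f"Found TensorFlow {tf.__version__}, expected 2.8.x. "
              "Reinstall and restart the runtime.")
```

Matching on the prefix plus a dot avoids treating a version such as 2.80 as 2.8.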

msolalh commented 1 year ago

It is working now. It looks like TensorFlow is changing faster than I can work through this course... Any idea why it does not work with the current version of TF? Thanks

SaketMunda commented 1 year ago

I think it works with version 2.11, but saving the model after training produces some issues. The combination of mixed precision and EfficientNetBX models could also be a reason. I'm new to this as well, so I haven't dug deep enough to investigate further, but if I find something I'll post my findings here.
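To see whether mixed precision is active before building the model, the global dtype policy can be inspected. This is a minimal sketch, assuming `tf.keras.mixed_precision.global_policy()` is available in the installed version. `describe_policy` is a hypothetical helper, and the output only shows the current setting; it does not confirm the cause suggested above.

```python
def describe_policy(name: str, compute_dtype: str, variable_dtype: str) -> str:
    """Summarise a Keras dtype policy (hypothetical helper)."""
    # Under mixed precision, computations and variables use different dtypes,
    # for example float16 computations with float32 variables.
    kind = "mixed precision" if compute_dtype != variable_dtype else "single dtype"
    return f"policy={name} compute={compute_dtype} variables={variable_dtype} ({kind})"


try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    policy = tf.keras.mixed_precision.global_policy()
    print(describe_policy(policy.name, policy.compute_dtype, policy.variable_dtype))
```

A float16 compute dtype alongside float32 variables matches the pair of dtypes named in the TypeError earlier in the thread.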

SaketMunda commented 1 year ago

@msolalh Also, please tag this issue as answered if you're satisfied with the answers and discussion above. Thanks

msolalh commented 1 year ago

Thanks to SaketMunda. This is a tf version issue.