leondgarse / keras_efficientnet_v2

self defined efficientnetV2 according to official version. Including converted ImageNet/21K/21k-ft1k weights.
Apache License 2.0
78 stars 19 forks

Performing way worse than EffnetV1 #8

Closed macsunmood closed 2 years ago

macsunmood commented 3 years ago

The issue I have is that, for example, with the same config EffnetV2_B0 performs way worse than the original EfficientNetB0 from tf.keras.applications (~0.70 acc vs. ~0.90). Any quick/obvious reason for that, or is more detailed info needed?

macsunmood commented 3 years ago

Also, I get the following warning at the start of V2 training; could this be associated with the issue?

```
Epoch 1/30

Epoch 00001: LearningRateScheduler reducing learning rate to 0.0010000000474974513.
WARNING:tensorflow:AutoGraph could not transform <function Model.make_train_function.<locals>.train_function at 0x7f772b2fff28> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: closure mismatch, requested ('self', 'step_function'), but source function had ()
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING: AutoGraph could not transform <function Model.make_train_function.<locals>.train_function at 0x7f772b2fff28> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: closure mismatch, requested ('self', 'step_function'), but source function had ()
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
```
macsunmood commented 3 years ago

UPD: I just tried the equivalent model from https://github.com/sebastian-sz/efficientnet-v2-keras and it performs just fine. I'm doing transfer learning and use the include_top=False parameter there; in leondgarse/keras_efficientnet_v2 I use num_classes=0 for the same purpose.

I'm still not sure what causes it, but it looks like there may be a bug in the current implementation.

leondgarse commented 3 years ago

My bad, it's because I accidentally changed a layer name, which made weight loading skip that layer... Sorry about that. You can try again after upgrading: pip install -U keras-efficientnet-v2. Speaking of EfficientNetB0 from tf.keras.applications, it has Rescaling and Normalization layers on top, so it takes values in [0, 255] as input, whereas the efficientnetV2 models expect values in [0, 1].
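The input-range mismatch described here can be sketched with plain arithmetic (the `rescale` helper below is hypothetical, mimicking the affine transform of `keras.layers.Rescaling`; note the [0, 1] claim is corrected to [-1, 1] later in the thread):

```python
def rescale(x, scale, offset=0.0):
    # Same affine transform as keras.layers.Rescaling: x * scale + offset
    return x * scale + offset

# tf.keras.applications EfficientNetB0 rescales/normalizes internally,
# so it takes raw pixel values in [0, 255] as input.
raw_pixel = 255.0

# keras_efficientnet_v2 (as described in this comment) has no such
# layers, so inputs must be scaled beforehand, e.g. to [0, 1]:
scaled = rescale(raw_pixel, scale=1.0 / 255.0)
print(raw_pixel, scaled)  # 255.0 1.0
```

Feeding raw [0, 255] pixels into a model that expects pre-scaled inputs would explain the large accuracy gap reported above.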

macsunmood commented 3 years ago

> Speaking of EfficientNetB0 from tf.keras.applications, it has Rescaling and Normalization layers on top, so it takes values in [0, 255] as input, whereas the efficientnetV2 models expect values in [0, 1].

The maintainer of the https://github.com/sebastian-sz/efficientnet-v2-keras implementation says, quote:

> The models expect image values in range -1:+1.

You say the v2 models expect values in range [0, 1]. Which one is true?

Also, I still have the `WARNING:tensorflow:AutoGraph could not transform <function Model.make_train_function` issue :(. Maybe it's not related to your implementation and has to do with something on my side..

leondgarse commented 3 years ago

Yes, it's [-1, 1], he's right! I made a mistake in my testing: I ran tf.expand_dims(imm / 127 - 127.5, 0) as the input, which gives bad results and made me think it should be [0, 1]... I'm using TF 2.6.0 in my projects and didn't hit that AutoGraph warning. Maybe it's version related? Both sebastian-sz's implementation and mine use only standard Keras layers and return a standard Keras model. Maybe the optimizer?
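The arithmetic slip is easy to verify at the endpoints of the pixel range (`wrong` and `right` are hypothetical names, just illustrating the two transforms mentioned above):

```python
def wrong(x):
    # The accidental transform from the test: x / 127 - 127.5
    return x / 127 - 127.5

def right(x):
    # The intended [-1, 1] rescaling: x / 127.5 - 1
    return x / 127.5 - 1.0

# Endpoints of the [0, 255] pixel range:
print(wrong(0), wrong(255))   # -127.5 and about -125.49: nowhere near [-1, 1]
print(right(0), right(255))   # -1.0 1.0
```

With inputs that far outside the expected range, poor results are unsurprising, which fits the mistaken conclusion described above.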

macsunmood commented 3 years ago

I see you've added an include_preprocessing parameter, which is great, but you're using keras.layers.Rescaling(1.0 / 255.0), whereas https://www.tensorflow.org/api_docs/python/tf/keras/layers/Rescaling says that to rescale to the [-1, 1] range one should pass: scale=1./127.5, offset=-1.

As far as I understand, your version is not quite correct and may hurt accuracy.

UPD: I've just checked, and it doesn't seem to hurt accuracy: I don't see any noticeable difference between the two, each trained for 20 epochs. Could that be expected? And if so, why?

UPD 2: Since the Rescaling (and, I guess, Normalization) Keras layers moved out of the experimental namespace only in the latest TensorFlow releases, I suggest adding support for earlier versions as well, for compatibility reasons.
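A version-tolerant import along the lines suggested here might look as follows (a sketch, not the repo's actual code; the final fallback class is a pure-Python stand-in so the snippet also runs where TensorFlow isn't installed):

```python
try:
    # Newer TF (2.6+): preprocessing layers graduated out of `experimental`
    from tensorflow.keras.layers import Rescaling
except ImportError:
    try:
        # Earlier TF 2.x releases still ship the experimental path
        from tensorflow.keras.layers.experimental.preprocessing import Rescaling
    except ImportError:
        # No TensorFlow available at all; minimal stand-in with the same math
        class Rescaling:
            def __init__(self, scale, offset=0.0):
                self.scale, self.offset = scale, offset

            def __call__(self, x):
                return x * self.scale + self.offset

# Rescale raw [0, 255] pixels to [-1, 1], per the Rescaling docs cited above
to_pm1 = Rescaling(scale=1.0 / 127.5, offset=-1)
```

Whichever import succeeds, the layer applies the same x * scale + offset transform, so model code can stay unchanged across TF versions.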

macsunmood commented 3 years ago

Regarding AutoGraph, I think it may indeed be TF version related, because the server I'm using currently supports at most v2.4.1..

Also, it seems there's a new bug after the latest release update (it wasn't there before): [screenshot of the error attached]

leondgarse commented 3 years ago

I think TF 2.4.1 should be alright. Here's my test result on Colab (efficientnetV2_basic_test.ipynb), since my local CUDA version doesn't match TF 2.4.1. Just a simple V2B0 + Adam on cifar10. [screenshot attached]

macsunmood commented 3 years ago
> • Using keras.layers.experimental.preprocessing.Rescaling and keras.layers.experimental.preprocessing.Normalization now.

Maybe even better, for everlasting compatibility: https://github.com/leondgarse/keras_efficientnet_v2/pull/9

leondgarse commented 3 years ago

Ya, I thought about that, but I was getting lazy. lol