tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

Error import Keras Mobilenet model: Unknown layer: BatchNormalizationV1 #1255

Closed keyyuki closed 5 years ago

keyyuki commented 5 years ago

Hi everyone

I have a problem running a Keras MobileNet model in TensorFlow.js.

The browser raises this exception:

errors.ts:48 Uncaught (in promise) Error: Unknown layer: BatchNormalizationV1. This may be due to one of the following reasons:
1. The layer is defined in Python, in which case it needs to be ported to TensorFlow.js or your JavaScript code.
2. The custom layer is defined in JavaScript, but is not registered properly with tf.serialization.registerClass().
    at new t (errors.ts:48)
    at deserializeKerasObject (generic_utils.ts:239)
    at deserialize (serialization.ts:31)
    at u (container.ts:1335)
    at t.fromConfig (container.ts:1362)
    at deserializeKerasObject (generic_utils.ts:274)
    at deserialize (serialization.ts:31)
    at models.ts:287
    at common.ts:14
    at Object.next (common.ts:14)

My model was trained in Python with the Keras application MobileNet:

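import tensorflow as tf
from tensorflow import keras  # assumption: tf.keras, which is what emits the BatchNormalizationV1 class name

mobileModel = keras.applications.mobilenet.MobileNet(weights=None, classes=4, input_shape=(150, 150, 3))
mobileModel.compile(optimizer=tf.keras.optimizers.Adam(0.01),
                    loss='binary_crossentropy',
                    metrics=['accuracy'])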

ref: https://keras.io/applications/#mobilenet

Here is my h5 file: https://drive.google.com/file/d/10YmV7jfR0ooPxzXUbD1s1Y0X1FjRLeyz/view?usp=sharing

TensorFlow.js version: 0.15.1
Browser: Chrome Version 72.0.3626.109 (Official Build) (64-bit)

Thank you very much

caisq commented 5 years ago

@keyyuki What versions of keras and tensorflow are you using?

keyyuki commented 5 years ago

@caisq I train on Google Colab with tensorflow 1.13.0 rc. I found that the error occurs on tf 1.13, but the stable version 1.12 is OK.

caisq commented 5 years ago

cc @reedwm I think this is related to the new Layer names "BatchNormalizationV1" and "BatchNormalizationV2" introduced about three months ago. @reedwm may have more thoughts on that.

But it seems that we need to accommodate this change on the TF.js side, possibly by registering the existing BatchNormalization layer under the new name "BatchNormalizationV1".

reedwm commented 5 years ago

Yeah, the class name is now either BatchNormalizationV1 or BatchNormalizationV2. In Python, the layer is still referred to as tf.keras.layers.BatchNormalization.

I do not know enough about TF.js to know what the issue is, but as @caisq said, it seems this needs to be changed on the TF.js side

karmel commented 5 years ago

@yhliang , can you take a look?

yhliang2018 commented 5 years ago

On Keras side, we can serialize both BatchNormalizationV1 and BatchNormalizationV2 as BatchNormalization for model exporting. In this case, when the model is loaded into TF.js, the class name will be correct. I'm creating a CL for it.

Note that, if the model with BatchNorm layer is loaded in TF.js, it will be taken as the BatchNorm layer of TF.js without distinguishing the layer version (BatchNormV1 or BatchNormV2). TF.js should also provide two implementations of BatchNorm to users if it's needed.

caisq commented 5 years ago

Thanks, @yhliang2018. This should have been fixed at HEAD of tensorflow by https://github.com/tensorflow/tensorflow/commit/8714fa2c4f853c66faf730d8dad0a2784dcd4908

I'm closing this issue.

caisq commented 5 years ago

Just to be clear, the final release of tensorflow 1.13 (1.13.0) that's coming out soon should fix this issue.

alvarouc commented 5 years ago

It is fixed but now we get a different error

ValueError: Improperly formatted model config.

:( so frustrating

SanthoshRajendiran commented 5 years ago

The same issue occurs with the tflite conversion in TF 2.0. Has anyone found a way to overcome it?

caisq commented 5 years ago

@SanthoshRajendiran For tflite-related problems, please file issues at https://github.com/tensorflow/tensorflow/issues

For TensorFlow.js, we are considering adding logic in our converter to take care of the wrong class names such as BatchNormalizationV1.

caisq commented 5 years ago

@alvarouc This "improperly formatted..." error is related to another change in the serialization format that happened in tensorflow recently. cc @karmel @fchollet

We are in contact with them to determine what the best fixing approach is.

nandeeka commented 5 years ago

I am getting the same error as above even though this was supposed to be fixed with TensorFlow 1.13. I am using TensorFlow version 1.13.1 to train the model and tensorflow 1.0.1 for conversion and to run the model. The error message is:

Uncaught (in promise) Error: Unknown layer: BatchNormalizationV1. This may be due to one of the following reasons:
1. The layer is defined in Python, in which case it needs to be ported to TensorFlow.js or your JavaScript code.
2. The custom layer is defined in JavaScript, but is not registered properly with tf.serialization.registerClass().
    at new e (tfjs@1.0.0:2)
    at Rp (tfjs@1.0.0:2)
    at cd (tfjs@1.0.0:2)
    at u (tfjs@1.0.0:2)
    at e.fromConfig (tfjs@1.0.0:2)
    at Rp (tfjs@1.0.0:2)
    at cd (tfjs@1.0.0:2)
    at e.fromConfig (tfjs@1.0.0:2)
    at Rp (tfjs@1.0.0:2)
    at cd (tfjs@1.0.0:2)

You can view the page at https://nandeeka.github.io/nosearcade-pages/. Please let me know if there is something I should be doing differently to incorporate the fix.

caisq commented 5 years ago

@nandeeka That's a known issue in tensorflow 1.13. Switching back to 1.12 should address the issue.

nandeeka commented 5 years ago

That worked. Thank you so much!

RadEdje commented 5 years ago

I'm currently using tensorflow 2.0 alpha. I used the latest tensorflowjs converter. I'm amazed that the usual 20+ shards have now been brought down to just 3. However... I'm back to the error:

tfjs@1.0.0:2 Uncaught (in promise) Error: Unknown layer: BatchNormalizationV1. This may be due to one of the following reasons:
1. The layer is defined in Python, in which case it needs to be ported to TensorFlow.js or your JavaScript code.
2. The custom layer is defined in JavaScript, but is not registered properly with tf.serialization.registerClass().

Any suggestions? thanks.

keyyuki commented 5 years ago

@RadEdje You can replace BatchNormalizationV1 with BatchNormalization in your model.json, but I am not sure about the accuracy of the model after that. You should go back to tensorflow 1.12.x and tensorflowjs 0.8.5 (Python). On the JavaScript side, use a version >0.13. I am also not sure the accuracy stays the same as in Python, so you must re-evaluate in JavaScript. In my experience, it is not the same; I don't know whether that is because of my code or the tensorflow converter. So be careful with it, and please confirm whether your model is OK or not.
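
The model.json edit suggested above can be scripted instead of done by hand. The following is a minimal sketch, not the converter's own logic: it assumes the converted model.json stores each layer under a "class_name" key, and the helper names rename_layer_class and patch_model_json are made up for illustration.

```python
import json


def rename_layer_class(node, old="BatchNormalizationV1", new="BatchNormalization"):
    """Recursively rename every {"class_name": old} entry; return the count."""
    count = 0
    if isinstance(node, dict):
        if node.get("class_name") == old:
            node["class_name"] = new
            count += 1
        # Layers can be nested (e.g. a model inside a model), so keep walking.
        for value in node.values():
            count += rename_layer_class(value, old, new)
    elif isinstance(node, list):
        for item in node:
            count += rename_layer_class(item, old, new)
    return count


def patch_model_json(path):
    """Rewrite a model.json file in place; return how many layers were renamed."""
    with open(path, "r", encoding="utf-8") as f:
        model = json.load(f)
    renamed = rename_layer_class(model)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f)
    return renamed
```

Calling patch_model_json("model.json") rewrites the file in place. This only changes the class name; a model saved by an affected TensorFlow build may still fail in the browser with "Improperly formatted model config", and accuracy should be re-checked in JavaScript.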

RadEdje commented 5 years ago

thanks for the quick reply.

i'm NOW stuck with that second type of error (the one that gets past the batchNormalizationV1 bug).

i'm now at the

Error: Improperly formatted model config for layer

Any suggestions on how to fix this?

How do I force the version of tensorflowjs?

referring to your suggestion earlier to use version 0.8.5

tensorflowjs 0.8.5 (python)

do I just do the following?

pip install tensorflowjs@0.8.5

thanks again.

RadEdje commented 5 years ago

I tried something... I took the model.h5 file saved with the tf.keras API in HDF5 format.

I then used vanilla Keras (ver 2.2.4) to load that model.h5 file, to see whether keras.models.load_model is compatible with the h5 file that tf.keras produces when saving a model.

It can't load the h5 model.

Here is the error:

lib\site-packages\keras\backend\tensorflow_backend.py", line 517, in placeholder
    x = tf.placeholder(dtype, shape=shape, name=name)
AttributeError: module 'tensorflow' has no attribute 'placeholder'

seems tf.keras save/load model is not compatible with vanilla keras save/load model.

I was hoping to get the h5 model of tf.keras, then save it using vanilla keras and convert that using tensorflowjs_converter. Obviously this does not work with tensorflow 2.0 alpha.

keyyuki commented 5 years ago

haha same with me. I surrender now.

tf@1.12 cannot load the h5 file. tf@1.13 and tf@2.0 get the BatchNormalizationV1 error. It looks like this needs an update in the JavaScript library.

My thesis has been delayed since 12/2018, and I'm waiting for an update from any of them: tensorflow-converter, tensorflow, tensorflowjs, tensorflow.js.

Tensorflow.js is still not good at converting keras.applications models, not only MobileNet but also others like ResNet and NASNet. So you may consider changing your project.

The next problem is data preprocessing. If the image is grayscale, it is OK. But if the image is RGB, there is a problem: in JavaScript, tf.browser.fromPixels loads the image with a different channel order (RGB) than cv2.imread in Python (BGR). So be careful: you must reorder the channels either before training on the Python side or before predicting on the JavaScript side.
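
The channel-order mismatch described above can be handled on the Python side by reversing the channel axis before training. This is a minimal sketch, assuming the image was read with OpenCV's cv2.imread (which returns BGR) and will be fed in the browser through tf.browser.fromPixels (which returns RGB). bgr_to_rgb is a hypothetical helper, written over plain nested lists so it runs without extra packages.

```python
def bgr_to_rgb(image):
    """Return a copy of an H x W x 3 image with the channel order reversed."""
    # Each pixel arrives as [B, G, R]; emit [R, G, B] instead.
    return [[[px[2], px[1], px[0]] for px in row] for row in image]


# With a NumPy array (what cv2.imread returns), the equivalent is image[..., ::-1].
bgr_image = [[[255, 0, 0], [0, 128, 64]]]  # one row, two BGR pixels
rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)
```

Doing the swap once in the Python training pipeline keeps the browser code unchanged; the alternative is to reverse the last axis of the tensor in JavaScript before calling predict.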

I think the tensorflow team should provide an example that covers both sides: training on RGB images in Python and loading the model in JavaScript.

caisq commented 5 years ago

@keyyuki "tf@1.12 cannot load h5 file" Can you elaborate on that?

Reverting to TF 1.12 should address these issues (which we are trying to fix for later releases of TF and TF.js)

RadEdje commented 5 years ago

Hi, can I train with tf 2.0 alpha in one virtual environment,

then install tf 1.12 in a second venv and use that version of tf for converting with the latest tfjs_converter script?

Or will I end up with the same error, since the h5 I'm converting is the one made by tf 2.0 alpha, which contains the serialization bug? So using an older stable version, tf 1.12, will have no effect?

https://github.com/tensorflow/tfjs/issues/1255#issuecomment-481742650

keyyuki commented 5 years ago

@caisq Sorry for my late response; it was midnight in my country.

Here is my reproduction of the h5 load error on Colab (the error is still there), from train -> save -> load.

TypeError: '<' not supported between instances of 'dict' and 'float'

https://colab.research.google.com/drive/1Up4gKEtXpL8u4pX_silAL1LO1Mg67tRk

RadEdje commented 5 years ago

Greetings. I built a new venv. I had to install Python 3.6, since tf 1.12 doesn't support 3.7 yet, and then tried doing a pip install of tensorflowjs 0.8.5 and 1.0.1.

It kept throwing this error:

for tensorflowjs 0.8.5

Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory...
AppData\\Local\\Temp\\pip-uninstall-e68cw0zc\\sync website dev\\my website projects\\python_projects\\tf12\\lib\\site-packages\\tensorboard\\_vendor\\tensorflow_serving\\sources\\storage_path\\__pycache__\\file_system_storage_path_source_pb2.cpython-36.pyc

for tensorflowjs 1.0.1

Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory...
AppData\\Local\\Temp\\pip-install-pyuhksfp\\tf-nightly-2.0-preview\\tf_nightly_2.0_preview-2.0.0.dev20190411.data/purelib/tensorflow/include/tensorflow/include/external/eigen_archive/Eigen/src/Core/products/GeneralMatrixMatrixTriangular_BLAS.h

So the converter only runs with tensorflow 2.0 alpha, except that it has the BatchNormalizationV1 or serialization bugs.

Downgrading won't let me install the converter.

Any suggestions please? Thanks.

caisq commented 5 years ago

@RadEdje The tensorflow (python) team has fixed the BatchNormalizationV1 issue, but the fix is not in 2.0.0-alpha0. Instead, you need to do pip install -U tf-nightly-2.0-preview in order to get it. Make sure you do it in a clean virtualenv or pipenv to avoid conflicts with already-installed versions.

RadEdje commented 5 years ago

Hi. Thanks, so a new venv. A couple of questions before I proceed:

1) Do I keep the -U flag? That won't affect my other virtual environments? It just means upgrade, correct? Can I get a specific version instead, so I know which version is working now and can just install that specifically with pip?

2) Is it best to use Python 3.6 or 3.7 with this? I've seen that older versions don't support 3.7 yet but 2.0alpha and most likely 3.7 require some things from 3.7.

3) Can I use the old h5 model from tensorflow 2.0-alpha and convert it with this nightly release, or will I have to retrain from scratch?

4) This fixes the BatchNormalizationV1 bug. Does it also fix the serialization bug that gives me this error in the browser:

Error: Improperly formatted model config for layer

just wanted to say thanks again for the hard work and helping us with tensorflow. more power to the dev team.

RadEdje commented 5 years ago

Built a new venv.

tried installing with python 3.7.... did not work... package could not be found..

tried installing with python 3.6... found the package but threw this error:

Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: 'C:\\Users\\---------------------\\AppData\\Local\\Temp\\pip-install-3z0lhn8u\\tf-nightly-2.0-preview\\tf_nightly_2.0_preview-2.0.0.dev20190411.data/purelib/tensorflow/include/tensorflow/include/external/eigen_archive/Eigen/src/Core/products/GeneralMatrixMatrixTriangular_BLAS.h

this still works though

pip install tensorflow==2.0.0a0

RadEdje commented 5 years ago

Hello, since I was having difficulties installing the nightly version of tf 2.0 using

pip install -U tf-nightly-2.0-preview

on Windows (because Windows limits path names to 260 characters),

I fired up Ubuntu.

installed the latest nightly currently at tf-nightly-2.0-preview 2.0.0.dev20190412

then installed the latest tensorflowjs

then used the conversion script

and still

i get the BatchNormalizationV1 bug...

so I edited the .json file and replaced all the BatchNormalizationV1 instances with just plain BatchNormalization

this still results in


Error: Improperly formatted model config for layer {"_callHook":null,"_addedWeightNames":[],"_stateful":false,"id":1,"activityRegularizer":null,"inputSpec":[{"ndim":4,"axes":{}}],"supportsMasking":false,"_trainableWeights":[],"_nonTrainableWeights":[],"_losses":[],"_updates":[],"_built":false,"inboundNodes":[],"outboundNodes":[],"name":"Conv1_pad","trainable_":true,"updatable":true,"initialWeights":null,"_refCount":null,"fastWeightInitDuringBuild":false,"dataFormat":"channelsLast","padding":[[0,1],[0,1]]}: "input_1"

I'm using the h5 file from tf2.0-alpha release (not the last nightly)

i have not tried retraining with the nightly

i will try that tomorrow though. still hoping it will finally work and I get to finally deploy the model.

Any suggestions though will be more than welcome.

thanks.

RadEdje commented 5 years ago

just wanted to update: re-trained on linux/ubuntu with the latest tf nightly 2.0.... it all works now. thanks dev team.

dalisoft commented 4 years ago

@RadEdje Great that you solved it. Would you consider describing to other users how you re-trained? It would be nice to see how we can re-train and/or solve the issue. Thanks.