tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

Blank Output After Tensorflow Saved Model to Graph Model Conversion (works on TFLite model) #6827

Closed · dasmehdix closed this issue 1 year ago

dasmehdix commented 2 years ago

Hi all. I trained a custom model in TensorFlow (Python) and saved it in the saved_model format (protobuf). I then converted the protobuf model to TFLite and to the graph model format (.json and .bin files). I have already run inference with both the graph model and the TFLite model; the outputs are shown below. The TFLite model output is correct, but the graph model output is blank (just a black image). The shapes are fine for both models. By the way, it is a segmentation model. How can I get the correct output from the frozen graph model? [attached image: Adsız]
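
For context, a minimal sketch of how a converted graph model is typically loaded and run in the browser. The model path, input size, and normalization below are placeholders and assumptions, not taken from the reporter's actual model:

```ts
import * as tf from '@tensorflow/tfjs';

// Load the converted graph model and run it on an image element.
async function runSegmentation(imageEl: HTMLImageElement): Promise<tf.Tensor> {
  const model = await tf.loadGraphModel('model/model.json'); // assumed path
  const input = tf.tidy(() =>
    tf.browser.fromPixels(imageEl)
      .resizeBilinear([256, 256]) // assumed input resolution
      .toFloat()
      .div(255)                   // assumed normalization
      .expandDims(0));            // batch of 1, NHWC
  const mask = model.predict(input) as tf.Tensor; // e.g. a [1, 256, 256, 1] mask
  input.dispose();
  return mask;
}
```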

rthadur commented 2 years ago

@dasmehdix can you please provide the commands you used for conversion and the version of tfjs-converter? Note: the frozen model format is deprecated in tfjs-converter.

dasmehdix commented 2 years ago

tensorflowjs_converter \
    --input_format=tf_saved_model \
    --output_format=tfjs_graph_model \
    --signature_name=serving_default \
    --saved_model_tags=serve \
    model_v1.pb \
    .

Also, I converted the TensorFlow Saved Model to a Keras Saved Model and to Keras HDF5 format. I then converted both the Keras Saved Model and the Keras HDF5 model to a graph model (and a layers model). The output is the same for every attempt. @rthadur

Update: I also converted the TensorFlow Saved Model (.pb) to a TensorFlow frozen graph (.pb) and then converted that to a graph model (.json). The output is the same for this trial as well. @pyu10055

pyu10055 commented 2 years ago

@dasmehdix have you tried the tfjs layers model? You can convert Keras to a tfjs layers model directly. There could be different reasons for the failure, either the conversion or the inference; if you can share the model or provide a reproducible example, it would help us investigate. Thanks.
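
A minimal sketch of that suggestion, assuming the Keras model has already been converted with `--input_format=keras` and that the output lives at a placeholder path:

```ts
import * as tf from '@tensorflow/tfjs';

async function loadConvertedLayersModel(): Promise<tf.LayersModel> {
  // 'layers_model/model.json' is an assumed path to the converter output.
  const model = await tf.loadLayersModel('layers_model/model.json');
  model.summary(); // quick sanity check of the reconstructed topology
  return model;
}
```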

dasmehdix commented 2 years ago

I have converted from Keras to a layers model successfully. However, when I try to load the layers model in JS I get the error below. Probably the TFOpLambda layer in Keras is not implemented in TFJS. (TFOpLambda is not used in the base model; it appears when I convert from the TensorFlow Saved Model to Keras HDF5.)

generic_utils.ts:243 Uncaught (in promise) Error: Unknown layer: TFOpLambda. This may be due to one of the following reasons:
1. The layer is defined in Python, in which case it needs to be ported to TensorFlow.js or your JavaScript code.
2. The custom layer is defined in JavaScript, but is not registered properly with tf.serialization.registerClass().
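
For reference, the second suggestion in the error message refers to `tf.serialization.registerClass()`. A stub like the following only shows the registration mechanism; TFOpLambda wraps an arbitrary TF op, so a real port would also have to reproduce that op:

```ts
import * as tf from '@tensorflow/tfjs';

// Registration stub: className must match the name in the serialized config.
class TFOpLambda extends tf.layers.Layer {
  static className = 'TFOpLambda';

  call(inputs: tf.Tensor | tf.Tensor[]): tf.Tensor {
    const x = Array.isArray(inputs) ? inputs[0] : inputs;
    return x; // placeholder: a real port must apply the wrapped TF op here
  }
}
tf.serialization.registerClass(TFOpLambda);
```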

The models provided:

  - TF Saved Model Link
  - TFJS Graph Model Link (converted from the Saved Model)
  - TFJS Layers Model Link (converted from the Keras model)

Reminder: the TFLite model converted from the TensorFlow Saved Model (.pb) works in JS, but the TFLite JS API is in alpha, so TFLite inference takes much more time than expected (much slower than Python). By the way, your support is really important to me; this is academic research with industrial support, and I have to solve this issue. Thanks for the help. @pyu10055

pyu10055 commented 2 years ago

@dasmehdix thanks, can you provide a CodePen example of the problem you are seeing?

dasmehdix commented 2 years ago

@pyu10055 I tried to use CodePen but I am not very familiar with it, so I have prepared several scripts to show you the problem. First, there is a Python script with the model & weights (TF Saved Model) that shows the correct output of the model. There are also two more repos that show the output of the graph model (JS converted); one works on Windows, the other on macOS & Linux. To run the Linux-based script, run yarn and then yarn start in the folder containing the repo. To run the Windows-based script, run yarn and then yarn start_windows.

  - Python Script
  - JS Script (Linux & Mac)
  - JS Script (Windows)

I also attached the output of the TF Saved Model in Python, the TFLite model in JS, and the graph model in JS. [attached image: Adsız]

mattsoulanille commented 2 years ago

Hi @dasmehdix. I've found two issues preventing the demo from working.

  1. The image is read into a texture before it's loaded, so it's all zeroes.
  2. Only the top left corner of the image is loaded into the tensor.

I've changed how the image is loaded and translated into a tensor, but I'm still getting a weird output from the model. Maybe you have a better idea of what's going on here. I tried to match what the Python version is doing, but I may have missed a step.

issue_6827.zip
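
For anyone following along, a rough sketch of the two fixes described above (the image source and input size here are assumptions):

```ts
import * as tf from '@tensorflow/tfjs';

function loadImage(src: string): Promise<HTMLImageElement> {
  return new Promise((resolve, reject) => {
    const img = new Image();
    img.onload = () => resolve(img);   // fix 1: only read pixels after the image has loaded
    img.onerror = reject;
    img.src = src;
  });
}

async function imageToTensor(src: string): Promise<tf.Tensor4D> {
  const img = await loadImage(src);
  return tf.tidy(() =>
    tf.browser.fromPixels(img)         // full image, not just a corner
      .resizeBilinear([256, 256])      // fix 2: scale, don't crop (assumed size)
      .toFloat()
      .expandDims(0) as tf.Tensor4D);
}
```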

dasmehdix commented 2 years ago

Thank you for correcting it. You probably get the output shown below, which is meaningless. That's the problem. @pyu10055 [attached image: senseless_output]

pyu10055 commented 2 years ago

@dasmehdix before we dive into debugging the model, we need your help to verify that the input in @mattsoulanille's example is the same as in your Python inference version.
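
One simple way to do that check, offered here as a suggestion rather than part of the original repro, is to print summary statistics of the JS input tensor and compare them with the same statistics computed in Python:

```ts
import * as tf from '@tensorflow/tfjs';

// Log basic statistics of the model input for comparison with Python.
function logInputStats(input: tf.Tensor): void {
  console.log('shape:', input.shape);
  console.log('min:', input.min().dataSync()[0]);
  console.log('max:', input.max().dataSync()[0]);
  console.log('mean:', input.mean().dataSync()[0]);
}
```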

dasmehdix commented 2 years ago

@pyu10055 Yes, they are clearly the same input. Also, I have already run inference with the TFLite model in JavaScript using the provided input, and it works perfectly.

pyu10055 commented 2 years ago

@dasmehdix Do you have example code for running your model with tfjs-tflite?

dasmehdix commented 2 years ago

@pyu10055 Yes, TFLite examples attached: TFLite Example for Linux, TFLite Example for Windows.
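
For comparison, the tfjs-tflite path that the reporter says works usually looks roughly like this (the model path and any input preprocessing are assumptions):

```ts
import * as tf from '@tensorflow/tfjs';
import * as tflite from '@tensorflow/tfjs-tflite';

// Load the .tflite model and run it on an already-preprocessed input tensor.
async function runTFLite(input: tf.Tensor4D): Promise<tf.Tensor> {
  const model = await tflite.loadTFLiteModel('model/model.tflite'); // assumed path
  return model.predict(input) as tf.Tensor;
}
```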

mattsoulanille commented 1 year ago

It seems like there might be something wrong with StatefulPartitionedCall/model/conv2d/BiasAdd, which is the first node in the model graph. It's outputting different things on tfjs CPU and TensorFlow native. The outputs are different by an average of 0.004 (for tensors whose values range from around -8 to 3, but also are often near zero).
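
For reference, a sketch of how an intermediate node like this can be read out from JS for comparison with native TensorFlow (the node name is the one quoted above; everything else is assumed):

```ts
import * as tf from '@tensorflow/tfjs';

// Execute the graph model up to a named node and dump its values.
async function dumpNode(model: tf.GraphModel, input: tf.Tensor): Promise<void> {
  const node = 'StatefulPartitionedCall/model/conv2d/BiasAdd';
  const out = await model.executeAsync(input, node) as tf.Tensor;
  console.log(await out.data()); // compare against the same node in TF Python
}
```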

dasmehdix commented 1 year ago

@mattsoulanille StatefulPartitionedCall/model/conv2d/BiasAdd is (probably) part of an Add layer. tf.keras.layers.Add is such a common layer in CNNs. How can the Add layer produce different outputs in tfjs and native TensorFlow? Have other people encountered the same problem?

@pyu10055 Is there any update? Have you had a chance to try the model?

mattsoulanille commented 1 year ago

@dasmehdix We've done some more investigation, and it's actually probably not the add layer that's the issue. It looks like some of the constants in the TFJS model file are incorrect. This is likely a bug with tfjs-converter. I'll update this issue when I have more details on what's causing the bug.

pyu10055 commented 1 year ago

@dasmehdix we have identified the conversion problem and submitted a fix. You can try again after the new release is published, which is expected next week.

google-ml-butler[bot] commented 1 year ago

Are you satisfied with the resolution of your issue?