tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

Working ONNX file converted to tfjs (via tf SavedModel) doesn't work #5832

Open josephrocca opened 3 years ago

josephrocca commented 3 years ago

System information

Describe the current behavior
I have an ONNX file here, and I've converted it to both tfjs and tflite (using Google Colab) as shown at the bottom of this ipynb file (here's a direct link to the Colab notebook). The relevant lines are:

!onnx-tf convert -i anime-gan-v2.onnx -o anime-gan-v2-tf
!tensorflowjs_converter --input_format tf_saved_model ./anime-gan-v2-tf ./anime-gan-v2-tfjs
tflite_model = tf.lite.TFLiteConverter.from_saved_model("./anime-gan-v2-tf").convert()
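
After conversion, the tfjs side is loaded and run with the usual graph-model API; here is a minimal sketch of that step (the model URL and input shape below are placeholders for illustration, not the exact values used in the repro repo):

// In an ES module (or with the @tensorflow/tfjs CDN script tag exposing a global `tf`).
import * as tf from '@tensorflow/tfjs';

// Load the artifacts produced by tensorflowjs_converter
// (the ./anime-gan-v2-tfjs directory contains model.json plus weight shards).
const model = await tf.loadGraphModel('./anime-gan-v2-tfjs/model.json');

// Placeholder input; the real app feeds an image tensor here.
const input = tf.randomNormal([1, 256, 256, 3]);

// executeAsync handles graphs with dynamic control-flow ops;
// it returns a tf.Tensor (or an array of Tensors for multi-output models).
const output = await model.executeAsync(input);
console.log(output);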

You can grab the code and play around with it like this:

git clone https://github.com/josephrocca/anime-gan-v2-web
cd anime-gan-v2-web
# start static file server, e.g.:
deno run --allow-net --allow-read=. https://raw.githubusercontent.com/josephrocca/denoSimpleStatic/master/main.ts --port=3001

Describe the expected behavior
Since the model works with ONNX Runtime Web, I'd have expected it to work in tfjs and tflite without errors. That said, it could be a problem with the onnx-tensorflow or tensorflowjs converters.

Standalone code to reproduce the issue

Other info / logs

tfjs cpu backend: [error screenshot]

tfjs webgl backend: [error screenshot]

tflite: [error screenshot]
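
The two tfjs results above appear to differ only in which backend is active when the model runs; a minimal sketch of switching backends to reproduce both variants (model path and input shape are placeholders):

import * as tf from '@tensorflow/tfjs';

// Switch between 'cpu' and 'webgl' to see the two errors captured above.
await tf.setBackend('cpu');
await tf.ready();

const model = await tf.loadGraphModel('./anime-gan-v2-tfjs/model.json');
const input = tf.zeros([1, 256, 256, 3]); // placeholder shape
const output = await model.executeAsync(input);
console.log(output);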

thekevinscott commented 1 year ago

I'm seeing a similar bug when converting ONNX to TensorFlow.js, though the error is different.

I'm converting a DDNM model from PyTorch -> ONNX -> TensorFlow -> TensorFlow.js. PyTorch, ONNX, and TensorFlow (SavedModel) all produce correct outputs, but TensorFlow.js fails with:

        var outputMetadata = this.binding.executeOp(name, opAttrs, this.getInputTensorIds(inputs), 1);
                                          ^

Error: Invalid TF_Status: 3
Message: Input to reshape is a tensor with 0 values, but the requested shape has 8

    at NodeJSKernelBackend.executeSingleOutput (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-node-gpu/dist/nodejs_kernel_backend.js:219:43)
    at Object.kernelFunc (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-node-gpu/dist/kernels/Reshape.js:34:27)
    at kernelFunc (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4651:32)
    at /notebooks/code/DDNM/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4711:27
    at Engine.scopedRun (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4516:23)
    at Engine.runKernelFunc (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4707:14)
    at Engine.runKernel (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4580:21)
    at reshape_ (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-converter/dist/tf-converter.node.js:12900:19)
    at Object.reshape__op [as reshape] (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-converter/dist/tf-converter.node.js:11996:29)
    at executeOp$1 (/notebooks/code/DDNM/node_modules/@tensorflow/tfjs-converter/dist/tf-converter.node.js:29499:25)

I've tested the ONNX model in both Python and JavaScript (with onnxruntime-node), and both work.
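
For reference, the Node-side check was along these lines; a minimal sketch using onnxruntime-node (the model path, input handling, and shape are placeholders, not the actual DDNM details):

const ort = require('onnxruntime-node');

async function main() {
  const session = await ort.InferenceSession.create('./model.onnx');

  // Placeholder input; the real check feeds the model's actual input tensor.
  const data = Float32Array.from({ length: 1 * 3 * 256 * 256 }, () => Math.random());
  const input = new ort.Tensor('float32', data, [1, 3, 256, 256]);

  // Feed the tensor to the model's first declared input and run inference.
  const results = await session.run({ [session.inputNames[0]]: input });
  console.log(results);
}

main();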

I wonder if this issue could be related?

(Happy to provide the model files if it's helpful.)

thekevinscott commented 1 year ago

Interestingly, if I switch the backend (in Node) to cpu, I get an error identical to josephrocca's above:

Error: GatherV2: the index value 0 is not in [0, -1]

gaikwadrahul8 commented 1 year ago

Hi, @josephrocca

Apologies for the delayed response. We're revisiting our older issues and checking whether they have been resolved, so may I know whether you are still looking for a solution or whether your issue has been resolved?

If the issue still persists with the latest version of tfjs, please let us know, along with the error log and a code snippet with steps so we can reproduce the issue on our end.

Could you please confirm whether this issue is resolved for you? Please feel free to close the issue if it is. Thank you!

josephrocca commented 1 year ago

Hi @gaikwadrahul8, I've just tested the original instructions and they still reproduce the bug with the latest tfjs/tfjs-tflite versions.

gaikwadrahul8 commented 1 year ago

Hi, @josephrocca

Apologies for the delayed response. I tried to replicate the issue on my end and I'm also getting the same error message for the tfjs model that you mentioned in the error logs section above. At the moment it seems there is some problem when converting the ONNX model to tfjs and tfjs-tflite, even though it works as expected with ONNX Runtime Web, so we'll have to dig more into this and will update you soon. Thank you for reporting this issue; I really appreciate your time and effort!

Error screenshot for tfjs:

[error screenshot]

For the tfjs-tflite model, the failure may be due to the permissions policy header:

[error screenshot]
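
If the tfjs-tflite failure is indeed header-related, one common cause is the cross-origin isolation requirement of the multithreaded WASM runtime: SharedArrayBuffer is only available when the page is served with the COOP/COEP headers. A minimal static-server sketch that sets them, assuming Node's built-in http module (the port and file layout are placeholders):

// Serves the current directory with the headers needed for cross-origin isolation.
const http = require('http');
const fs = require('fs');
const path = require('path');

// Minimal content-type map so scripts and WASM binaries load correctly.
const types = {
  '.html': 'text/html',
  '.js': 'text/javascript',
  '.json': 'application/json',
  '.wasm': 'application/wasm',
};

http.createServer((req, res) => {
  const file = path.join(__dirname, req.url === '/' ? 'index.html' : req.url);
  fs.readFile(file, (err, data) => {
    if (err) { res.writeHead(404); return res.end('not found'); }
    res.writeHead(200, {
      'Content-Type': types[path.extname(file)] || 'application/octet-stream',
      'Cross-Origin-Opener-Policy': 'same-origin',
      'Cross-Origin-Embedder-Policy': 'require-corp',
    });
    res.end(data);
  });
}).listen(3001);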

PrinandaRahmatullah commented 9 months ago

I've tried several ONNX, TensorFlow, and TensorFlow.js versions to export a .pt model -> ONNX -> TFLite -> tfjs. I'm using the code from https://github.com/Hyuto/yolov8-tfjs. When I use his model, it runs perfectly, but when I convert my custom-dataset yolov8n.pt model to tfjs, it fails. The ONNX version of my model also runs seamlessly.

Here is a screenshot of the error:

[error screenshot]