tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

LayerNormalization conversion to Graph Model: mean must have the same number of elements as the channels of x #6864

Open cosminc98 opened 1 year ago

cosminc98 commented 1 year ago

System information

Describe the current behavior

When a model containing only a LayerNormalization layer (with its initial weights, no training) is converted to a TensorFlow.js GraphModel and then run from Node.js with the tfjs-node backend, an error is encountered:

Error: Invalid TF_Status: 3
Message: When is_training=false, mean must have the same number of elements as the channels of x, got 0 and 1
    at NodeJSKernelBackend.executeMultipleOutputs (/usr/lib/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:230:43)
    at /usr/lib/node_modules/@tensorflow/tfjs-node/dist/kernels/FusedBatchNorm.js:62:28
    at /usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4429:22
    at Engine.scopedRun (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4439:23)
    at Engine.tidy (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4428:21)
    at Object.tidy (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:10481:19)
    at Object.kernelFunc (/usr/lib/node_modules/@tensorflow/tfjs-node/dist/kernels/FusedBatchNorm.js:29:23)
    at kernelFunc (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4574:32)
    at /usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4634:27
    at Engine.scopedRun (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4439:23)
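
One possible reading of the "got 0 and 1" part of the message, offered as a sketch rather than a confirmed diagnosis: Keras' LayerNormalization can take a fused path that reshapes the input to `[1, pre_dim, in_dim, 1]` in NCHW layout and calls a fused batch norm op in training mode with empty mean and variance tensors. If the converted graph is then run with `is_training=false`, the kernel would see 1 channel and a mean with 0 elements. The helper name `fused_layer_norm_shapes` is hypothetical, and the reshape rule is an assumption about the Keras implementation, not something stated in this thread.

```python
from math import prod


def fused_layer_norm_shapes(input_shape, axis=-1):
    """Return the tensor shapes assumed for the fused LayerNormalization path.

    Assumption: the input is reshaped to [1, pre_dim, in_dim, 1] (NCHW) and the
    mean/variance inputs are empty because statistics are computed on the fly.
    """
    ndims = len(input_shape)
    axis = axis % ndims                  # normalise a negative axis
    pre_dim = prod(input_shape[:axis])   # dimensions that are kept separate
    in_dim = prod(input_shape[axis:])    # dimensions that are normalised together
    return {
        "x": [1, pre_dim, in_dim, 1],    # NCHW, so the channel count is pre_dim
        "scale": [pre_dim],
        "offset": [pre_dim],
        "mean": [0],                     # empty tensor
        "variance": [0],                 # empty tensor
    }


# Shape used in this report: a batch of 1 with 5 features.
shapes = fused_layer_norm_shapes((1, 5))
channels = shapes["x"][1]
mean_elements = prod(shapes["mean"])
print(f"channels of x: {channels}, elements in mean: {mean_elements}")
```

Under these assumptions the numbers line up with the message: an inference-mode batch norm kernel expects one mean value per channel, but receives none.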

Describe the expected behavior

The converted GraphModel outputs the correct values (when compared to the Python model).
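
For comparing outputs once inference runs, a minimal reference computation may help. The sketch below normalises one row over its last axis in plain Python, assuming the layer's defaults (gamma initialised to ones, beta to zeros, epsilon of 1e-3). The names `layer_norm` and `all_close` and the sample values are hypothetical and are not part of the original report.

```python
import math


def layer_norm(row, epsilon=1e-3, gamma=1.0, beta=0.0):
    """Normalise one row over its last axis, as an untrained layer would."""
    mean = sum(row) / len(row)
    variance = sum((v - mean) ** 2 for v in row) / len(row)  # population variance
    inv_std = 1.0 / math.sqrt(variance + epsilon)
    return [gamma * (v - mean) * inv_std + beta for v in row]


def all_close(a, b, tol=1e-5):
    """Element-wise comparison with an absolute tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


# Hypothetical input row with 5 features, matching input_shape=(5,).
sample = [0.1, 0.4, 0.2, 0.9, 0.5]
expected = layer_norm(sample)
print(expected)
```

Feeding the same row to the converted model and checking the result with `all_close(expected, actual)` would show whether the two agree within tolerance.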

Standalone code to reproduce the issue

Saved model -> Graph model conversion:

import tensorflow as tf
from tensorflowjs.converters.tf_saved_model_conversion_v2 import convert_tf_saved_model

def make_model(input_shape):
    inputs = tf.keras.Input(shape=input_shape, name='inputs')
    norm_layer = tf.keras.layers.LayerNormalization()
    outputs = norm_layer(inputs)
    return tf.keras.Model([inputs], [outputs])

if __name__ == '__main__':
    model = make_model(
        input_shape=(5,),
    )
    tf.saved_model.save(model, './saved_model')
    convert_tf_saved_model(
        saved_model_dir='./saved_model',
        output_dir='./graph_model',
        control_flow_v2=False,
    )
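
To check which ops the converter actually emitted, the generated `model.json` can be inspected. This is a sketch that assumes the usual GraphModel layout, where nodes are listed under `modelTopology.node` and each carries an `op` field. The helper names `count_ops` and `load_and_count` are hypothetical, and the `example` dictionary is a made-up stand-in, not the real converter output.

```python
import json
from collections import Counter


def count_ops(model_json):
    """Count op types in a parsed GraphModel model.json (assumed layout)."""
    nodes = model_json.get("modelTopology", {}).get("node", [])
    return Counter(node.get("op", "") for node in nodes)


def load_and_count(path):
    """Read model.json from disk and count its op types."""
    with open(path, encoding="utf-8") as handle:
        return count_ops(json.load(handle))


# Hypothetical stand-in for ./graph_model/model.json, for illustration only.
example = {
    "modelTopology": {
        "node": [
            {"name": "inputs", "op": "Placeholder"},
            {"name": "layer_normalization/FusedBatchNormV3", "op": "FusedBatchNormV3"},
        ]
    }
}
ops = count_ops(example)
print([op for op in ops if op.startswith("FusedBatchNorm")])
```

On a real conversion, `load_and_count('./graph_model/model.json')` would show whether a fused batch norm op is present in the exported graph.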

Graph model inference from nodejs:

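const tf = require('@tensorflow/tfjs-node');

const main = async () => {
    // Load the converted GraphModel from the local file system.
    const model = await tf.loadGraphModel('file://./graph_model/model.json');
    // One random sample matching the model's input shape (5,).
    const inputs = tf.randomUniform([1, 5]);
    // In this report, the error surfaces during this call.
    const outputs = model.execute(inputs);
    console.log(outputs.arraySync());
};

main();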

cosminc98 commented 1 year ago

Please notify me if more information is required to reproduce the error.

pyu10055 commented 1 year ago

@cosminc98 Thank you for reporting this. Can you share the original Python model and the converted tfjs model? That will help the team expedite a fix. Thanks.

cosminc98 commented 1 year ago

@pyu10055 Sorry for the delay. There isn't much I can tell you about the original python model. I'm trying to convert the decoder half of Conformer from TensorflowASR. At the moment I was testing whether each component can be converted to tfjs graph model separately (tf.while_loops, other layers) and then I would have attempted to convert the greedy decoder as a whole. I cannot provide the tfjs model since this is the problem itself. I can however point you to where to download a pretrained model.

gaikwadrahul8 commented 1 year ago

Hi, @cosminc98

Apologies for the delayed response. This looks like a package dependency issue, and I found a somewhat similar issue on Stack Overflow. If possible, could you please try the latest version of @tensorflow/tfjs-node? If the issue still persists, please let us know and share your package.json dependency information so that we can reproduce the issue and expedite troubleshooting. Thank you!

google-ml-butler[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you.

cosminc98 commented 1 year ago

Reproduced the error with the latest version from npm:

Have I written custom code: Yes
OS Platform and Distribution: AlmaLinux release 8.6 (Sky Tiger)
TensorFlow.js installed from: npm
TensorFlow.js version (use command below): 4.2.0
TensorFlow.js Converter Version: 4.3.0

/usr/lib/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:232
        var outputMetadata = this.binding.executeOp(name, opAttrs, this.getInputTensorIds(inputs), numOutputs);
                                          ^

Error: Invalid TF_Status: 3
Message: When is_training=false, mean must have the same number of elements as the channels of x, got 0 and 1
    at NodeJSKernelBackend.executeMultipleOutputs (/usr/lib/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:232:43)
    at /usr/lib/node_modules/@tensorflow/tfjs-node/dist/kernels/FusedBatchNorm.js:63:28
    at /usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4506:22
    at Engine.scopedRun (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4516:23)
    at Engine.tidy (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4505:21)
    at tidy (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:8053:19)
    at Object.kernelFunc (/usr/lib/node_modules/@tensorflow/tfjs-node/dist/kernels/FusedBatchNorm.js:30:32)
    at kernelFunc (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4651:32)
    at /usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4711:27
    at Engine.scopedRun (/usr/lib/node_modules/@tensorflow/tfjs-node/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4516:23)

google-ml-butler[bot] commented 1 year ago

Closing as stale. Please @mention us if this needs more attention.

gaikwadrahul8 commented 1 year ago

Hi, @cosminc98

Apologies for the inconvenience. Our google-ml-butler bot closed your issue; it seems our team recently changed how google-ml-butler works, and the team concerned will fix this unexpected behaviour soon. I have re-opened your issue. Thank you!

gaikwadrahul8 commented 1 year ago

Hi, @cosminc98

Apologies for the delayed response. I tried to replicate the issue with the code you provided in the issue template. I first ran the Saved model -> Graph model conversion, then passed the absolute path of the downloaded Graph Model to the Graph model inference script in Node.js, and I encountered a different error: TypeError: model.execute is not a function. I tried both @tensorflow/tfjs-node@4.4.0 (the latest version) and @tensorflow/tfjs-node@4.2.0. For your reference I have added a screenshot below. If I have missed something here, please let me know. Thank you!

CC: @pyu10055

[Screenshot of the error output]

gaikwadrahul8 commented 1 year ago

Hi, @cosminc98

Apologies for the delayed response. I tried to replicate the issue on my end and I am now getting the same error message that you reported, so we will have to dig further into this issue and will update you soon.

Thank you for bringing this issue to our attention. I really appreciate your time and effort.

CC: @pyu10055

Please see the error log output below:

gaikwadrahul-macbookpro:test-6864 gaikwadrahul$ node index.js
/Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:229
        var outputMetadata = this.binding.executeOp(name, opAttrs, this.getInputTensorIds(inputs), numOutputs);
                                          ^

Error: Invalid TF_Status: 3
Message: When is_training=false, mean must have the same number of elements as the channels of x, got 0 and 1
    at NodeJSKernelBackend.executeMultipleOutputs (/Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-node/dist/nodejs_kernel_backend.js:229:43)
    at /Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-node/dist/kernels/FusedBatchNorm.js:63:28
    at /Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4559:22
    at Engine.scopedRun (/Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4569:23)
    at Engine.tidy (/Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4558:21)
    at tidy (/Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:8304:19)
    at Object.kernelFunc (/Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-node/dist/kernels/FusedBatchNorm.js:30:32)
    at kernelFunc (/Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4704:32)
    at /Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4764:27
    at Engine.scopedRun (/Users/gaikwadrahul/Desktop/TFJS/test-6864/node_modules/@tensorflow/tfjs-core/dist/tf-core.node.js:4569:23)

Node.js v18.17.0
gaikwadrahul-macbookpro:test-6864 gaikwadrahul$