tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.
https://js.tensorflow.org
Apache License 2.0

tfjs_converter issues with TextVectorization Layer #7866

Open jlouyang opened 1 year ago

jlouyang commented 1 year ago

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub.

System information

Describe the current behavior

I'm trying to convert a TensorFlow model to TensorFlow.js, but I believe the TextVectorization layer isn't supported, so I keep getting an error.

Describe the expected behavior

Standalone code to reproduce the issue

Provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to Colab/CodePen/any notebook.

import tensorflow as tf

text_dataset = tf.data.Dataset.from_tensor_slices(["foo", "bar", "baz"])

text_dataset = ['abcdefghijklmnopqrstuvwxyz']
max_features = 5000  # Maximum vocab size.
max_len = 15  # Sequence length to pad the outputs to.

# Create the layer.
vectorize_layer = tf.keras.layers.TextVectorization(
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=max_len,
    split='character',
    encoding='utf-8')

# Now that the vocab layer has been created, call adapt on the
# text-only dataset to create the vocabulary. You don't have to batch,
# but for large datasets this means we're not keeping spare copies of
# the dataset.
vectorize_layer.adapt(text_dataset)

# Create the model that uses the vectorize text layer
model = tf.keras.models.Sequential()

# Start by creating an explicit input layer. It needs to have a shape of
# (1,) (because we need to guarantee that there is exactly one string
# input per batch), and the dtype needs to be 'string'.
model.add(tf.keras.Input(shape=(1,), dtype=tf.string))

# The first layer in our model is the vectorization layer. After this
# layer, we have a tensor of shape (batch_size, max_len) containing
# vocab indices.
model.add(vectorize_layer)

# Now, the model can map strings to integers, and you can add an
# embedding layer to map these integers to learned embeddings.
input_data = [["foo qux bar"], ["qux baz"]]
print(model.predict(input_data))

model.save('test_saved_model')
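For readers unfamiliar with the layer, here is a rough pure-Python sketch of the kind of mapping a character-level vectorizer in 'int' mode performs (string in, fixed-length list of vocab indices out). It is an illustration only, not the Keras implementation: `build_char_vocab` and `vectorize` are hypothetical helper names, standardization is reduced to lowercasing, and the order in which indices are assigned to characters is an assumption, so the real layer may produce different index values. The reserved indices 0 (padding) and 1 (out-of-vocabulary) follow the documented Keras convention.

```python
from collections import Counter

PAD_INDEX = 0  # reserved for padding in 'int' output mode
OOV_INDEX = 1  # reserved for out-of-vocabulary tokens


def build_char_vocab(texts, max_tokens):
    """Map each character to an index, most frequent first (ties: alphabetical)."""
    counts = Counter(ch for text in texts for ch in text.lower())
    # Two slots are taken by the padding and OOV entries.
    chars = sorted(counts, key=lambda ch: (-counts[ch], ch))[: max_tokens - 2]
    return {ch: i + 2 for i, ch in enumerate(chars)}


def vectorize(text, vocab, max_len):
    """Convert a string to exactly max_len indices (truncate, then pad)."""
    ids = [vocab.get(ch, OOV_INDEX) for ch in text.lower()][:max_len]
    return ids + [PAD_INDEX] * (max_len - len(ids))


vocab = build_char_vocab(['abcdefghijklmnopqrstuvwxyz'], max_tokens=5000)
print(vectorize("foo qux bar", vocab, max_len=15))
# This sketch prints: [7, 16, 16, 1, 18, 22, 25, 1, 3, 2, 19, 0, 0, 0, 0]
```

The spaces map to the OOV index here because the adapt data contains only the letters a to z.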

Once the model is saved, if I try to use tensorflowjs_converter to convert it to JS, I get an assertion error stating: Identity is not in graph.
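The exact command line is not given in this report. As a sketch of what the conversion step might look like when driven from Python, the snippet below builds a `tensorflowjs_converter` call for a SavedModel. The `--input_format=tf_saved_model` and `--output_format=tfjs_graph_model` flags are assumed here, `web_model` is a hypothetical output directory, and `build_converter_command` and `run_conversion` are illustrative helper names.

```python
import subprocess


def build_converter_command(saved_model_dir, output_dir):
    """Return the argv list for converting a SavedModel to a TF.js graph model."""
    return [
        "tensorflowjs_converter",
        "--input_format=tf_saved_model",      # assumed: input is a SavedModel dir
        "--output_format=tfjs_graph_model",   # assumed: target is a graph model
        saved_model_dir,
        output_dir,
    ]


def run_conversion(saved_model_dir, output_dir):
    """Run the converter; return (exit_code, stderr) so the traceback can be inspected."""
    result = subprocess.run(
        build_converter_command(saved_model_dir, output_dir),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr


# 'web_model' is a hypothetical output directory name.
command = build_converter_command("test_saved_model", "web_model")
print(" ".join(command))
```

Calling `run_conversion("test_saved_model", "web_model")` requires the tensorflowjs pip package to be installed; with the model above, a non-zero exit code and the AssertionError in stderr would be the expected outcome.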

Other info / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/bin/tensorflowjs_converter", line 8, in <module>
    sys.exit(pip_main())
             ^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 958, in pip_main
    main([' '.join(sys.argv[1:])])
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 962, in main
    convert(argv[0].split(' '))
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 948, in convert
    _dispatch_converter(input_format, output_format, args, quantization_dtype_map,
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflowjs/converters/converter.py", line 654, in _dispatch_converter
    tf_saved_model_conversion_v2.convert_tf_saved_model(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflowjs/converters/tf_saved_model_conversion_v2.py", line 979, in convert_tf_saved_model
    _convert_tf_saved_model(output_dir, saved_model_dir=saved_model_dir,
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflowjs/converters/tf_saved_model_conversion_v2.py", line 821, in _convert_tf_saved_model
    frozen_initializer_graph) = _freeze_saved_model_v1(saved_model_dir,
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflowjs/converters/tf_saved_model_conversion_v2.py", line 436, in _freeze_saved_model_v1
    frozen_graph_def = tf.compat.v1.graph_util.convert_variables_to_constants(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow/python/util/deprecation.py", line 371, in new_func
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow/python/framework/convert_to_constants.py", line 1330, in convert_variables_to_constants
    ret = convert_variables_to_constants_from_session_graph(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow/python/framework/convert_to_constants.py", line 1286, in convert_variables_to_constants_from_session_graph
    converter_data=_SessionConverterData(
                   ^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow/python/framework/convert_to_constants.py", line 946, in __init__
    graph_def = graph_util.extract_sub_graph(graph_def, output_node_names)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow/python/util/deprecation.py", line 371, in new_func
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow/python/framework/graph_util_impl.py", line 245, in extract_sub_graph
    _assert_nodes_are_present(name_to_node, dest_nodes)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/tensorflow/python/framework/graph_util_impl.py", line 198, in _assert_nodes_are_present
    assert d in name_to_node, "%s is not in graph" % d
           ^^^^^^^^^^^^^^^^^
AssertionError: Identity is not in graph

gaikwadrahul8 commented 1 year ago

Hi, @jlouyang

Thank you for bringing this issue to our attention. I tried to replicate it on my end with several tensorflowjs versions (4.7.*, 4.8.* and 4.9.*) and I'm getting the same error message, AssertionError: Identity is not in graph. Please refer to this gist-file.

At the moment it seems that either some ops used by the TextVectorization layer, or the layer itself, are not supported by TensorFlow.js when converting the model with tensorflowjs_converter. If that is the case, this issue will be considered a feature request.

You can refer to our official documentation of supported TensorFlow ops for tfjs_converter. If I have missed something here, please let me know. Thank you!

jlouyang commented 1 year ago

Yes, I think you got everything about the issue. Would I need to submit a new issue as a feature request this time? I'm not sure which ops in TextVectorization are not supported, though. Thanks for responding quickly!
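One possible way to narrow down which ops are involved is to list the op types in the saved graph and compare them with the converter's supported-ops list. The sketch below is an assumption-laden starting point, not an official tool: `list_saved_model_ops` and `unsupported_ops` are hypothetical helper names, the `serving_default` signature name is assumed, and the supported set has to be filled in by hand from the official supported-ops documentation. The sample sets at the bottom are placeholders, not real op lists.

```python
def list_saved_model_ops(saved_model_dir, signature="serving_default"):
    """Collect op type names from a SavedModel signature, including nested functions."""
    import tensorflow as tf  # imported lazily so the rest runs without TensorFlow

    model = tf.saved_model.load(saved_model_dir)
    graph_def = model.signatures[signature].graph.as_graph_def()
    ops = {node.op for node in graph_def.node}
    # Ops inside function bodies live in the graph's function library.
    for func in graph_def.library.function:
        ops.update(node.op for node in func.node_def)
    return ops


def unsupported_ops(graph_ops, supported_ops):
    """Return the op names present in the graph but missing from the supported set."""
    return sorted(set(graph_ops) - set(supported_ops))


# Placeholder data for illustration only; 'ExampleStringOp' is a made-up name.
sample_graph_ops = {"Placeholder", "Identity", "ExampleStringOp"}
sample_supported = {"Placeholder", "Identity"}
print(unsupported_ops(sample_graph_ops, sample_supported))
# prints: ['ExampleStringOp']
```

With TensorFlow installed, `unsupported_ops(list_saved_model_ops('test_saved_model'), supported)` would give a candidate list to include in a feature request, where `supported` is the set copied from the documentation.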