Closed: hamletbatista closed this issue 4 years ago
This is not functionality we currently support, so I will mark it as an enhancement. At the same time, the log suggests the problem lies in the way we save the SavedModel rather than in anything tfjs-specific, so we will investigate the issue; fixing that could potentially solve the problem for tfjs too.
@w4nderlust thanks
Hey @hamletbatista, would you mind providing the model_definition.yaml which is referenced in the codelab? Thanks
Actually never mind, I see it in the template.
@ydudin3 glad to know. Please let me know if you are able to get this to work.
@hamletbatista it seems from the logs that output tensors get appended to the input_tensors list
For example printing input_tensors yields:
{'Category0': <tf.Tensor 'Category0/Category0_placeholder:0' shape=(?,) dtype=int64>, 'Category2': <tf.Tensor 'Category2/Category2_placeholder:0' shape=(?,) dtype=int64>, 'Questions': <tf.Tensor 'Questions/Questions_placeholder:0' shape=(?, ?) dtype=int32>}
It looks like the get_tensors function contains these lines:
for output_feature in model_definition['output_features']:
    input_tensors[output_feature['name']] = getattr(model, output_feature['name'])
Is this intentional? I wonder if that's what's causing model load failure.
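If that is indeed the cause, a caller-side workaround might be to filter the output names back out before building the input signature. A minimal sketch in plain Python (the dict contents mirror the tensors printed above, but the structures here are made up for illustration, not Ludwig's actual objects):

```python
# Hypothetical model definition mirroring the fields get_tensors reads.
model_definition = {
    'input_features': [{'name': 'Questions'}],
    'output_features': [{'name': 'Category0'}, {'name': 'Category2'}],
}

# Simulated result of get_tensors: the output feature names were
# appended to the input_tensors dict alongside the real inputs.
input_tensors = {
    'Questions': '<Questions placeholder>',
    'Category0': '<Category0 tensor>',
    'Category2': '<Category2 tensor>',
}

# Keep only the tensors that correspond to declared input features.
input_names = {f['name'] for f in model_definition['input_features']}
true_inputs = {k: v for k, v in input_tensors.items() if k in input_names}

print(true_inputs)  # {'Questions': '<Questions placeholder>'}
```

This way the SavedModel signature would only list genuine placeholders as inputs, which is what the loader expects.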
That is in the cell after "try it again". Not sure why you are running it that way instead of using model.save_savedmodel().
I tried that first. See cells above.
I haven't touched this code in a while, but I added links in the comments to where I was finding suggestions to fix the issue.
It appears the suggestion came from this comment: https://github.com/uber/ludwig/issues/329#issuecomment-508777581
I took a different route and solved this using HuggingFace's library, but yours would make the tutorial much simpler to follow.
Yes, but it is commented out, and I don't see errors there. What was wrong with it?
We are adding an import of HuggingFace's transformers library in the next version of Ludwig. I wonder how you are serving it, as even the smaller distilled model is really expensive to use at inference time.
There was no error or stack trace. The issue was that the generated file seemed corrupted or incomplete when I tried to load it.
I will give it another try over the weekend.
Yes. I have that problem, and this is mostly a learning exercise to teach marketers, not for production use. Next in my queue is to investigate this research: https://cloudblogs.microsoft.com/opensource/2020/01/21/microsoft-onnx-open-source-optimizations-transformer-inference-gpu-cpu/
It seems to solve that issue.
Got it, but you can understand that I need to see the error you were getting about the corrupted file, otherwise it's difficult to figure out what the problem is. Ideally you could provide a minimal, self-contained, reproducible zip containing either data and a Python script (data can be generated with the data/dataset_synthesizer.py script if you can't share it), or data + yaml file + the command to run it.
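For anyone putting such a zip together, a minimal model definition might look something like this (feature names and types are illustrative guesses based on the tensors printed earlier in this thread, not the actual config from the codelab):

```yaml
input_features:
  - name: Questions
    type: text

output_features:
  - name: Category0
    type: category
```

paired with a command along the lines of `ludwig experiment --data_csv data.csv --model_definition_file model_definition.yaml` (flag names as in the Ludwig 0.2.x CLI; file paths here are placeholders).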
It was tested on a 3-layer BERT; the latency is much higher on the full model. Still, it's a step forward ;)
Anyway, I thought your use case was fast inference at deployment time, but if your goal is just demoing and you don't care about a super scalable inference pipeline, then you can train a model with Ludwig and then serve it with
ludwig serve --model_path path/to/trained/model
and it will launch a REST API server you can query easily. More info in the User Guide.
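Once the server is up, it can be queried over HTTP. A sketch of building such a request (the host, port, `/predict` endpoint, and JSON payload format are assumptions for illustration; check the User Guide for the exact request format `ludwig serve` expects, and the `Questions` field name comes from this thread):

```python
import json
from urllib import request

# Example record to classify; the field name must match one of the
# model's input features.
record = {'Questions': 'how do I export a model to tensorflow.js?'}
payload = json.dumps(record).encode('utf-8')

# Assumed host/port/endpoint for a locally running `ludwig serve`.
req = request.Request(
    'http://localhost:8000/predict',
    data=payload,
    headers={'Content-Type': 'application/json'},
)

# Uncomment once `ludwig serve` is running:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```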
@w4nderlust just FYI: save_savedmodel works incorrectly now. That's because of wrong placeholder names: https://github.com/uber/ludwig/issues/329#issuecomment-548854347
Yes. I will have time over the weekend :)
Interesting. I will see if I can get decent accuracy. Thanks for the insights.
I need to run the model in JS to embed it in Google Sheets and Excel. Fetching from a serving URL would be my fallback option.
Thanks
Thanks, yes we are working on it. @ydudin3
The merged PR should have solved the issue. There's also an integration test for SavedModel now that shows how to load and save SavedModels and what kind of preprocessing and postprocessing you need to do in order to map data to tensors and prediction tensors to data: https://github.com/uber/ludwig/blob/master/tests/integration_tests/test_savedmodel.py. Let us know if you have further problems.
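The kind of mapping that integration test demonstrates can be sketched in plain Python: category strings are converted to integer ids using the vocabulary stored in the training metadata before being fed to the model, and predicted ids are converted back afterwards. The `str2idx`/`idx2str` names below mirror the fields Ludwig stores for category features, but the vocabulary data here is made up:

```python
# Made-up training metadata for a category output feature, mirroring
# the str2idx / idx2str structure Ludwig keeps for category features.
metadata = {
    'Category0': {
        'str2idx': {'billing': 0, 'shipping': 1, 'returns': 2},
        'idx2str': ['billing', 'shipping', 'returns'],
    }
}

def preprocess(labels, feature, meta):
    """Map raw category strings to the integer ids the model expects."""
    return [meta[feature]['str2idx'][label] for label in labels]

def postprocess(ids, feature, meta):
    """Map predicted integer ids back to category strings."""
    return [meta[feature]['idx2str'][i] for i in ids]

ids = preprocess(['shipping', 'returns'], 'Category0', metadata)
print(ids)                                      # [1, 2]
print(postprocess(ids, 'Category0', metadata))  # ['shipping', 'returns']
```

The same round trip applies on the tfjs side: whatever loads the converted model has to ship this vocabulary with it.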
Thanks. I will check this out. This was sorely needed!
Describe the bug
I'm trying to export a trained model so I can run inference using TensorFlow.js, but the exported .pb doesn't work with the TensorFlow.js converter tool. I get this error:
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1781: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version. Instructions for updating: If using Keras pass *_constraint arguments to layers.
Traceback (most recent call last):
File "/usr/local/bin/tensorflowjs_converter", line 8, in <module>
sys.exit(pip_main())
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/converter.py", line 638, in pip_main
main([' '.join(sys.argv[1:])])
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/converter.py", line 642, in main
convert(argv[0].split(' '))
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/converter.py", line 591, in convert
strip_debug_ops=args.strip_debug_ops)
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/tf_saved_model_conversion_v2.py", line 419, in convert_tf_saved_model
model = load(saved_model_dir, saved_model_tags)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load.py", line 519, in load
return load_internal(export_dir, tags)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load.py", line 550, in load_internal
root = load_v1_in_v2.load(export_dir, tags)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load_v1_in_v2.py", line 239, in load
return loader.load(tags=tags)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load_v1_in_v2.py", line 222, in load
signature_functions = self._extract_signatures(wrapped, meta_graph_def)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/saved_model/load_v1_in_v2.py", line 138, in _extract_signatures
signature_fn = wrapped.prune(feeds=feeds, fetches=fetches)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/wrap_function.py", line 320, in prune
sources=flat_feeds + self.graph.internal_captures)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/lift_to_graph.py", line 260, in lift_to_graph
add_sources=add_sources))
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/op_selector.py", line 413, in map_subgraph
% (repr(init_tensor), repr(op), _path_from(op, init_tensor, sources)))
tensorflow.python.ops.op_selector.UnliftableError: A SavedModel signature needs an input for each placeholder the signature's outputs use. An output for signature 'predict' depends on a placeholder which is not an input (i.e. the placeholder is not fed a value).
Unable to lift tensor <tf.Tensor 'Category0/predictions_Category0/predictions_Category0:0' shape=(?,) dtype=int64> because it depends transitively on placeholder <tf.Operation 'is_training' type=Placeholder> via at least one path, e.g.: Category0/predictions_Category0/predictions_Category0 (ArgMax) <- Category0/predictions_Category0/add (Add) <- Category0/predictions_Category0/MatMul (MatMul) <- concat_combiner/concat_combiner (Identity) <- concat_combiner/concat (Identity) <- Questions/Questions (Identity) <- Questions/dropout/cond/Merge (Merge) <- Questions/dropout/cond/dropout/mul_1 (Mul) <- Questions/dropout/cond/dropout/Cast (Cast) <- Questions/dropout/cond/dropout/GreaterEqual (GreaterEqual) <- Questions/dropout/cond/dropout/rate (Const) <- Questions/dropout/cond/switch_t (Identity) <- Questions/dropout/cond/Switch (Switch) <- is_training (Placeholder)
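The check behind this error is mechanical: the loader walks backwards from each signature output, and if it reaches a placeholder that is not listed among the signature's inputs (here `is_training`, introduced by the dropout `cond`), it refuses to lift the graph. That traversal can be sketched as plain graph reachability (a toy dependency graph condensed from the path in the error above, not TensorFlow's actual implementation):

```python
# Toy dependency graph: op name -> names of the ops it consumes.
graph = {
    'predictions_Category0': ['concat_combiner'],
    'concat_combiner': ['Questions'],
    'Questions': ['dropout_cond'],
    'dropout_cond': ['Questions_placeholder', 'is_training'],
    'Questions_placeholder': [],
    'is_training': [],
}
placeholders = {'Questions_placeholder', 'is_training'}

def unfed_placeholders(output, fed_inputs):
    """Return placeholders the output depends on that are not signature inputs."""
    seen, stack, unfed = set(), [output], set()
    while stack:
        op = stack.pop()
        if op in seen:
            continue
        seen.add(op)
        if op in placeholders and op not in fed_inputs:
            unfed.add(op)
        stack.extend(graph[op])
    return unfed

# The 'predict' signature only feeds the data placeholder,
# so is_training is reachable but unfed:
print(unfed_placeholders('predictions_Category0', {'Questions_placeholder'}))
# {'is_training'}
```

This is why exports that bake `is_training` to a constant (so the dropout branch disappears from the graph) avoid the error.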
To Reproduce
You can follow my steps in this Colab notebook: https://colab.research.google.com/drive/1c1REIK3G5FzwuCxmO8R0xA_0ODDlC57z#scrollTo=vNudSgJAZ7JB
Everything is in the Colab notebook.
Expected behavior
I am hoping to load the trained model in TensorFlow.js.
Additional context
I tried the ideas in this comment: https://github.com/uber/ludwig/issues/329#issuecomment-548854347