Please check whether libtensorflow_framework.so is in the directory bazel-bin/tensorflow.
Thanks, I've checked it and added the following soft links:
ln -s libtensorflow_framework.so.2.0.2 libtensorflow_framework.so
ln -s libtensorflow_framework.so.2.0.2 libtensorflow_framework.so.2
Then I can build the asr, but running it raises this error:
Start argmax decoding ...
2020-09-19 22:32:57.082097: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] model_pruner failed: Internal: Could not find node with name 'transformer_encoder/transformer_encoder_layer_11/layer_normalization_23/batchnorm/add_1'
2020-09-19 22:33:24.347057: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:502] model_pruner failed: Internal: Could not find node with name 'transformer_encoder/transformer_encoder_layer_11/layer_normalization_23/batchnorm/add_1'
Segmentation fault (core dumped)
Did the script output any error messages when you ran `python athena/deploy_main.py *.json`?
@neneluo Thanks for replying.
There are two json files under the folder; which one should I use?
examples/asr/timit/configs/mtl_transformer_sp_101.json
examples/asr/timit/configs/mtl_transformer_sp.json
I tried both; both produce similar warnings like the ones below:
```
None
WARNING:tensorflow:From athena/deploy_main.py:61: The name tf.keras.backend.get_session is deprecated. Please use tf.compat.v1.keras.backend.get_session instead.
WARNING:tensorflow:From athena/deploy_main.py:61: The name tf.keras.backend.get_session is deprecated. Please use tf.compat.v1.keras.backend.get_session instead.
INFO:absl:output_names: ['transformer_encoder/transformer_encoder_layer_8/layer_normalization_17/batchnorm/add_1', 'strided_slice_1']
WARNING:tensorflow:From athena/deploy_main.py:45: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From athena/deploy_main.py:45: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From /home/ming/venv_athena/lib64/python3.6/site-packages/tensorflow_core/python/framework/graph_util_impl.py:275: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
WARNING:tensorflow:From /home/ming/venv_athena/lib64/python3.6/site-packages/tensorflow_core/python/framework/graph_util_impl.py:275: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
WARNING:tensorflow:From athena/deploy_main.py:46: remove_training_nodes (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.remove_training_nodes
WARNING:tensorflow:From athena/deploy_main.py:46: remove_training_nodes (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.remove_training_nodes
INFO:absl:output_names: ['strided_slice_3']
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
..........
..........
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'v' for (root).model.model.transformer.decoder.layers.2.ffn.layer_with_weights-0.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'v' for (root).model.model.transformer.decoder.layers.2.ffn.layer_with_weights-0.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'v' for (root).model.model.transformer.decoder.layers.2.ffn.layer_with_weights-0.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'v' for (root).model.model.transformer.decoder.layers.2.ffn.layer_with_weights-1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'v' for (root).model.model.transformer.decoder.layers.2.ffn.layer_with_weights-1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'v' for (root).model.model.transformer.decoder.layers.2.ffn.layer_with_weights-1.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'v' for (root).model.model.transformer.decoder.layers.2.ffn.layer_with_weights-1.bias
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/alpha/guide/checkpoints#loading_mechanics for details.
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/alpha/guide/checkpoints#loading_mechanics for details.
```
Use the one which you used for model training.
I guess the bug is caused by a mismatch between the output names specified in tensor_utils.cpp and those in the pb. Try changing the first 'output_name' in the function createOutputNameStructureEncoder of deploy/src/tensor_utils.cpp to transformer_encoder/transformer_encoder_layer_8/layer_normalization_17/batchnorm/add_1 and rebuilding the asr.
@neneluo Thanks, it is getting better, but there is one more warning:
Use tf.compat.v1.graph_util.remove_training_nodes
INFO:absl:output_names: ['strided_slice_3']
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
Following your suggestion, I changed the second output_name to "strided_slice_3", but it still shows this warning.
These warnings won't affect the result. I think the main error is the mismatch of output names, so you need to change them manually in deploy/src/tensor_utils.cpp. That is, update lines 98-99 in the file from
output_names.emplace_back(
    "transformer_encoder/transformer_encoder_layer_11/layer_normalization_23/batchnorm/add_1");
to
output_names.emplace_back(
    "transformer_encoder/transformer_encoder_layer_8/layer_normalization_17/batchnorm/add_1");
The other lines remain unchanged.
As you can see, the script athena/deploy_main.py outputs the following logs:
INFO:absl:output_names: ['transformer_encoder/transformer_encoder_layer_8/layer_normalization_17/batchnorm/add_1', 'strided_slice_1']
INFO:absl:output_names: ['strided_slice_3']
The first line gives the output_names of the encoder and the second line gives the output_names of the decoder. For now, you always need to change the output_names specified in deploy/src/tensor_utils.cpp manually according to these logs whenever you change the model structure.
I will update the code to make it more flexible when I have free time. Sorry for the inconvenience.
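If it helps, here is a rough Python sketch (not part of Athena; "encoder.pb" is a placeholder for whichever graph file deploy_main.py exported on your machine) that lists candidate node names in a frozen graph, so you can compare them against the output_names hard-coded in deploy/src/tensor_utils.cpp:

```python
# Hedged sketch: print node names from a frozen GraphDef to check them against
# the output_names expected by deploy/src/tensor_utils.cpp.
# "encoder.pb" is a placeholder; point it at the graph file deploy_main.py wrote.
import tensorflow as tf

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("encoder.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

for node in graph_def.node:
    # Encoder output ends with .../batchnorm/add_1; decoder outputs are strided_slice nodes.
    if node.name.endswith("batchnorm/add_1") or node.name.startswith("strided_slice"):
        print(node.name)
```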
@neneluo Thanks a lot. After running ./asr, I got this output:

```
(venv_athena) [ming@localhost build]$ ./asr
Loading model ...
2020-09-23 05:25:04.233553: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2020-09-23 05:25:04.268616: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3392370000 Hz
2020-09-23 05:25:04.269067: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x110cc00 executing computations on platform Host. Devices:
2020-09-23 05:25:04.269094: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
Start argmax decoding ...
Argmax decoding results:
Total run time of samples: 1.56261 seconds.
```

Does this mean the deployment succeeded?
Something seems wrong; the decoding results are supposed to be printed to the screen. Have you prepared vocab.txt and feats.txt and put them under deploy/graph_asr/test_data?
@neneluo Thanks, I've checked; these files are not on my PC, so where do I get them? It seems vocab.txt is here:
examples/asr/timit/data/vocab
How do I get feats.txt?
@neneluo Thanks! I've seen your code. May I ask a stupid question: how do I generate feats.txt? Here is my script, say "create_feats.py":

```python
from athena.transform import AudioFeaturizer
from athena.data import FeatureNormalizer

path = "/home/ming/athena/examples/asr/timit/data/wav/DEV/FADG0-SI649.WAV"
audio_config = {"type": "Fbank", "filterbank_channel_count": 40}
cmvn_file = "examples/asr/timit/data/cmvn"
audio_featurizer = AudioFeaturizer(audio_config)
feature_normalizer = FeatureNormalizer(cmvn_file)
feat = audio_featurizer(path)
feat = feature_normalizer(feat, 'FADG0')
```

After running this script, how do I create feats.txt? Just do this?
`python create_feats.py > feats.txt`
I use numpy:
import numpy as np
np.savetxt("feats.txt", np.squeeze(feat))
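Putting the two snippets together, a minimal end-to-end sketch might look like the following (the paths, audio config, and speaker key 'FADG0' are taken from your script above and may need adjusting for your setup):

```python
# Hedged sketch: extract Fbank features, apply CMVN, and dump them to feats.txt.
import numpy as np
from athena.transform import AudioFeaturizer
from athena.data import FeatureNormalizer

path = "/home/ming/athena/examples/asr/timit/data/wav/DEV/FADG0-SI649.WAV"
audio_config = {"type": "Fbank", "filterbank_channel_count": 40}
cmvn_file = "examples/asr/timit/data/cmvn"

audio_featurizer = AudioFeaturizer(audio_config)
feature_normalizer = FeatureNormalizer(cmvn_file)

feat = audio_featurizer(path)             # compute Fbank features for the wav file
feat = feature_normalizer(feat, 'FADG0')  # apply per-speaker CMVN using the speaker key

# Drop singleton dimensions and save as plain text (this mirrors the np.savetxt call above).
np.savetxt("feats.txt", np.squeeze(feat))
```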
@neneluo Great, thanks! It seems to be working, right?
(venv_athena) [ming@localhost build]$ ./asr
Loading model ...
2020-09-25 04:23:22.372337: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2020-09-25 04:23:22.401870: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3392555000 Hz
2020-09-25 04:23:22.402542: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x21e2c00 executing computations on platform Host. Devices:
2020-09-25 04:23:22.402625: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
Start argmax decoding ...
Argmax decoding results:
silixcltsehfersfermahlaeclaxvyiynixcltiyixclperclpixsihnrixscltehcltferhherrowixclkliydxershehclpsil
Total run time of samples: 2.14208 seconds.
Yes. You can check whether the result is similar to its label, or to the result decoded with the Python code, to verify correctness.
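For a quick standalone sanity check (independent of Athena; the hyp/ref strings below are placeholders), you could compare the two strings with a simple edit distance, for example:

```python
# Hedged sketch: character-level Levenshtein distance between the argmax decode
# result and a reference string; a small distance suggests the deployment is sane.
def edit_distance(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

hyp = "..."  # placeholder: string printed by ./asr
ref = "..."  # placeholder: label or the result decoded with the Python code
print(edit_distance(hyp, ref))
```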
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue is closed. You can also re-open it if needed.
Following the instructions here: https://github.com/athena-team/athena/blob/master/deploy/README.md, at Step 4 (Compiling the C++ Codes and Running the executable file), running the make command throws this exception:
Scanning dependencies of target tensor_utils
[ 12%] Building CXX object CMakeFiles/tensor_utils.dir/src/tensor_utils.cpp.o
[ 25%] Linking CXX static library libtensor_utils.a
[ 25%] Built target tensor_utils
Scanning dependencies of target utils
[ 37%] Building CXX object CMakeFiles/utils.dir/src/utils.cpp.o
[ 50%] Linking CXX static library libutils.a
[ 50%] Built target utils
Scanning dependencies of target tts
[ 62%] Building CXX object CMakeFiles/tts.dir/src/tts.cpp.o
[ 75%] Linking CXX executable tts
/usr/bin/ld: cannot find -ltensorflow_framework
collect2: error: ld returned 1 exit status