Closed inchpunch closed 5 years ago
Are you building TF inside some container? I see that Python can't find numpy. Before you start building TF from source: 1) make sure that you have Python 3.5 or 3.6; 2) install all requirements:
$ cd OpenSeq2Seq
$ pip install -r requirements.txt
Yes, I am building inside the container, following Step 7 in the installation instructions (https://nvidia.github.io/OpenSeq2Seq/html/installation.html):
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -it --rm nvcr.io/nvidia/tensorflow:18.12-py3
Also, numpy is clearly installed, as I can use "import numpy" and "numpy.version.version". Why would it have a problem getting the numpy include path?
You don't have to rebuild TF if you are inside the Nvidia TF container.
Thanks for your suggestion. I just did "pip install -r requirements.txt", but I still see the same results when running that "bazel build ..." command.
Apart from the TensorFlow download and configure, how do I go through the rest of Step 3 to finish installing the CTC decoder with language model?
I thought the purpose of the "bazel build ..." command is to generate "//ctc_decoder_with_lm:generate_trie" at the end, since later on, in "How to download a language model for a CTC decoder (optional)", we need to run ./scripts/download_lm.sh,
whose last command uses that "generate_trie":
../ctc_decoder_with_lm/generate_trie ../open_seq2seq/test_utils/toy_speech_data/vocab.txt ./4-gram.binary ./trie_vocab.txt ./trie.binary
The Nvidia TF container already has a pre-built ctc_decoder_with_lm op, so you don't need to rebuild TF from source in this case. But you still need to download and build KenLM.
I did download and build KenLM, following ./scripts/install_kenlm.sh, but after that I did not see "generate_trie" in the "ctc_decoder_with_lm" folder. There is only generate_trie.cpp, so I guess it still needs to be built?
Do you see a link to the kenlm directory under the ctc_decoder_with_lm folder?
Yes.
I mean, I see the link to the kenlm directory under ctc_decoder_with_lm, but it looks like I still need "generate_trie" to run "./scripts/download_lm.sh" later on. Is that true?
Is the original command supposed to run outside the Nvidia TF container? If so, is there a simple way to convert this command to run inside the container?
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --copt=-O3 --config=cuda //tensorflow/tools/pip_package:build_pip_package //tensorflow:libtensorflow_cc.so //tensorflow:libtensorflow_framework.so //ctc_decoder_with_lm:libctc_decoder_with_kenlm.so //ctc_decoder_with_lm:generate_trie
First you should install KenLM: ./scripts/install_kenlm.sh. Then the download_lm script will call KenLM to generate the trie. The script should work inside the container. Please check that kenlm/build/bin has build_binary; if it is not there, please re-install KenLM.
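The build_binary check above can be scripted; a minimal sketch, assuming install_kenlm.sh's default build location relative to the OpenSeq2Seq root:

```python
import os

# Path assumed from scripts/install_kenlm.sh's default layout
# (kenlm cloned and built under the OpenSeq2Seq checkout).
build_binary = os.path.join("kenlm", "build", "bin", "build_binary")
status = "found" if os.path.isfile(build_binary) else "missing (re-run ./scripts/install_kenlm.sh)"
print("build_binary:", status)
```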
You don't have to rebuild TensorFlow (bazel build ...) if you are inside the Nvidia TF container, since this container has already been built with libctc_decoder_with_kenlm.so.
I did install kenlm, but when I call ./scripts/download_lm.sh, at the end it says "./scripts/download_lm.sh: line 20: ../ctc_decoder_with_lm/generate_trie: No such file or directory". The output is below:
root@e42704494d43:/workspace/OpenSeq2Seq# ./scripts/download_lm.sh
--2019-03-19 00:32:23-- http://www.openslr.org/resources/11/4-gram.arpa.gz
Connecting to 105.128.219.200:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 1355172078 (1.3G) [application/x-gzip]
Saving to: ‘4-gram.arpa.gz’
4-gram.arpa.gz 100%[==============================================================================>] 1.26G 462KB/s in 6m 17s
2019-03-19 00:38:55 (3.42 MB/s) - ‘4-gram.arpa.gz’ saved [1355172078/1355172078]
gzip: 4-gram.arpa already exists; do you wish to overwrite (y or n)? y
Reading 4-gram-lower.arpa
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Identifying n-grams omitted by SRI ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Quantizing ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Writing trie ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
SUCCESS
./scripts/download_lm.sh: line 20: ../ctc_decoder_with_lm/generate_trie: No such file or directory
In my ctc_decoder_with_lm folder, I do not have "libctc_decoder_with_kenlm.so" either.
I am wondering, when using the Nvidia TF container, in the general installation (https://nvidia.github.io/OpenSeq2Seq/html/installation.html), which step is supposed to build "libctc_decoder_with_kenlm.so" and which step is supposed to build "generate_trie"?
I am sorry but this issue still exists in my system.
./scripts/install_kenlm.sh was successful, and kenlm/build/bin has build_binary. But in ./ctc_decoder_with_lm there is no generate_trie, only the generate_trie.cpp.
Can you tell me how to make "generate_trie" inside the docker?
Does it work inside nvdocker with nvidia Tensorflow container?
We have a new, faster script for beam search with an LM that does not require you to build ctc_decoders and does not require a prefix trie. As long as you have build_binary, you should be able to build an LM binary for use with the new script.
See the README here https://github.com/NVIDIA/OpenSeq2Seq/tree/master/external_lm_rescore and try steps 1-3.
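The KenLM step this relies on can be sketched as follows (paths assumed from the scripts in this thread: an ARPA file under language_model/ and kenlm built under the OpenSeq2Seq root; the guard just reports if kenlm is not built yet):

```shell
# Build a binary KenLM model from the downloaded ARPA file.
ARPA=language_model/4-gram.arpa
BIN=language_model/4-gram.binary
if [ -x kenlm/build/bin/build_binary ]; then
    # "trie" selects KenLM's trie data structure (smaller on disk than the
    # default probing structure).
    kenlm/build/bin/build_binary trie "$ARPA" "$BIN" && MSG="built $BIN"
else
    MSG="kenlm not built yet; run ./scripts/install_kenlm.sh first"
fi
echo "$MSG"
```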
@vsl9, Do you have any additional comments on this?
@borisgin It did not work inside nvdocker. Actually, I was installing kenlm inside the Nvidia TensorFlow container.
@blisc Thanks for your information. I will check it out.
@borisgin I have build_binary inside kenlm/build/bin, and I then downloaded the language model, after which my language_model/ folder only contained 4-gram.arpa.gz, which is strange to me. Neither install_kenlm.sh nor download_lm.sh threw any errors for me. Also, in the model config, the decoder params section contains:
{
  "decoder_library_path": "OpenSeq2Seq/ctc_decoder_with_lm/libctc_decoder_with_kenlm.so",
  "lm_path": "language_model/4-gram.binary",
  "trie_path": "language_model/trie.binary"
}
But I can't find any of these included files inside my repo. Your thoughts would help me.
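A quick sanity check for the three paths quoted above (filenames taken from the config in this thread; run it from the directory the config paths are relative to):

```python
import os

# decoder_params paths as quoted in the config above.
decoder_params = {
    "decoder_library_path": "OpenSeq2Seq/ctc_decoder_with_lm/libctc_decoder_with_kenlm.so",
    "lm_path": "language_model/4-gram.binary",
    "trie_path": "language_model/trie.binary",
}
# List every referenced file that does not exist on disk.
missing = [key for key, path in decoder_params.items() if not os.path.isfile(path)]
print("missing files:", missing or "none")
```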
@blisc I have a trained w2lp model. Is it possible to use this re-scoring directly on it by just downloading the language model alone? Can I use it in interactive_infer mode as well?
All you need should be just the 4-gram.binary file for the new decoder. You should be able to call any Python script in interactive_infer, so I don't see why not. You would have to modify decode.py.
> All you need should be just the 4-gram.binary file for the new decoder. You should be able to call any python script in interactive_infer so I don't see why not. You would have to modify decode.py.

@blisc Hello, have you ever successfully run scripts/decode.py? When I run ./scripts/download_lm.sh, I get the same error as @inchpunch at the last line, because in ./ctc_decoder_with_lm there is no generate_trie, only generate_trie.cpp. I ignored that. As you said, I did steps 1-3 in https://github.com/NVIDIA/OpenSeq2Seq/tree/master/external_lm_rescore, and at step 3 I got the error: ImportError: No module named 'ctc_decoders'.
It seems that if I modify decode.py it will work. Can you tell me how I should modify it, or whether I missed any steps? Thank you.
@xw1324832579 You need to run the scripts/install_decoders.sh script before running scripts/decode.py.
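A quick way to confirm that install worked, independent of decode.py (module name taken from the ImportError above):

```python
import importlib.util

# Probe for the ctc_decoders module that scripts/install_decoders.sh builds,
# without actually importing (and running) it.
spec = importlib.util.find_spec("ctc_decoders")
status = "importable" if spec is not None else "missing; re-run scripts/install_decoders.sh"
print("ctc_decoders:", status)
```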
Yes, I did the steps as you say, and it seems that in scripts/ there is no ctc_decoders.py, but in decoders/ there is. Should I change scripts/decode.py?
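If the module really does live under decoders/ rather than on the default path, one low-touch option is to extend sys.path at the top of the calling script instead of editing its imports. A sketch, assuming that directory layout (the decoders/ location is taken from the comment above, not from OpenSeq2Seq documentation):

```python
import os
import sys

# Locate the repo root relative to the running script, then put its
# decoders/ directory first on the module search path.
script_dir = os.path.dirname(os.path.abspath(sys.argv[0] or "."))
decoders_dir = os.path.join(os.path.dirname(script_dir), "decoders")
sys.path.insert(0, decoders_dir)

try:
    import ctc_decoders  # resolves only after scripts/install_decoders.sh built it
    have_decoders = True
except ImportError:
    have_decoders = False
print("ctc_decoders importable:", have_decoders)
```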
I followed the steps in https://nvidia.github.io/OpenSeq2Seq/html/installation.html#installation-speech and proceeded to the middle of Step #3, but the command after ./configure gave these errors. Can anyone point out what to fix?
root@3911f9353aef:/workspace/tensorflow# bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --copt=-O3 --config=cuda //tensorflow/tools/pip_package:build_pip_package //tensorflow:libtensorflow_cc.so //tensorflow:libtensorflow_framework.so //ctc_decoder_with_lm:libctc_decoder_with_kenlm.so //ctc_decoder_with_lm:generate_trie

WARNING: detected http_proxy set in env, setting no_proxy for localhost.
WARNING: The following configs were expanded more than once: [cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: d3284369-02b5-4cd8-a2c2-a748ada9bb8b
ERROR: /workspace/tensorflow/third_party/python_runtime/BUILD:5:1: no such package '@local_config_python//': Traceback (most recent call last):
  File "/workspace/tensorflow/third_party/py/python_configure.bzl", line 308
    _create_local_python_repository(repository_ctx)
  File "/workspace/tensorflow/third_party/py/python_configure.bzl", line 272, in _create_local_python_repository
    _get_numpy_include(repository_ctx, python_bin)
  File "/workspace/tensorflow/third_party/py/python_configure.bzl", line 256, in _get_numpy_include
    _execute(repository_ctx, [python_bin, "-c",..."], <2 more arguments>)
  File "/workspace/tensorflow/third_party/py/python_configure.bzl", line 55, in _execute
    _fail("\n".join([error_msg.strip() if ... ""]))
  File "/workspace/tensorflow/third_party/py/python_configure.bzl", line 28, in _fail
    fail(("%sPython Configuration Error:%...)))
Python Configuration Error: Problem getting numpy include path.
Traceback (most recent call last):
  File "", line 1, in
ImportError: No module named numpy
Is numpy installed?
and referenced by '//third_party/python_runtime:headers'
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: Analysis failed
INFO: Elapsed time: 5.625s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (208 packages loaded, 3370 targets configured)
currently loading: tensorflow/core ... (23 packages)
Fetching @org_sqlite; fetching
Fetching @six_archive; fetching
Fetching @gast_archive; fetching
Fetching @absl_py; fetching
Fetching @swig; fetching
Fetching @org_python_pypi_backports_weakref; fetching
Fetching @keras_applications_archive; fetching
Fetching @local_config_python; fetching ... (10 fetches)
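The ERROR above is bazel's python_configure rule probing the numpy include directory of the interpreter it was pointed at during ./configure. The probe can be reproduced directly; if it fails, install numpy for that same interpreter (pip install numpy):

```python
# Reproduce bazel's numpy probe for the current interpreter.
try:
    import numpy
    msg = numpy.get_include()  # the include path bazel failed to obtain
except ImportError:
    msg = "numpy is not installed for this interpreter"
print(msg)
```

Note that the interpreter chosen during ./configure must match the one numpy is installed for; having numpy under a different Python is the usual cause of this error.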