NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0
1.54k stars 369 forks

scripts/get_en_de.sh requires sentencepiece, which fails to build because stubs/strutil.h is not installed by cmake #318

Closed David-Levinthal closed 5 years ago

David-Levinthal commented 5 years ago

This is a bit weird. Invoking scripts/get_en_de.sh fails with:

    Input sentences: 4562102
    Output sentences: 4524868
    Shuffling
    TOKENIZATION
    Traceback (most recent call last):
      File "tokenizer_wrapper.py", line 9, in <module>
        import sentencepiece as spm
    ImportError: No module named sentencepiece

This happens in spite of having installed sentencepiece with the requirements.txt file.

Installing sentencepiece from source (git clone https://github.com/google/sentencepiece.git, etc.) requires installing protobuf and gperftools (git clone https://github.com/gperftools/gperftools.git), which builds fine. But sentencepiece won't build due to:

    /usr/local/include/google/protobuf/message_lite.h:49:43: fatal error: google/protobuf/stubs/strutil.h: No such file or directory

This is the same error as in https://github.com/facebookresearch/Detectron/issues/758 and other related reports. I am not seeing the way out of the knot :-)

vsl9 commented 5 years ago

If you have already installed sentencepiece with pip, can you please check your Python version (and path)? Maybe it was installed for another Python interpreter (another version or another location).
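A minimal sketch of that check, in case it helps: run it with the same interpreter that get_en_de.sh ends up using. The helper name `describe_interpreter` is made up for illustration and is not part of OpenSeq2Seq; it only reports which interpreter is running and whether that interpreter can import the module.

```python
import sys


def describe_interpreter(module_name="sentencepiece"):
    """Report the running interpreter and whether it can import module_name.

    Hypothetical helper for illustration; not part of OpenSeq2Seq.
    """
    try:
        module = __import__(module_name)
        found = True
        location = getattr(module, "__file__", None)
    except ImportError:
        found = False
        location = None
    return {
        "executable": sys.executable,  # path of the interpreter running this code
        "version": "%d.%d.%d" % tuple(sys.version_info[:3]),
        "module": module_name,
        "found": found,
        "location": location,  # where the module was loaded from, if found
    }


if __name__ == "__main__":
    info = describe_interpreter()
    for key in ("executable", "version", "module", "found", "location"):
        print("%s: %s" % (key, info[key]))
```

If `found` is False here but pip reports the package as installed, pip most likely installed it for a different interpreter than the one printed under `executable`.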

David-Levinthal commented 5 years ago

I found a "solution". Since I had already downloaded protobuf, I manually installed the header file:

    ~/protobuf/src/google/protobuf/stubs$ sudo cp strutil.h /usr/local/include/google/protobuf/stubs/

But the real issue is that I manually installed python3.6 into /usr on Ubuntu 16.04 and have not made that Python version the default. (python2.7 is the default, and it does not have sentencepiece.) This can be closed.
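For anyone who hits the same mismatch, here is a rough sketch (Python 3 only) that asks each interpreter on PATH whether it can import sentencepiece. The function name `can_import` is hypothetical, and the interpreter names in the loop are simply the ones mentioned in this thread; adjust them for your system.

```python
import shutil
import subprocess


def can_import(interpreter, module_name="sentencepiece"):
    """Return True or False depending on whether `interpreter` can import
    `module_name`, or None if the interpreter is not found.

    Hypothetical helper for illustration; not part of OpenSeq2Seq.
    """
    path = shutil.which(interpreter)
    if path is None:
        return None
    # Run "import <module>" in a child process and look only at the exit code.
    result = subprocess.run(
        [path, "-c", "import " + module_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Interpreter names taken from this thread; they are assumptions elsewhere.
    for name in ("python", "python2.7", "python3.6"):
        print(name, can_import(name))
```

If the interpreter you actually want to use reports False, installing through that interpreter (for example `python3.6 -m pip install sentencepiece`) ties the package to it rather than to whichever `pip` happens to be first on PATH.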