NervanaSystems / deepspeech

DeepSpeech neon implementation
Apache License 2.0

Evaluating Deep Speech 2 on Mac OSX #56

Open karllab41 opened 6 years ago

karllab41 commented 6 years ago

Hello! Thanks for posting this. I'm excited to run speech recognition on files! I've been trying to use Deep Speech 2 for evaluating my denoising pipeline. However, I'm having some trouble with the installation, and most of it is from aeon and the data loading.

When I run:

python evaluate.py --manifest val:$TOPDIR/librispeech/test-clean/test-manifest.csv --model_file $TOPDIR/model/librispeech_16_epochs.prm

I get:

Traceback (most recent call last):
  File "evaluate.py", line 21, in <module>
    from aeon.dataloader import DataLoader
ModuleNotFoundError: No module named 'aeon.dataloader'

Here's how I did my installation. I installed Neon from scratch via the original github page, which I assumed installed aeon. It did, but dataloader apparently was not in that installation. So, I went to the aeon page. The instructions told me to install aeon via:

git clone https://github.com/NervanaSystems/private-aeon.git aeon

That seemed incorrect (since private-aeon.git seems to no longer be private). So, I just installed

git clone https://github.com/NervanaSystems/aeon.git aeon

I ran into some C++ problems, so I followed Aeon Issue 48, which installed it. However, even after I installed aeon, I still couldn't import aeon.datasetloader.

import aeon.datasetloader
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-2f90034cee08> in <module>()
----> 1 import aeon.datasetloader

ImportError: No module named datasetloader
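
For anyone hitting the same import errors, it can help to check which of the two module names mentioned in this thread (aeon.dataloader and aeon.datasetloader) the installed aeon actually provides. This is only a sketch, assuming Python 3.4 or newer for importlib.util.find_spec; the helper name module_available is made up for illustration.

```python
import importlib.util


def module_available(name):
    """Return True if the module `name` can be located.

    find_spec imports the parent package (here: aeon) but not the
    submodule itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when the parent package is missing or has no usable spec.
        return False


# Module names taken from the two tracebacks in this thread.
for name in ("aeon.dataloader", "aeon.datasetloader"):
    print(name, "available" if module_available(name) else "missing")
```

If both names are reported missing, aeon itself is probably not installed in the active environment.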

karllab41 commented 6 years ago

Update: I reinstalled aeon from the v0.2.7 release (https://github.com/NervanaSystems/aeon/releases/tag/v0.2.7), and it has datasetloader. aeon still appears to be the problem, because when I run:

python evalrun.py --manifest val:$TOPDIR/librispeech/test-clean/test-manifest.csv --model_file $TOPDIR/model/librispeech_16_epochs.prm

(The paths are correct; I checked.) The error message looks like:

DISPLAY:neon:mklEngine.so not found; falling back to cpu backend
DISPLAY:neon:mklEngine.so not found; falling back to cpu backend
2017-09-22 19:42:54,373 - neon.backends.nervanacpu - WARNING - Problems inferring BLAS info, CPU performance may be suboptimal
2017-09-22 19:42:54,374 - neon.backends - WARNING - deterministic_update and deterministic args are deprecated in favor of specifying random seed
2017-09-22 19:42:54,379 - neon.backends.nervanacpu - WARNING - Problems inferring BLAS info, CPU performance may be suboptimal
Loading model file: /Users/l41admin/Magnolia/deepspeech/model/librispeech_16_epochs.prm
formats: formats: formats: can't open input file `': No such file or directoryformats: can't open input file `': No such file or directorycan't open input file `': No such file or directory
Unable to readdecode_thread_pool exception: number of frames is negative
can't open input file `': No such file or directory

Unable to readUnable to readUnable to readdecode_thread_pool exception: number of frames is negative
decode_thread_pool exception: number of frames is negative
decode_thread_pool exception: number of frames is negative

The data that I'm using comes from running:

python data/ingest_librispeech.py $TOPDIR/librispeech/test-clean $TOPDIR/librispeech/test-clean/transcripts_dir $TOPDIR/librispeech/test-clean/test-manifest.csv

I don't know what I could be doing wrong?
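
The "can't open input file `'" messages above suggest that an empty path is reaching the audio decoder, so one thing worth ruling out is an empty or dangling entry in the manifest. The following is a minimal sketch, not part of deepspeech or aeon. It assumes that every non-empty manifest line is a delimiter-separated list of file paths (comma by default); the real manifest layout depends on the aeon version, so treat the format and the function name find_bad_manifest_entries as assumptions.

```python
import csv
import os


def find_bad_manifest_entries(manifest_path, delimiter=","):
    """Return (line_number, field) pairs for empty or non-existent paths.

    Assumption: each non-empty line holds only file paths separated by
    `delimiter`. A header line, if present, would be reported as missing.
    """
    bad = []
    with open(manifest_path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for row in reader:
            if not row:
                continue  # skip blank lines
            for field in row:
                path = field.strip()
                if not path or not os.path.isfile(path):
                    bad.append((reader.line_num, path))
    return bad
```

Calling find_bad_manifest_entries on the test-manifest.csv produced by the ingest script would list any rows whose audio or transcript path is empty or does not exist on disk.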

wei-v-wang commented 6 years ago

@karllab41 You are doing everything right. PR #57 addresses the initial problem, which stems from aeon being bumped to 1.0.

karllab41 commented 6 years ago

Thanks, @wei-v-wang.

On the Mac, I haven't gotten evaluation on Librispeech to work (with either aeon-0.2.7 or 1.0). Linux has been okay with aeon-0.2.7 (release), neon 2.1, and python 2.7 (though I'd like to try 3...).

Not sure if that helps, but just FYI

SkyKingCoversGroundTiger commented 6 years ago

Report: not working on Mac for either training or evaluation (with aeon 1.0). Will test on Linux later :(

SkyKingCoversGroundTiger commented 6 years ago

Report update: tested on Linux (Ubuntu 16.04) with aeon-0.2.7 (release), neon 2.2.0, python 2.7. So far so good.

wei-v-wang commented 6 years ago

Thanks @yangroupaomo. neon 2.2.0 should automatically install aeon 1.0. If you activate neon's virtual environment (. .venv/bin/activate) and run "pip list | grep aeon", it should show aeon-1.0, right? Please feel free to let us know of any issues.
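
The same version check can be done from inside Python. This is a small sketch, assuming Python 3.8 or newer (for importlib.metadata) and that it runs in the same virtual environment as neon; the distribution name nervana-aeon is the one pip reports in this thread, and installed_version is a hypothetical helper name.

```python
from importlib import metadata  # available from Python 3.8


def installed_version(distribution):
    """Return the installed version string of `distribution`, or None."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# "nervana-aeon" is the distribution name shown by pip in this thread.
print("nervana-aeon:", installed_version("nervana-aeon"))
```

A result of None means pip does not know about the package in the active environment.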

@karllab41 We released neon 2.2.0 which featured our first improvement of DS2 on IA (more improvement to come in future releases). Feel free to try the latest neon as well (git checkout latest) from neon directory. :)

SkyKingCoversGroundTiger commented 6 years ago

Yes, it is nervana-aeon (1.0.0).

Will keep working on the evaluation part tonight and hopefully it will work.

BTW, is there any workaround for the Mac Issue?

saikishor commented 6 years ago

I am getting many errors while installing aeon and then checking for nervana-aeon. They appear right after the step "Running setup.py install for nervana-aeon ...":

...... In file included from /home/saikishor/Deepspeech/neon/aeon/src/block_loader_source.hpp:22:
In file included from /home/saikishor/Deepspeech/neon/aeon/src/buffer_batch.hpp:23:
In file included from /usr/include/opencv2/core/core.hpp:58:
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/cstddef:51:11: error: no member named 'max_align_t' in the global namespace
using ::max_align_t;
    ~~^
In file included from /home/saikishor/Deepspeech/neon/aeon/src/box.cpp:16:
In file included from /home/saikishor/Deepspeech/neon/aeon/src/box.hpp:19:
In file included from /usr/include/opencv2/core/core.hpp:58:
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/cstddef:51:11: error: no member named 'max_align_t' in the global namespace
using ::max_align_t;
    ~~^
In file included from /home/saikishor/Deepspeech/neon/aeon/src/block_manager.cpp:18:
In file included from /home/saikishor/Deepspeech/neon/aeon/src/block_manager.hpp:21:
In file included from /home/saikishor/Deepspeech/neon/aeon/src/buffer_batch.hpp:23:
In file included from /usr/include/opencv2/core/core.hpp:58:
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/cstddef:51:11: error: no member named 'max_align_t' in the global namespace
using ::max_align_t;
    ~~^
1 error generated.
clang++ -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -fstack-protector -I/usr/include/opencv -I/usr/include/python2.7 -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c /home/saikishor/Deepspeech/neon/aeon/src/boundingbox.cpp -o build/temp.linux-x86_64-2.7/home/saikishor/Deepspeech/neon/aeon/src/boundingbox.o -O3 -std=c++11 -Werror=return-type -Werror=inconsistent-missing-override -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-padded -Wno-weak-vtables -Wno-global-constructors -Wno-switch-enum -Wno-gnu-zero-variadic-macro-arguments -Wno-undef -Wno-exit-time-destructors -Wno-missing-prototypes -Wno-disabled-macro-expansion -Wno-pedantic -Wno-documentation -Wno-covered-switch-default -Wno-old-style-cast -Wno-unknown-warning-option -Wno-sign-compare -Wno-unused-parameter -Wno-conversion -Wno-float-equal -Wno-duplicate-enum -Wno-used-but-marked-unused -Wno-c++11-compat-deprecated-writable-strings -Wno-deprecated -Wno-double-promotion -DPYTHON_FOUND
error: command 'clang++' failed with exit status 1
1 error generated.
1 error generated.
1 error generated.
1 error generated.
In file included from /home/saikishor/Deepspeech/neon/aeon/src/boundingbox.cpp:17:
In file included from /home/saikishor/Deepspeech/neon/aeon/src/etl_boundingbox.hpp:21:
In file included from /home/saikishor/Deepspeech/neon/aeon/src/interface.hpp:27:
In file included from /home/saikishor/Deepspeech/neon/aeon/src/typemap.hpp:19:
In file included from /usr/include/opencv2/core/core.hpp:58:
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/cstddef:51:11: error: no member named 'max_align_t' in the global namespace
using ::max_align_t;
    ~~^
1 error generated.
1 error generated.
1 error generated.
1 error generated.


Cleaning up...
Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-ybSgIX-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-lUO1rj-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip-ybSgIX-build
Storing debug log for failure in /home/saikishor/.pip/pip.log

wei-v-wang commented 6 years ago

It is easy to miss, but "http://neon.nervanasys.com/index.html/installation.html" has a suggestion for aeon-related issues: "If you have encountered error messages about failing to install aeon while building neon, please visit aeon page for how to install prerequisites for aeon to enable neon with aeon data loader."

To be clearer: the errors above seem to be related to clang. Could you try following the aeon README to install all prerequisites? aeon is part of neon, but neon does not list (or automatically install) all of aeon's prerequisites.

https://github.com/NervanaSystems/aeon/blob/master/README.md

saikishor commented 6 years ago

Yes, the problem is definitely with clang, but it's very hard to resolve. I am using Ubuntu, so the prerequisite installation is only:

apt-get install git clang cmake python-dev python-pip libcurl4-openssl-dev libopencv-dev libsox-dev

followed by the normal cmake installation of aeon.

Do you have any idea how to solve it? clang is using g++ 4.9.

saikishor commented 6 years ago

The clang version is 3.4, and it uses gcc and g++ 4.9 by default.

Is there any possible way to resolve it?

wei-v-wang commented 6 years ago

Is there any possibility of upgrading clang from 3.4 to 3.5? A Google search for "max_align_t" turned up suggestions along these lines.

saikishor commented 6 years ago

I removed clang 3.4, installed clang 3.6, and ran the commands again, but ended up with the same errors. I also searched Google for max_align_t. There are many proposed solutions involving clang, but they only show how to apply them when compiling individual files and do not explain how to do it for a CMake build.

wei-v-wang commented 6 years ago

FYI: https://github.com/NervanaSystems/neon/issues/375 has a suggestion that involves using libc++.

Can you try the fix from issue 375? Note that the fix there was for a Mac system.

saikishor commented 6 years ago

Thanks, I tried, but I couldn't find env.sh in the aeon package folder, so I didn't find a way to proceed further!

saikishor commented 6 years ago

I am using an Ubuntu 14.04 system.

wei-v-wang commented 6 years ago

Oh, sorry the 375 issue must have been for old aeon. Let me take a closer look. (I am not asking you to upgrade to Ubuntu 16.04). I will let you know what version of gcc and clang I am using on Ubuntu 16.04 and see if we can arrive at a solution.

saikishor commented 6 years ago

Thank you for your effort, @wei-v-wang.

wei-v-wang commented 6 years ago

First, sorry to others if we are discussing neon installation issues on deep speech :)

@saikishor I am using a system similar to yours (Ubuntu 16.04), with clang 3.8 and gcc 5.4. The following shows what a working aeon installation looks like (aeon-related output extracted from the log after typing 'make' under neon):

HEAD is now at 7e1af03... Merge for v1.0.0 release. The number changed from v1.0.1 to v1.0.0 because v1.0.0 has never been released.
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is Clang 3.8.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/clang++
-- Check for working CXX compiler: /usr/bin/clang++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Checking for module 'sox'
-- Found sox, version 14.4.1
-- Found CURL: /usr/lib/x86_64-linux-gnu/libcurl.so (found version "7.47.0")
-- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.5m.so (found version "3.5.2")
-- Found PythonInterp: /usr/bin/python3.5 (found version "3.5.2")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Could NOT find LATEX (missing: LATEX_COMPILER)
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
-- Found Sphinx: /home/weiwang/git/neon/.venv2/bin/sphinx-build
-- Failed to locate breathe executable (missing: BREATHE_EXECUTABLE)
Doxygen not found, skipping documentation
Breathe not found, skipping documentation
Without COVERAGE flag coverage raport is unavailable
-- Configuring done
-- Generating done
-- Build files have been written to: /home/weiwang/git/neon/aeon/build
Processing /home/weiwang/git/neon/aeon/build
Installing collected packages: nervana-aeon
Running setup.py install for nervana-aeon ... done
Successfully installed nervana-aeon-1.0.0

saikishor commented 6 years ago

Sorry to others from my side as well. My installation fails exactly after the step "Installing collected packages: nervana-aeon / Running setup.py install for nervana-aeon ...", and I don't know why. The only difference between our setups is the default gcc version used by clang. It is not clear from what I can find online how to set the default gcc for clang, so I feel totally cornered at this point.

wei-v-wang commented 6 years ago

OK, sorry for the frustrating experience. Here is hopefully a better suggestion: I just tested on an Ubuntu 14.04 system (VERSION="14.04.5 LTS, Trusty Tahr"), and the following combination also works:

-- The C compiler identification is GNU 4.8.4
-- The CXX compiler identification is Clang 3.4.0

So can you completely remove gcc-4.9 and install gcc-4.8? Afterwards, can you try installing clang 3.4?

saikishor commented 6 years ago

Sure @wei-v-wang. I will try and keep you updated. I have one more question: I have many versions of gcc installed. Should I uninstall all of them? The problem is that when I tried to uninstall gcc-4.9, I was warned that cuda and some GPU packages would also be uninstalled because they depend on it.

Do you think that replacing all installed gcc versions with gcc 4.8 will cause any issues for my cuda installation? I will surely give it a try.

wei-v-wang commented 6 years ago

I am not sure whether changes to gcc will affect cuda. Can you try keeping all GCC versions and make gcc4.8 the default in the PATH and LD_LIBRARY_PATH?
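
One way to see which GCC installation clang actually picks up for its headers is the "Selected GCC installation" line that clang++ -v prints. The snippet below is a rough sketch, assuming Python 3.5 or newer and that clang++ is on the PATH; the function names are illustrative only and not part of neon or aeon.

```python
import re
import shutil
import subprocess


def selected_gcc_installation(clang_verbose_output):
    """Extract the path from the 'Selected GCC installation:' line, or None."""
    match = re.search(r"^Selected GCC installation:\s*(\S+)",
                      clang_verbose_output, re.MULTILINE)
    return match.group(1) if match else None


def query_clang(clang="clang++"):
    """Run `clang++ -v` and return the GCC installation it selected."""
    path = shutil.which(clang)
    if path is None:
        return None  # clang++ is not installed or not on the PATH
    # clang writes its -v report to stderr, so merge it into stdout.
    result = subprocess.run([path, "-v"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return selected_gcc_installation(result.stdout)


if __name__ == "__main__":
    print("clang++ uses GCC installation:", query_clang())
```

If the reported path still ends in 4.9 after changing the default gcc, clang is still compiling against the 4.9 headers that appear in the error messages earlier in this thread.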

saikishor commented 6 years ago

Yes sure. I will try to do that and keep you posted.

saikishor commented 6 years ago

I gave up on neon. I tried many suggestions from Stack Overflow and other places but couldn't resolve my issue. Thanks for all your help, @wei-v-wang.

wei-v-wang commented 6 years ago

Hi @saikishor I understand your frustration regarding gcc versions and clang version. Did "GNU 4.8.4 and Clang 3.4.0" not help? Have you tried Docker? Are you willing to hear about setting up neon in docker so it will be an isolated environment?

Also, neon is evolving, we encourage you to try it later on while we improve the installation experience. However, it is likely there would be difficult corner cases to handle, e.g. the gcc/clang related issue tied to the operating system.

saikishor commented 6 years ago

@wei-v-wang Yes, docker is an option! But where can I find information about setting up neon in docker?

wei-v-wang commented 6 years ago

Hi @saikishor I will contact my team for the instructions on the docker option. Please stay tuned.

saikishor commented 6 years ago

Wow! @wei-v-wang Thanks for your generous help. I would like to mention that docker is one of the best options for anyone, as it enables the user to train and test on multiple GPUs. Evaluation in particular puts a huge load on the CPU and takes a lot of time to process, which makes it hard to use in real-time applications.

Thank you once again for taking the initiative on the docker option.

wei-v-wang commented 6 years ago

@saikishor You are welcome. Have you considered trying training on Intel Xeon Scalable Processor Family (Skylake)? Please keep this in mind as an option/alternative to multi GPU -- you may be surprised to find what multi-Skylake could give you in terms of training processing :)

saikishor commented 6 years ago

@wei-v-wang I will surely consider the Intel Xeon Scalable Processor Family (Skylake) for training, but many researchers are more interested in its evaluation performance on CPUs, as this is what will bring neon into real-time applications.

wei-v-wang commented 6 years ago

@saikishor Good point! I should have asked you to consider Skylake for both training and inference, :)

wsokolow commented 6 years ago

Hi @saikishor, below are instructions for setting up and running Neon + Deepspeech inside a Docker container:

Build the docker image using the dockerfile below:

Ubuntu-14.04.txt

docker build --rm -f=Ubuntu-14.04.txt -t=neon:test .

Run your docker container:

docker run -it --name neon_test neon:test /bin/bash


To run Deepspeech training on Neon, follow the steps below while inside the docker container:

1. Install Neon 2.2

git clone https://github.com/NervanaSystems/neon.git
cd neon
make -j
. ./.venv2/bin/activate

2. Install Deepspeech

git clone https://github.com/NervanaSystems/deepspeech.git
cd deepspeech
pip install -r requirements.txt
make -j

3. Prepare Deepspeech datasets and run the training

Please follow the instructions in https://github.com/NervanaSystems/deepspeech/blob/master/README.md, section "Training a model". You will need to download the train and val datasets, ingest them to generate the .csv manifest files, and run the training.

IMPORTANT: You might also need to install the scipy package (pip install scipy).

Example training commandline using GPU backend:

python train.py --manifest train:/root/output/train-clean-100/train-manifest.csv \
    --manifest val:/root/output/dev-clean/val-manifest.csv -e 2 -z 8 -b gpu -s model_output.pkl

This will run 2 epochs on GPU backend, use batch size 8 and save model to "model_output.pkl" file.
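
For scripted runs, the same command line can be assembled programmatically. This is just a sketch built from the flags described above (-e for epochs, -z for batch size, -b for backend, -s for the output model file); build_train_command is a hypothetical helper, not part of deepspeech, and the paths are the ones from the example.

```python
import sys


def build_train_command(train_manifest, val_manifest, epochs,
                        batch_size, backend, save_path):
    """Return the train.py invocation as an argument list."""
    return [
        sys.executable, "train.py",
        "--manifest", "train:" + train_manifest,
        "--manifest", "val:" + val_manifest,
        "-e", str(epochs),       # number of epochs
        "-z", str(batch_size),   # batch size
        "-b", backend,           # backend, e.g. "gpu"
        "-s", save_path,         # file the trained model is saved to
    ]


if __name__ == "__main__":
    command = build_train_command(
        "/root/output/train-clean-100/train-manifest.csv",
        "/root/output/dev-clean/val-manifest.csv",
        epochs=2, batch_size=8, backend="gpu",
        save_path="model_output.pkl",
    )
    # Print only; pass `command` to subprocess.run from inside the
    # deepspeech checkout to actually start training.
    print(" ".join(command))
```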

Let me know if you encounter any issues, I will help.

saikishor commented 6 years ago

Thank you @wsokolow, that was a fast response. I need some time to try this; after that I will let you know.

pzelazko-intel commented 6 years ago

@saikishor I recently investigated exactly the same problem as yours. What I found is that the root cause is a clang bug that was fixed in version 3.4-2:
https://reviews.llvm.org/rL201729
https://stackoverflow.com/questions/23462950/clang-only-compiles-c11-program-using-boostformat-when-std-c11-option-i

This problem does not reproduce with gcc 4.8, but it does with gcc 4.9. AFAIK gcc 4.8 is the default version for Ubuntu 14.04, so I suppose you must have upgraded it.

I see you wrote that installing clang 3.5 didn't help, which is strange. Maybe try upgrading clang to 3.8 or downgrading gcc to 4.8.