mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0

ArchLinux PKGBUILDs for native client and python bindings #979

Open stes opened 6 years ago

stes commented 6 years ago

I created (unofficial) PKGBUILD files for Arch Linux, which can be downloaded here:

https://github.com/stes/arch-deepspeech

If there is interest in including such files in the main repository or publishing them to the Arch User Repository, I am happy to submit a pull request.

lissyx commented 6 years ago

Thanks for that @stes !

I had a quick look, and I think that if you want to include this in the AUR, it might require much more work. I don't think it's a good idea to package binaries built somewhere else, so I think the best course of action would be to rebuild for Arch on your side, including TensorFlow.

Which main repo are you referring to, Mozilla's or Arch Linux's?

lissyx commented 6 years ago

@stes I see there's an Arch Linux image on [Docker Hub](https://hub.docker.com/r/base/archlinux/), so you can use that as a base image on TaskCluster. So maybe you could do a PR that includes your work to build and produce an ArchLinux package?

stes commented 6 years ago

Hello @lissyx, thanks for responding so fast!

> I had a quick look, and I think that if you want to include this in the AUR, it might require much more work. I don't think it's a good idea to package binaries built somewhere else, so I think the best course of action would be to rebuild for Arch on your side, including TensorFlow.

Actually, the packages I provide in the repository are built by me for Arch Linux (using your "official" build instructions) to make sure they compile against the most recent versions of the libraries etc.

> So maybe you could do a PR that includes your work to build and produce an ArchLinux package?

I looked at the Docker Hub images; that seems to be the right way of building the binaries in the future. I suggest waiting until the pre-trained models are up and I am happy with the packaging process on my own machine.

lissyx commented 6 years ago

@stes Perfect. I don't know ArchLinux very well, and reading the Makefile or PKGBUILD I could not find where you perform the actual build :). If you do a PR against our TaskCluster configuration to produce packages, you'll need to flag me or @reuben as a reviewer first; PRs from non-collaborators cannot trigger the TaskCluster process (for security reasons), so we need to take a first look and trigger it for you (for now). We can help you with that part, though.

lissyx commented 6 years ago

@stes Would you like to make a PR that at least links to where people can find your packages? We could add that in the README, like we did for Rust and Go bindings.

stes commented 6 years ago

@lissyx Yes, of course, added #1109. I put it just under the Rust and Go bindings; although it is technically not about bindings, I wanted to prevent clutter in the README. I can update the PKGBUILD in the next few days to use the most recent version of the DeepSpeech model and also include a download procedure for pre-trained models. I wanted to let the release settle a bit before packaging anything.

Once that works, I can look into the TaskCluster build (for that I will probably approach you again).

lissyx commented 6 years ago

Thanks! We should be doing a dot release soon, I hope :)

NicoHood commented 6 years ago

I am currently trying to fix the deepspeech PKGBUILD on the AUR for the latest version. The problem is that the README tagged with 0.1.1 is outdated, and the one on master is too new for version 0.1.1. So I had to guess some build options, but the build still fails. Can anyone please help me build 0.1.1 properly? The previous version compiled fine (but with some security problems in the binary itself).

# Maintainer: Jonas Heinrich <onny@project-insanity.org>
# Contributor: NicoHood <archlinux {cat} nicohood {dog} de>

pkgname=deepspeech
pkgver=0.1.1
pkgrel=1
pkgdesc="A TensorFlow implementation of Baidu's DeepSpeech architecture"
arch=('x86_64')
url="https://github.com/mozilla/DeepSpeech"
license=('MPL2')
makedepends=('bazel' 'python-numpy' 'python-pip' 'python-wheel' 'python-setuptools' 'git')
depends=('python-tensorflow' 'python-scipy' 'sox' 'gcc-libs')
source=("deepspeech-${pkgver}.tar.gz::https://github.com/mozilla/DeepSpeech/archive/v${pkgver}.tar.gz"
        "git+https://github.com/mozilla/tensorflow.git") #TODO use fixed git commit/version
sha512sums=('63a5b73fe5b294b97b029e963a3c76f73e6c0d39895135c8ddc6eac502dcae0fe32e6babed55c3308add72e6d195f7a994d40eb4c149a54d9dcc3a017a6c28c8'
            'SKIP')
# TODO gpg signatures

# TODO add models as extra/split package
# TODO add python bindings

prepare() {
  cd "$srcdir/tensorflow"
  # These environment variables influence the behavior of the configure call below.
  export PYTHON_BIN_PATH=/usr/bin/python
  export USE_DEFAULT_PYTHON_LIB_PATH=1
  export TF_NEED_JEMALLOC=1
  export TF_NEED_GCP=0
  export TF_NEED_HDFS=0
  export TF_NEED_S3=0
  export TF_ENABLE_XLA=1
  export TF_NEED_GDR=0
  export TF_NEED_VERBS=0
  export TF_NEED_OPENCL=0
  export TF_NEED_MPI=0
  ln -sf ../DeepSpeech-${pkgver}/native_client ./
}

build() {
  cd "$srcdir/tensorflow"
  export CC_OPT_FLAGS="-march=x86-64"
  export TF_NEED_CUDA=0
  ./configure
  bazel build -c opt --copt=-O3 //native_client:libctc_decoder_with_kenlm.so
  bazel build --config=monolithic -c opt --copt=-O3 --copt=-fvisibility=hidden \
    //tensorflow:libtensorflow_cc.so \
    //tensorflow:libtensorflow_framework.so \
    //native_client:deepspeech \
    //native_client:deepspeech_utils \
    //native_client:generate_trie

  # bazel build -c opt --copt=-O3 //tensorflow:libtensorflow_cc.so \
  #                               //tensorflow:libtensorflow_framework.so \
  #                               //native_client:deepspeech \
  #                               //native_client:deepspeech_utils \
  #                               //native_client:ctc_decoder_with_kenlm \
  #                               //native_client:generate_trie

  cd "${srcdir}/DeepSpeech-${pkgver}/native_client"
  make deepspeech
}

package() {
  cd "${srcdir}/DeepSpeech-${pkgver}/native_client"
  PREFIX="${pkgdir}/usr" make install
}
...
INFO: From Compiling native_client/generate_trie.cpp:
In file included from native_client/generate_trie.cpp:7:0:
native_client/trie_node.h:29:28: warning: multi-character character constant [-Wmultichar]
   static const int MAGIC = 'TRIE';
                            ^~~~~~
INFO: Elapsed time: 3278.809s, Critical Path: 158.85s
INFO: Build completed successfully, 3590 total actions
c++ -o deepspeech   `pkg-config --cflags sox` client.cc  -Wl,--no-as-needed -Wl,-rpath,\$ORIGIN -L/build/deepspeech/src/tensorflow/bazel-bin/tensorflow -L/build/deepspeech/src/tensorflow/bazel-bin/native_client  -ldeepspeech -ldeepspeech_utils -ltensorflow_cc -ltensorflow_framework  `pkg-config --libs sox`
/tmp/ccsWte2Z.o: In function `LocalDsSTT(DeepSpeech::Model&, short const*, unsigned long, int)':
client.cc:(.text+0x94): undefined reference to `DeepSpeech::Model::getInputVector(short const*, unsigned int, int, float**, int*, int*)'
client.cc:(.text+0xb9): undefined reference to `DeepSpeech::Model::infer(float*, int, int)'
/tmp/ccsWte2Z.o: In function `main':
client.cc:(.text+0x22e): undefined reference to `DeepSpeech::Model::Model(char const*, int, int, char const*, int)'
client.cc:(.text+0x288): undefined reference to `DeepSpeech::Model::enableDecoderWithLM(char const*, char const*, char const*, float, float, float)'
client.cc:(.text+0x7bc): undefined reference to `DeepSpeech::Model::~Model()'
client.cc:(.text+0x7e1): undefined reference to `DeepSpeech::Model::~Model()'
collect2: error: ld returned 1 exit status
make: *** [Makefile:22: deepspeech] Error 1
==> ERROR: A failure occurred in build().
    Aborting...
==> ERROR: Build failed, check /var/lib/archbuild/extra-x86_64/arch/build
lissyx commented 6 years ago

@NicoHood Please stick to v0.1.1, and document your issues exactly. From what I'm reading, you are mixing v0.1.1 with master TensorFlow? Please use the r1.4 branch from mozilla/tensorflow with DeepSpeech v0.1.1.
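
For reference, checking out that matching pair looks like this (just a sketch; the tag and branch names are the ones above):

# Sketch: fetch the matching sources named above
git clone --branch v0.1.1 https://github.com/mozilla/DeepSpeech.git
git clone --branch r1.4 https://github.com/mozilla/tensorflow.git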

NicoHood commented 6 years ago

I am happy to use a fixed tensorflow version/branch. The problem was that I did not know this branch corresponds to 0.1.1. Where can I find this information for future builds?

This branch fails at the version check:

==> Starting build()...
Extracting Bazel installation...
You have bazel 0.10.1- (@non-git) installed.
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished
............
Loading: 
Loading: 0 packages loaded
Loading: 0 packages loaded
ERROR: /build/deepspeech/src/tensorflow/WORKSPACE:15:1: Traceback (most recent call last):
    File "/build/deepspeech/src/tensorflow/WORKSPACE", line 15
        closure_repositories()
    File "/build/.cache/bazel/_bazel_builduser/1f26581d0edfc50ffeb635c4dee8caad/external/io_bazel_rules_closure/closure/repositories.bzl", line 69, in closure_repositories
        _check_bazel_version("Closure Rules", "0.4.5")
    File "/build/.cache/bazel/_bazel_builduser/1f26581d0edfc50ffeb635c4dee8caad/external/io_bazel_rules_closure/closure/repositories.bzl", line 172, in _check_bazel_version
        fail(("%s requires Bazel >=%s but was...)))
Closure Rules requires Bazel >=0.4.5 but was 0.10.1- (@non-git)
ERROR: Error evaluating WORKSPACE file
ERROR: /build/deepspeech/src/tensorflow/WORKSPACE:41:1: Traceback (most recent call last):
    File "/build/deepspeech/src/tensorflow/WORKSPACE", line 41
        tf_workspace()
    File "/build/deepspeech/src/tensorflow/tensorflow/workspace.bzl", line 146, in tf_workspace
        check_version("0.5.4")
    File "/build/deepspeech/src/tensorflow/tensorflow/workspace.bzl", line 56, in check_version
        fail("\nCurrent Bazel version is {}, ...))

Current Bazel version is 0.10.1- (@non-git), expected at least 0.5.4
ERROR: Error evaluating WORKSPACE file
ERROR: Skipping '//native_client:libctc_decoder_with_kenlm.so': error loading package 'external': Package 'external' contains errors
WARNING: Target pattern parsing failed.
ERROR: error loading package 'external': Package 'external' contains errors
INFO: Elapsed time: 1.351s
FAILED: Build did NOT complete successfully (0 packages loaded)
==> ERROR: A failure occurred in build().
    Aborting...
==> ERROR: Build failed, check /var/lib/archbuild/extra-x86_64/arch/build
lissyx commented 6 years ago

@NicoHood This is an upstream TensorFlow issue; you should try lower versions of Bazel. We stuck to 0.5.4 for some time, and that was working with this specific branch, and I explicitly remember that some people ran into issues back then with Bazel ~0.7.

lissyx commented 6 years ago

@NicoHood I know it might be inconvenient; this is also why I've opened https://github.com/mozilla/DeepSpeech/issues/1253 and related upstream issues, to see how we can improve things. In the specific case of Bazel versions, you should refer to the TensorFlow instructions, as we link them in our README: https://github.com/mozilla/DeepSpeech/blob/master/native_client/README.md#building

NicoHood commented 6 years ago

Hm, this is getting too complicated for what it's worth then. What about the master branch of deepspeech: which version of tensorflow can I compile it with? 1.5?

Also, we have tensorflow 1.5 in our official repositories; can I somehow reuse those compiled .so files? Compiling everything takes extremely long.

What about the changes you made in your mozilla branch? Can you push them upstream in a generic form so no special fork is required?

lissyx commented 6 years ago

@NicoHood Yes, current master is bound to r1.5. You need to rebuild, because we switched to monolithic builds. Our changes are easy to find: they're mostly about RPi3 cross-compilation and tfcompile. If you use upstream TensorFlow, it's going to choke on some definitions. Pushing this upstream is not that trivial ...

lissyx commented 6 years ago

@NicoHood Besides, I don't see what is complicated; just use a local install of Bazel v0.5.4 and that should work, playing with Bazel's --output_user_root and --output_base.
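
Something along these lines (a sketch; the installer name follows Bazel's release asset naming, so double-check it on their releases page):

# Sketch: local Bazel 0.5.4 install, kept apart from the system one
wget https://github.com/bazelbuild/bazel/releases/download/0.5.4/bazel-0.5.4-installer-linux-x86_64.sh
bash bazel-0.5.4-installer-linux-x86_64.sh --prefix=$HOME/bazel-0.5.4
# point its caches away from the system bazel's
$HOME/bazel-0.5.4/bin/bazel --output_user_root=$HOME/.cache/bazel-0.5.4 build ...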

lissyx commented 6 years ago

@NicoHood If you want to stick to upstream, you can just patch native_client/BUILD file to remove the definitions of deepspeech_model_core, tfcompile.config, tfcompile.model and libdeepspeech_model.so.

Building only our stuff from scratch (a CPU build) completes in about 600-800 secs on my desktop (i7-4790K).

NicoHood commented 6 years ago

How would I start the build then? I installed tensorflow and modified BUILD like this (not sure if that was correct):

diff --git a/native_client/BUILD b/native_client/BUILD
index 5d001c9..1f4a061 100644
--- a/native_client/BUILD
+++ b/native_client/BUILD
@@ -15,40 +15,9 @@ config_setting(
     }
 )

-tf_library(
-    name = "deepspeech_model_core",
-    cpp_class = "DeepSpeech::nativeModel",
-    # We don't need tests or benchmark binaries
-    gen_test=False, gen_benchmark=False,
-    # graph and config will be generated at build time thanks to the matching
-    # genrule.
-    graph = "tfcompile.model.pb",
-    config = "tfcompile.config.pbtxt",
-    # This depends on //tensorflow:rpi3 condition defined in mozilla/tensorflow
-    tfcompile_flags = select({
-        "//tensorflow:rpi3": str('--target_cpu="cortex-a53"'),
-        "//conditions:default": str('')
-    }),
-)
-
-genrule(
-    name = "tfcompile.config",
-    srcs = ["tfcompile.config.pbtxt.src"],
-    outs = ["tfcompile.config.pbtxt"],
-    cmd = "$(location :model_size.sh) $(SRCS) $(DS_MODEL_TIMESTEPS) $(DS_MODEL_FRAMESIZE) >$@",
-    tools = [":model_size.sh"]
-)
-
-genrule(
-    name = "tfcompile.model",
-    outs = ["tfcompile.model.pb"],
-    cmd = "cp $(DS_MODEL_FILE) $@"
-)
-
 tf_cc_shared_object(
     name = "libdeepspeech.so",
     srcs = ["deepspeech.cc", "deepspeech.h", "deepspeech_utils.h", "alphabet.h", "beam_search.h", "trie_node.h"] +
-           if_native_model(["deepspeech_model_core.h"]) +
            glob(["kenlm/lm/*.cc", "kenlm/util/*.cc", "kenlm/util/double-conversion/*.cc",
                  "kenlm/lm/*.hh", "kenlm/util/*.hh", "kenlm/util/double-conversion/*.h"],
                 exclude = ["kenlm/*/*test.cc", "kenlm/*/*main.cc"]) +
@@ -101,11 +70,6 @@ tf_cc_shared_object(
     defines = ["KENLM_MAX_ORDER=6"],
 )

-tf_cc_shared_object(
-    name = "libdeepspeech_model.so",
-    deps = [":deepspeech_model_core"]
-)
-
 # We have a single rule including c_speech_features and kissfft here as Bazel
 # doesn't support static linking in library targets.
[arch@talloniv DeepSpeech]$ bazel build -c opt --copt=-O3 //native_client:libctc_decoder_with_kenlm.so
ERROR: The 'build' command is only supported from within a workspace.

How would I start the build and link to the system tensorflow install?

lissyx commented 6 years ago

@NicoHood This is documented in native_client/README.md; you need to symlink from TensorFlow's source tree to native_client/: https://github.com/mozilla/DeepSpeech/blob/master/native_client/README.md#preparation

lissyx commented 6 years ago

@NicoHood Please note you still need to build using --config=monolithic --copt=-fvisibility=hidden for libdeepspeech.so.
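
Putting the two together, the sequence is roughly this (a sketch; assumes the DeepSpeech and tensorflow checkouts sit side by side):

# Sketch: symlink native_client into the TensorFlow tree, then build
cd tensorflow
ln -s ../DeepSpeech/native_client ./
./configure
bazel build --config=monolithic -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libdeepspeech.so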

NicoHood commented 6 years ago

But I can only do this while I am also building tensorflow. So I need to package tensorflow at the same time as packaging deepspeech. I thought the build could reuse an already installed tensorflow package, but that does not seem to be the case. Not sure how this can be sped up then.

NicoHood commented 6 years ago

The README does not mention the options you noted. It also does not state the Bazel version or the TensorFlow version. https://github.com/mozilla/DeepSpeech/tree/master/native_client#building

lissyx commented 6 years ago

@NicoHood No, you don't need to package tensorflow at the same time. It's all statically compiled into libdeepspeech.so

lissyx commented 6 years ago

@NicoHood As I said earlier, the Bazel version requirements are coming from TensorFlow, not from us.

lissyx commented 6 years ago

@NicoHood The flags are properly documented: https://github.com/mozilla/DeepSpeech/tree/master/native_client#building:

bazel build -c opt --copt=-O3 //native_client:libctc_decoder_with_kenlm.so
bazel build --config=monolithic -c opt --copt=-O3 --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:deepspeech_utils //native_client:generate_trie
NicoHood commented 6 years ago

I also tried the latest deepspeech master, which also fails, but this time with a runtime error:

$ ./deepspeech output_graph.pb test2.wav alphabet.txt lm.binary trie
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2018-02-26 13:52:36.973865: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Error: Alphabet size does not match loaded model: alphabet has size 1164, but model has 28 classes in its output. Make sure you're passing an alphabet file with the same size as the one used for training.
Loading the LM will be faster if you build a binary file.
Reading alphabet.txt
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
terminate called after throwing an instance of 'lm::FormatLoadException'
  what():  native_client/kenlm/lm/read_arpa.cc:65 in void lm::ReadARPACounts(util::FilePiece&, std::vector<long unsigned int>&) threw FormatLoadException.
first non-empty line was "a" not \data\. Byte: 218
Aborted (core dumped)

I am wondering why the alphabet now causes problems. Maybe you changed the format!? It seems it's better to wait for the next release. Feel free to contact me before you tag a new release; I am happy to test it for Arch Linux :)

lissyx commented 6 years ago

You are passing arguments in the wrong order; the wav should be the last one. You also have not set up git-lfs as documented, so the language model cannot be read correctly (the last error).

NicoHood commented 6 years ago

Oh, what a dumb mistake X_x. It now works:

$ ./deepspeech output_graph.pb alphabet.txt lm.binary trie test2.wav
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2018-02-26 14:22:52.601762: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
night for

Do I really need git-lfs? I only compiled deepspeech as described in the README; I did not train a model. I just used the model from 0.1.1.

Do you know how to get rid of the warning "Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA"? The processing is quite slow on my i7; I guess that is due to these missing optimizations?

lissyx commented 6 years ago

@NicoHood You need git-lfs to use the git-stored model, under data/lm. If you downloaded the deepspeech-models tarball, you can just use the lm.binary and trie from your extraction.
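
Something like this, roughly (a sketch; verify the exact asset name on the v0.1.1 release page):

# Sketch: fetch and extract the pre-trained models tarball
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.1.1/deepspeech-0.1.1-models.tar.gz
tar xvf deepspeech-0.1.1-models.tar.gz
# should yield output_graph.pb, alphabet.txt, lm.binary and trie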

Regarding your warning, that's because we document conservative optimizations. You have to weigh the level of optimization you pass to TensorFlow against what users might have.

So since you passed nothing except --copt=-O3, none of those instruction sets are enabled. TensorFlow r1.6 (upcoming) defaults to up to AVX: https://github.com/tensorflow/tensorflow/blame/26ae3287a12c71fccaec9ea74f55b6a51a3d33c6/RELEASE.md#L5

Our TaskCluster builds are also up to AVX, by building everything with:

--copt=-mtune=generic --copt=-march=x86-64 --copt=-msse --copt=-msse2 --copt=-msse3 --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx

Just be aware that if you enable some optimizations, then TensorFlow (and thus DeepSpeech) won't be able to run on hardware that does not have them. This is a limitation of TensorFlow :/.
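
If you want to match our builds, you would append those flags to the documented command, e.g. (a sketch):

# Sketch: the documented libdeepspeech.so build plus the AVX-enabling flags above
bazel build --config=monolithic -c opt --copt=-O3 --copt=-fvisibility=hidden \
  --copt=-mtune=generic --copt=-march=x86-64 --copt=-msse --copt=-msse2 --copt=-msse3 \
  --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx \
  //native_client:libdeepspeech.so //native_client:deepspeech_utils //native_client:generate_trie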

NicoHood commented 6 years ago

Thanks, the warnings have disappeared now, but the processing is still slow as it only runs on a single core. Is there any chance to do multi-core processing?

The available trained model: was this created from the user data of the Mozilla site, or from some LibriSpeech test data? On my PC it is not working as well as on someone else's setup. I am wondering if it's a problem with my microphone or his/my voice. He assumed the current model relies on very clear speech; is that true?

lissyx commented 6 years ago

@NicoHood It should leverage multithreading; at least our builds do. You might have to force linking with pthread. Regarding your other questions, I would advise having a look at other issues and at Discourse (https://discourse.mozilla.org/c/deep-speech), because this is going a bit further than your package. It could be the mic, yes; it also depends on how you perform the recording: bitrate, etc. You should use mono/16kHz. We have data showing that converting from other formats tricks the model (likely inaudible noise?).
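
For the recording side, a conversion with sox (which you already have as a dependency) would look like this, though as said, recording natively at mono/16kHz is preferable to converting (a sketch):

# Sketch: downmix/resample a recording to mono, 16 kHz, 16-bit
sox input.wav -r 16000 -c 1 -b 16 output.wav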

lissyx commented 6 years ago

@NicoHood As an aside, I've just closed https://github.com/mozilla/DeepSpeech/issues/1253, but I invite you to follow my suggestion in the latest comment (https://github.com/mozilla/DeepSpeech/issues/1253#issuecomment-368835718) so that we can improve the process for people doing important work like yours of packaging / contributing bindings.

NicoHood commented 6 years ago

1st comment: Yes, the pip installation works with multiple cores, it seems. If you are building with other options than those in the README, that's not a good idea. You should at least note both of them, so the results are reproducible and people like me are not puzzled by the bugs.

2nd comment: Doing more prereleases could help. This gives the devs a fixed state they can test against, instead of a random master commit. Also, it would help if you could name some deadlines for when a new version is expected.

For example, I will not try any work on deepspeech until the next release, as the current version is too hard to patch and I know the next version will work better (from the git master test). However, I will not test this in detail, as I don't know when the next release will happen and how much is about to change. Will it be within the next month, or later?

So prereleases and release dates would be nice. They don't have to be that accurate, just enough to understand when you wish to have feedback from the maintainers.

lissyx commented 6 years ago

@NicoHood Well, all the "release" build options are available to anyone, though they might not be "easy" to find: https://github.com/mozilla/tensorflow/blob/r1.5/tc-vars.sh#L76-L81 https://github.com/mozilla/tensorflow/blob/r1.5/tc-vars.sh#L97-L100

Putting that in the README is risky, because it might lead people into erroneous / incompatible build flags: so far, most of the people who had to rebuild were people with CPUs not supporting AVX / AVX2, so we should be careful in what we document there. It's easier to debug from the warning TensorFlow outputs when your CPU supports instructions that the build does not have, than to debug "which optimization flag is triggering this segfault".

Regarding deadlines, we started documenting things a bit better: https://github.com/mozilla/DeepSpeech/projects, but we don't have any date set, because it's hard to give anything, even an inaccurate estimate, so far. Would a date that is inaccurate by up to weeks make sense and help?

For your other comments, there should be no reason to patch anything. Building with https://github.com/mozilla/tensorflow should work flawlessly (it does on TaskCluster), and thus git pull on both repos should be enough. And yes, I'm sorry, there are prerequisites from TensorFlow and we cannot do anything about that. But I documented a simple way for you to install a different local version of Bazel to be able to build; this should work (it is what I do as well).

So, Bazel v0.5.4 + https://github.com/mozilla/tensorflow/tree/r1.4 + https://github.com/mozilla/DeepSpeech/tree/v0.1.1 and it must work :)
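
In PKGBUILD terms, pinning that pair would look like this (a sketch, following the source array style used earlier in this thread):

# Sketch: pin the known-good tag/branch pair in the source array
source=("deepspeech-0.1.1.tar.gz::https://github.com/mozilla/DeepSpeech/archive/v0.1.1.tar.gz"
        "git+https://github.com/mozilla/tensorflow.git#branch=r1.4")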

lissyx commented 6 years ago

For the pre-release process, can you elaborate a bit on your idea? I'm not sure I get the proper picture of the process that would help you.

NicoHood commented 6 years ago

About the build options: Someone on AUR wrote:

> also i suggest to change "export CC_OPT_FLAGS="-march=x86-64"" to "export CC_OPT_FLAGS="-march=native"" to enable ALL the optimization for your hardware

Maybe this would be best for all users; I have to test it.

About the pre-release process: let's say you want to release v0.2.0 in about two weeks and most/all of the features are integrated. Then you can create a new git tag and mark it as a prerelease on GitHub (for example 0.2.0-rc1). All maintainers can now test this fixed version and report bugs. They all test the same version, not some random commit on the master branch. This means that if the same bugs occur for multiple people they might be easier to find, and you can also compare against the next RC2 then.

lissyx commented 6 years ago

@NicoHood I second the -march=native removal. About the RC process, that might be a good idea, but it requires some work; I don't think we want to publish RC packages to PyPI/npm (which is what happens right now when we do a release).

lissyx commented 6 years ago

@NicoHood Actually, I see TensorFlow RC packages on PyPI, so ... we'll have to evaluate :)

AtosNicoS commented 6 years ago

@lissyx Could you please tag a new (pre-)release? It looks like the latest git version of deepspeech fixed lots of bugs and is also easier to build. It would help us test your software faster. Thanks in advance.

lissyx commented 6 years ago

@AtosNicoS I cannot right now; we don't have the infrastructure set up for that, but you can give current master a try. I hope to address that this week in https://github.com/mozilla/DeepSpeech/issues/1293, but I have to complete the ARM / ARM64 hardware part first, and I have to debug a power supply right now :)

AtosNicoS commented 6 years ago

Could you please give me a hint as to which tensorflow branch I need to use for building deepspeech master? r1.6, r1.7 or master?

This is the error I get with 1.6:

Analyzing: target //native_client:libctc_decoder_with_kenlm.so (2 packages loaded)
WARNING: /build/.cache/bazel/_bazel_builduser/1f26581d0edfc50ffeb635c4dee8caad/external/protobuf_archive/WORKSPACE:1: Workspace name in /build/.cache/bazel/_bazel_builduser/1f26581d0edfc50ffeb635c4dee8caad/external/protobuf_archive/WORKSPACE (@com_google_protobuf) does not match the name given in the repository's definition (@protobuf_archive); this will cause a build error in future versions
Analyzing: target //native_client:libctc_decoder_with_kenlm.so (10 packages loaded)
Analyzing: target //native_client:libctc_decoder_with_kenlm.so (18 packages loaded)
Analyzing: target //native_client:libctc_decoder_with_kenlm.so (44 packages loaded)
ERROR: /build/.cache/bazel/_bazel_builduser/1f26581d0edfc50ffeb635c4dee8caad/external/jpeg/BUILD:126:12: Illegal ambiguous match on configurable attribute "deps" in @jpeg//:jpeg:
@jpeg//:k8
@jpeg//:armeabi-v7a
Multiple matches are not allowed unless one is unambiguously more specialized.
ERROR: Analysis of target '//native_client:libctc_decoder_with_kenlm.so' failed; build aborted: 

/build/.cache/bazel/_bazel_builduser/1f26581d0edfc50ffeb635c4dee8caad/external/jpeg/BUILD:126:12: Illegal ambiguous match on configurable attribute "deps" in @jpeg//:jpeg:
@jpeg//:k8
@jpeg//:armeabi-v7a
Multiple matches are not allowed unless one is unambiguously more specialized.
INFO: Elapsed time: 5.139s
FAILED: Build did NOT complete successfully (45 packages loaded)
lissyx commented 6 years ago

@AtosNicoS DeepSpeech master goes with TensorFlow r1.6

AtosNicoS commented 6 years ago

Is the error I am seeing possibly because of the recent ARM patches? Those are new, and I am building against mozilla's tensorflow r1.6 branch.

lissyx commented 6 years ago

@AtosNicoS There was no error when I replied to your message :(. I misread your reply; since you are using the mozilla/tensorflow r1.6 branch, I'm pretty sure it's only because your Bazel is not v0.10.0.

lissyx commented 6 years ago

@AtosNicoS The v0.10.0 Bazel hint is documented at https://github.com/mozilla/DeepSpeech/blob/master/native_client/README.md#building, but I'm sad I missed the occasion to also update the doc to make clear that the mozilla/tensorflow repo should be used :(

AtosNicoS commented 6 years ago

I am using the latest bazel, 0.12.0; that is probably the problem. But I cannot simply downgrade on Arch Linux, as you never know if something else will then break. Why would such a minor update break the whole build system? Is there a way to fix it?

lissyx commented 6 years ago

@AtosNicoS I have no idea; we are tied to what TensorFlow depends on. Can't you just use a local install of the proper Bazel version, to avoid interfering with a system-wide installation? You should be able to get the installer from their releases and then install it in your home directory or somewhere else local. Then, using the --output_user_root and --output_base flags, you should be able to ensure it also does not interfere with any running system Bazel.

AtosNicoS commented 6 years ago

I was able to build it with the following PKGBUILD and a patch taken from the Arch Linux tensorflow PKGBUILD:

# Maintainer: Jonas Heinrich <onny@project-insanity.org>
# Contributor: Jonas Heinrich <onny@project-insanity.org>

pkgname=deepspeech
pkgver=v0.1.1.r67.gae146d0
pkgrel=1
pkgdesc="A TensorFlow implementation of Baidu's DeepSpeech architecture"
arch=('x86_64')
url="https://github.com/mozilla/DeepSpeech"
license=('MPL2')
makedepends=('bazel' 'python-numpy' 'python-pip' 'python-wheel' 'python-setuptools' 'git')
depends=('python-tensorflow' 'python-scipy' 'sox')
source=("git+https://github.com/mozilla/deepspeech.git"
        "git+https://github.com/mozilla/tensorflow.git#branch=r1.6"
        17508.patch)
sha512sums=('SKIP'
            'SKIP'
            '18e3b22e956bdd759480d2e94212eb83d6a59381f34bbc7154cadbf7f42686c2f703cc61f81e6ebeaf1da8dc5de8472e5afc6012abb1720cadb68607fba8e8e1')

pkgver() {
  cd "$pkgname"
  git describe --long --tags | sed 's/\([^-]*-g\)/r\1/;s/-/./g'
}

prepare() {
  patch -Np1 -i "${srcdir}/17508.patch" -d tensorflow
  cd "$srcdir/tensorflow"

  # These environment variables influence the behavior of the configure call below.
  export PYTHON_BIN_PATH=/usr/bin/python
  export USE_DEFAULT_PYTHON_LIB_PATH=1
  export TF_NEED_JEMALLOC=1
  export TF_NEED_GCP=0
  export TF_NEED_HDFS=0
  export TF_NEED_S3=0
  export TF_ENABLE_XLA=1
  export TF_NEED_GDR=0
  export TF_NEED_VERBS=0
  export TF_NEED_OPENCL=0
  export TF_NEED_MPI=0
  ln -sf ../deepspeech/native_client ./
}

build() {
  cd "$srcdir/tensorflow"
  export CC_OPT_FLAGS="-march=x86-64"
  export TF_NEED_CUDA=0
  ./configure
  # bazel build -c opt --copt=-O3 \
  #   //tensorflow:libtensorflow_cc.so \
  #   //tensorflow:libtensorflow_framework.so \
  #   //native_client:deepspeech \
  #   //native_client:deepspeech_utils \
  #   //native_client:ctc_decoder_with_kenlm \
  #   //native_client:generate_trie

  bazel build -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" //native_client:libctc_decoder_with_kenlm.so
  bazel build --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:deepspeech_utils //native_client:generate_trie

  cd "${srcdir}/deepspeech/native_client"
  make deepspeech
}

package() {
  cd "${srcdir}/deepspeech/native_client"
  PREFIX="${pkgdir}/usr" make install
}
From 340327dc8cc637fef01e66f7dd7cae68ce259b94 Mon Sep 17 00:00:00 2001
From: Yun Peng <pcloudy@google.com>
Date: Wed, 7 Mar 2018 13:50:31 +0100
Subject: [PATCH] jpeg.BUILD: Using --cpu instead of --android_cpu

---
 third_party/jpeg/jpeg.BUILD | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/third_party/jpeg/jpeg.BUILD b/third_party/jpeg/jpeg.BUILD
index 87a23925c43..4418ac32fc4 100644
--- a/third_party/jpeg/jpeg.BUILD
+++ b/third_party/jpeg/jpeg.BUILD
@@ -526,12 +526,12 @@ config_setting(

 config_setting(
     name = "armeabi-v7a",
-    values = {"android_cpu": "armeabi-v7a"},
+    values = {"cpu": "armeabi-v7a"},
 )

 config_setting(
     name = "arm64-v8a",
-    values = {"android_cpu": "arm64-v8a"},
+    values = {"cpu": "arm64-v8a"},
 )

 config_setting(
lissyx commented 6 years ago

Right, thanks! Looks like https://github.com/tensorflow/tensorflow/commit/340327dc8cc637fef01e66f7dd7cae68ce259b94 is an actual upstream patch :)

You only build the C++ codebase, no NodeJS nor Python package?

AtosNicoS commented 6 years ago

Not yet, as I am just analyzing how well the recognition works. The problem I have now is that my build only uses a single core, which makes it very slow. I must still be missing something in my build.

lissyx commented 6 years ago

@AtosNicoS Try forcing -lpthread; somehow Bazel seems to behave differently between 0.12.0 and 0.10.0. Maybe it's also just because of the optimization levels, and threading only kicks in when you leverage the SSE* and AVX stuff. You should hang out on Discourse; there is some feedback there on the quality of the model, and tips on recording audio, which might save you trouble.
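
One way to force it is at link time via Bazel's --linkopt (a sketch; whether it is needed seems to depend on the Bazel version):

# Sketch: force pthread linking for libdeepspeech.so
bazel build --config=monolithic -c opt --copt=-O3 --copt=-fvisibility=hidden \
  --linkopt=-lpthread //native_client:libdeepspeech.so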