mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0

Cannot install Deepspeech using current instructions for Python 3.6.6 on Windows 10, possibly due to, "it is not compatible with this Python" #1521

Closed the-nose-knows closed 6 years ago

the-nose-knows commented 6 years ago

The current instructions are either incomplete or incorrect for a Windows desktop environment. From the error output, it appears Windows is not supported at all because no Windows wheels (.whl files) are published, which would have been nice to know before I spent all this time trying to get my installation up and running. If that is the case, please, please, please document explicitly that DeepSpeech is not available on Windows in a conspicuous location, such as the main page on GitHub.

I installed Python 3.6.6 with all installation options enabled via the Custom Installation options. I couldn't install DeepSpeech, so I updated pip from version 10.x to version 18.0.

I was able to install virtualenv and start it up without issues. However, I still could not download deepspeech with pip, so I tried again with increased verbosity. You can see the full output of the version queries and the verbose install request in my gist here: https://gist.github.com/the-nose-knows/d06d6ad580125e8125ae994f7d12fbb3
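For context on the "it is not compatible with this Python" message: pip picks a wheel by matching tags encoded in the file name (PEP 427) against the running interpreter and platform. The sketch below illustrates only the platform part of that check, using the standard library. The wheel file names are illustrative assumptions, not a verified listing of what was published, and `platform_matches` is a hypothetical helper that is far simpler than pip's real tag matching.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def platform_matches(filename, platform_prefix):
    """Hypothetical helper: does any platform tag start with the prefix?"""
    # Compressed tag sets are separated by "." inside one field.
    tags = parse_wheel_filename(filename)["platform"].split(".")
    return any(t == "any" or t.startswith(platform_prefix) for t in tags)


# Illustrative file names only (assumption, not a real index listing).
candidate_wheels = [
    "deepspeech-0.1.1-cp36-cp36m-manylinux1_x86_64.whl",
    "deepspeech-0.1.1-cp36-cp36m-macosx_10_10_x86_64.whl",
]

# Windows platform tags start with "win" (win32, win_amd64).
usable_on_windows = [w for w in candidate_wheels if platform_matches(w, "win")]
print(usable_on_windows)
```

With only Linux and macOS wheels in the list, nothing matches a Windows platform tag, which is the situation pip reports as an incompatible distribution.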

lissyx commented 6 years ago

We never mentioned Windows support, whereas we explicitly document macOS and Linux here: https://github.com/mozilla/DeepSpeech/blob/master/README.md

The release pages also explicitly list the supported platforms: https://github.com/mozilla/DeepSpeech/releases/tag/v0.1.1

lissyx commented 6 years ago

@the-nose-knows Also, some people on Discourse reported being able to set it up through the Windows Subsystem for Linux, if that is of any help. It's very unlikely that training will work on that kind of setup, though.

lissyx commented 6 years ago

@the-nose-knows Maybe you could send a PR making that explicit in the documentation? It was obvious to us that Windows has not been supported so far, so we didn't see why it would be important to state it.

lissyx commented 6 years ago

There's also issue #1123 open, in case anyone is interested in starting to hack on that. Back then, TensorFlow on Windows was not really in good shape, and our current focus is not yet on that platform :)

the-nose-knows commented 6 years ago

@lissyx I don't see anything on the pages you linked to (correction: I see it on the Release page) that lists the supported platforms. I see platforms mentioned for certain use cases and scenarios, but that's it. I wouldn't assume a lack of Windows support for anything unless I was neck-deep in *nix territory already. That wasn't the case when I ran across DeepSpeech.

update 1: I did a documentation PR to make it more conspicuous. If someone doesn't see that... well... tough luck, skimmer? Lol

update 2: I've installed deepspeech via the Ubuntu distribution for the Windows Subsystem for Linux, available on the Microsoft Store. I haven't used deepspeech with it yet, but I did finally get it installed. Because of my work's proxy, I had to update /etc/apt/apt.conf.d/proxy.conf for apt-get and set the http_proxy and https_proxy env-vars in /etc/environment.
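A rough sketch of what those two proxy settings typically look like, written as a small Python helper that only renders the text and does not touch any files. The proxy address `http://proxy.example.com:8080` is a placeholder, `render_proxy_config` is a hypothetical helper, and the exact lines used in this setup are not shown in the thread.

```python
def render_proxy_config(proxy_url):
    """Return the text for the apt proxy file and for /etc/environment."""
    # apt reads proxy settings from Acquire::<scheme>::Proxy entries.
    apt_conf = (
        'Acquire::http::Proxy "{0}";\n'
        'Acquire::https::Proxy "{0}";\n'
    ).format(proxy_url)
    # /etc/environment holds plain KEY="value" lines.
    environment = (
        'http_proxy="{0}"\n'
        'https_proxy="{0}"\n'
    ).format(proxy_url)
    return {
        "/etc/apt/apt.conf.d/proxy.conf": apt_conf,
        "/etc/environment": environment,
    }


# Placeholder proxy address (assumption); substitute the real one.
configs = render_proxy_config("http://proxy.example.com:8080")
for path, text in configs.items():
    print("# " + path)
    print(text)
```

Writing the rendered text to those paths requires root, and a new login shell is generally needed before the /etc/environment values take effect.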

the-nose-knows commented 6 years ago

Final Update: On Windows 10, I got deepspeech working on the Ubuntu distro for the Windows Subsystem for Linux, available on the Microsoft Store.

My primary point of friction was setting all the env-vars needed, as it had been a couple of years since I had even touched a Linux distro. First I needed to configure the proxy for apt-get, then the general env-vars used by many *nix apps, i.e. http_proxy and https_proxy.

After that I was able to follow the existing documentation and got it working against one of the smoke_test files :)

username@DESKTOP-81FC02N:~/DeepSpeech$ deepspeech models/output_graph.pb data/smoke_test/LDC93S1.wav models/alphabet.txt models/lm.binary models/trie
Loading model from file models/output_graph.pb
2018-09-06 19:03:48.804771: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Loaded model in 0.553s.
Loading language model from files models/lm.binary models/trie
Loaded language model in 0.948s.
Running inference.
she had a ducsuotangresywathorerall year
Inference took 6.056s for 2.925s audio file.
tosbanzai@DESKTOP-81FC02N:~/DeepSpeech$

Having listened to that smoke-test file, I consider it a bonus that it got any of the words correct. The audio wasn't very clear, so good job deepspeech! ^^
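The timing line in that output also gives a quick way to gauge speed: dividing inference time by audio duration yields the real-time factor, where values above 1 mean slower than real time. A minimal sketch that parses the logged line as it appears above (the regular expression assumes exactly that wording):

```python
import re

# Timing line copied from the run above.
log_line = "Inference took 6.056s for 2.925s audio file."

match = re.search(r"Inference took ([\d.]+)s for ([\d.]+)s audio file", log_line)
inference_s, audio_s = (float(g) for g in match.groups())

# Real-time factor: processing time per second of audio.
real_time_factor = inference_s / audio_s
print("real-time factor: {:.2f}".format(real_time_factor))  # prints 2.07
```

So this particular run took roughly twice the clip's duration, noting that the log also reports the TensorFlow binary was not compiled to use AVX2 or FMA.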

lissyx commented 6 years ago

Closed by #1523
