alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Plan for the next release #24

Open nshmyrev opened 4 years ago

nshmyrev commented 4 years ago

Help is very much needed on all of this; any contribution will be appreciated.

dtreskunov commented 4 years ago

I can probably help with building for more platforms, including Windows and different flavors of ARM.

One approach is to use cross-compiler images from dockcross.

Let me know if you would consider merging a PR using GitHub actions to automate building/uploading Python wheels.

nshmyrev commented 4 years ago

> Let me know if you would consider merging a PR using GitHub actions to automate building/uploading Python wheels.

I checked GitHub Actions and didn't find any critical advantages over Travis. You can try it if you want, of course.

dtreskunov commented 4 years ago

I made some progress on getting a Windows build out. I'm currently blocked on some errors when compiling OpenFST (opened this issue: https://github.com/kkm000/openfst/issues/25).

This is much more complicated than expected :)

nshmyrev commented 4 years ago

@dtreskunov Thank you. For the Windows build I would really consider MinGW with Anaconda; that will be much easier than stock Python with VS. I think an Anaconda build should be enough. We don't have to spend much time on it.

nshmyrev commented 4 years ago

Also, as far as I know, @kkm000 uses a very recent OpenFST; you might have better chances with openfst-1.6.7 instead of the 1.7-something version he is using.

dtreskunov commented 4 years ago

Thanks. I really think I'm very close to getting it working. I got the Python wheels compiled with VS 2017; however, when running import vosk, I was getting an ImportError saying that a DLL could not be loaded. That's when I found out that VS 2015 must be used to build Python extensions. However, it won't build OpenFST :)

I'll try 1.6.7 when I get a chance.

On Sun, Feb 16, 2020, 12:37 Nickolay V. Shmyrev notifications@github.com wrote:

> Also, as far as I know @kkm000 uses very recent openfst, you might have better chances with openfst-1.6.7 instead of 1.7 something he is using.

nshmyrev commented 4 years ago

@dtreskunov Ah, I see. Given that OpenFST uses C++11, there is no chance we can make it work. VS14 simply doesn't support C++11 well enough.

https://docs.microsoft.com/en-us/cpp/overview/visual-cpp-language-conformance?view=vs-2019

Consider mingw and anaconda.

dtreskunov commented 4 years ago

I was able to fix it thanks to this article. It turns out that the ImportError was getting thrown because the generated _vosk.pyd depended on libopenblas.dll. This despite having provided libopenblas.lib to the linker. By putting libopenblas.dll and its dependencies into the .whl, I was able to get it to work.

The resulting .whl installs and works fine in my plain Python environment: no anaconda, just regular Windows x86-64 executable installer.

I would appreciate it if you could test it: vosk-0.3.1_dtreskunov_183_ga8b2c22-cp38-cp38-win_amd64.whl.zip

I'm not sure why it works. It's built using Visual Studio 15 2017, despite the requirement to use Visual Studio 14 2015.

I'm going to clean up the scripts a bit and will send a PR soon.
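
For readers following along, the fix described in this comment (putting libopenblas.dll and its dependencies into the .whl) can be sketched in plain Python. This is only an illustration, not the script used for the build above: the function names `record_entry` and `bundle_into_wheel` are hypothetical, and the sketch assumes the DLLs should sit in the same package directory as the extension module and that the wheel's RECORD file uses simple unquoted lines.

```python
import base64
import hashlib
import zipfile
from pathlib import Path


def record_entry(arcname, data):
    """Build one RECORD line: path, URL-safe base64 SHA-256 (no padding), size."""
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
    return f"{arcname},sha256={digest.decode('ascii')},{len(data)}"


def bundle_into_wheel(src_whl, dst_whl, extra_files, package_dir):
    """Copy src_whl to dst_whl, adding extra_files under package_dir.

    Hypothetical helper: every added file is also listed in the wheel's
    RECORD so that installers can verify it.
    """
    with zipfile.ZipFile(src_whl) as src, \
            zipfile.ZipFile(dst_whl, "w", zipfile.ZIP_DEFLATED) as dst:
        record_name = next(
            n for n in src.namelist() if n.endswith(".dist-info/RECORD")
        )
        # RECORD lists itself with an empty hash and size; keep that line last.
        self_line = f"{record_name},,"
        lines = [
            ln for ln in src.read(record_name).decode("utf-8").splitlines()
            if ln and ln != self_line
        ]
        # Copy every existing member except RECORD, which is rewritten below.
        for item in src.infolist():
            if item.filename != record_name:
                dst.writestr(item, src.read(item.filename))
        # Add the extra DLLs next to the extension module.
        for path in extra_files:
            data = Path(path).read_bytes()
            arcname = f"{package_dir}/{Path(path).name}"
            dst.writestr(arcname, data)
            lines.append(record_entry(arcname, data))
        lines.append(self_line)
        dst.writestr(record_name, "\n".join(lines) + "\n")
```

A call might look like `bundle_into_wheel("in.whl", "out.whl", ["libopenblas.dll"], "vosk")`, where the file names and the `vosk` package directory are placeholders for whatever the real build produces.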

bedapudi6788 commented 4 years ago

@nshmyrev I can take "Implement more pythonic API", "Automatic download of the model" and "Basic unit tests" if you are still looking for help.

nshmyrev commented 4 years ago

@bedapudi6788 that would be great!

bedapudi6788 commented 4 years ago

@nshmyrev this is my idea of "pythonic" api.

import vosk

# Print the available models
print(vosk.list_models())

# Automatically downloads the model if it is not found locally
asr = vosk.load("en-us")

# word_list is optional (example values)
word_list = ["yes", "no"]

# If stream=True, returns an iterator over partial results
# If stream=False (the default), returns the final result
result = asr.recognize('wav_file_path', word_list, stream=True)

Let me know if this is OK or if any changes are required.
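
One way the `stream` flag proposed above could behave is sketched below. This is a rough illustration under assumptions, not the project's API: `recognize` and `_results` are hypothetical names, and the recognizer is passed in as any object exposing `AcceptWaveform`, `Result`, `PartialResult` and `FinalResult` methods that return JSON strings with a "text" or "partial" key (the shape vosk's `KaldiRecognizer` is assumed to have). Model loading, automatic download and the optional word list are left to whoever constructs the recognizer.

```python
import json
import wave


def _results(wav_path, recognizer, chunk_frames):
    """Feed the WAV file chunk by chunk, yielding one result dict per chunk."""
    with wave.open(wav_path, "rb") as wf:
        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            if recognizer.AcceptWaveform(data):
                # An utterance was completed: dict with a "text" key.
                yield json.loads(recognizer.Result())
            else:
                # Still inside an utterance: dict with a "partial" key.
                yield json.loads(recognizer.PartialResult())
    # Flush whatever is left after the last chunk.
    yield json.loads(recognizer.FinalResult())


def recognize(wav_path, recognizer, stream=False, chunk_frames=4000):
    """Return an iterator of results if stream=True, else one combined dict."""
    results = _results(wav_path, recognizer, chunk_frames)
    if stream:
        return results
    # Non-streaming: join the text of every completed utterance.
    texts = [r["text"] for r in results if r.get("text")]
    return {"text": " ".join(texts)}
```

The generator lives in a separate helper because a function that contains `yield` always returns a generator, so a single function could not return a plain result when `stream=False`.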

nshmyrev commented 4 years ago

@bedapudi6788 I have created a separate issue about this: https://github.com/alphacep/vosk-api/issues/31, please check