mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0
25.41k stars 3.98k forks source link

Add CTC decoder builds for more Python versions #3596

Open lissyx opened 3 years ago

lissyx commented 3 years ago

We hard-coded the action call with one version https://github.com/mozilla/DeepSpeech/blob/5f566f442541d76f320ee081294cdef980c361d1/.github/workflows/macOS-amd64.yml#L54-L86 it would be nice to:

lissyx commented 3 years ago

@Ayushsunny This might be easier, if you are still willing to learn GitHub Actions at the same time

Ayushsunny commented 3 years ago

@Ayushsunny This might be easier, if you are still willing to learn GitHub Actions at the same time

Thank you so much as you want me to learn. It's so nice of you Sir @lissyx . and yes I want to learn, So what I have to do first is to understand this specific topic to fix this issue

lissyx commented 3 years ago

So what I have to do first to understand this specific topic to fix this issue

Read this code, get yourself into https://docs.github.com/en/actions/guides :)

lissyx commented 3 years ago

@Ayushsunny Have you been able to get a grasp of what needs to be done ?

lissyx commented 3 years ago

@Ayushsunny Ping?

Ayushsunny commented 3 years ago

I'm really sorry Sir @lissyx I was studying for my University Exam. Now I'm free and from tomorrow I'll start working on it.

lissyx commented 3 years ago

I'm really sorry Sir @lissyx I was studying for my University Exam. Now I'm free and from tomorrow I'll start working on it.

Thanks, I don't want to pressure you, it's fine if you can't take it, but we just need to know :)

Ayushsunny commented 3 years ago

So Sir, as per my understanding I think this should be the code which I have to add. @lissyx ?


jobs:
  build:
    runs-on: macos-10.15
    strategy:
      matrix:
        python-version: [ '2.x', '3.x', 'pypy-2.7', 'pypy-3.6', 'pypy-3.7' ]
    name: Python ${{ matrix.python-version }} sample
    steps:
      - uses: actions/checkout@v2
      - name: Setup python
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
          architecture: x64
      - run: python my_script.py `
lissyx commented 3 years ago

@Ayushsunny Sorry looks like my reply got lost ; that's the good basis, you'll have to build on top of that.

Ayushsunny commented 3 years ago

@Ayushsunny Sorry looks like my reply got lost ; that's the good basis, you'll have to build on top of that.

Which reply Sir? @lissyx and, may make a PR for this? or I need to improve the last code which I've sent .

lissyx commented 3 years ago

Which reply Sir? @lissyx

One I thought I had sent

and, may make a PR for this? or I need to improve the last code which I've sent .

What you shared is not what we need, but it's on the right path.

We need a job like the one I shared earlier, but on more versions of python.

Ayushsunny commented 3 years ago

Which reply Sir? @lissyx

One I thought I had sent

and, may make a PR for this? or I need to improve the last code which I've sent .

What you shared is not what we need, but it's on the right path.

We need a job like the one I shared earlier, but on more versions of python.

Okay @lissyx Sir I got it do we need like this right ?

  build-ctc-decoder: 
     name: "Build CTC decoder Python package for testing" 
     needs: [ swig_macOS ] 
     runs-on: macos-10.15 
     strategy:
      matrix:
        python-version: [3.6.8, 3.7.9, 3.8.8, 3.9.2]
     if: ${{ github.event_name == 'pull_request' }} 
     steps: 
       - uses: actions/checkout@v2 
         with: 
           fetch-depth: 0 
       - uses: ./.github/actions/install-python-upstream 
         with: 
           version:  ${{ matrix.python-version }}
       - run: | 
           python --version 
           pip --version 
       - uses: actions/download-artifact@v2 
         with: 
           name: "swig_macOS" 
           path: ${{ github.workspace }}/native_client/ds-swig/ 
       - run: | 
           ls -hal ${{ github.workspace }}/native_client/ds-swig/bin 
           ln -s ds-swig ${{ github.workspace }}/native_client/ds-swig/bin/swig 
           chmod +x ${{ github.workspace }}/native_client/ds-swig/bin/ds-swig ${{ github.workspace }}/native_client/ds-swig/bin/swig 
       - run: | 
           make -C native_client/ctcdecode/ \ 
             NUM_PROCESSES=$(sysctl hw.ncpu |cut -d' ' -f2) \ 
             bindings 
       - uses: actions/upload-artifact@v2 
         with: 
           name: "ds_ctcdecoder-test.whl" 
           path: ${{ github.workspace }}/native_client/ctcdecode/dist/*.whl 
       - run: | 
           make -C native_client/ctcdecode clean-keep-third-party 
Ayushsunny commented 3 years ago

But sir How I am supposed to fix this if I am not able to find this /macOS-amd64.yml file in master branch for more knowledge I am sticking the screenshots here :

Not available in the master branch -

Screenshot from 2021-04-23 12-01-37

see this is available in your branch -

Screenshot from 2021-04-23 12-01-58

dijksterhuis commented 3 years ago

@Ayushsunny looks like the original file has been renamed to build-and-test.yml.

I think this is the section you want to be looking at (it looks like what was originally posted by @lissyx after a very brief inspection).

Ayushsunny commented 3 years ago

Yes @dijksterhuis, Thank you so much. @lissyx Can I make a PR with matrix for all versions here

lissyx commented 3 years ago

Yes @dijksterhuis, Thank you so much. @lissyx Can I make a PR with matrix for all versions here

Yes. The whole idea of our switching to Github Actions is to allow people to run CI easily on their account as well:

Tada, things runs :). Experiment, ask us help and we can move forward.

Ayushsunny commented 3 years ago

Hello @lissyx Sir I have made a PR as a Demo Which I am going to add matrix version in every CTC decoders. could you please review and guide me that I am going right or not please? https://github.com/mozilla/DeepSpeech/pull/3640

lissyx commented 3 years ago

Hello @lissyx Sir I have made a PR as a Demo Which I am going to add matrix version in every CTC decoders. could you please review and guide me that I am going right or not please?

3640

I see you opened the PR against our repo ; if you open it against your repo, then you can run GitHub Actions without our approval and debug on your own side ;)

lissyx commented 3 years ago

@Ayushsunny That's a good start, you might want to add / experiment more test coverage on all those versions now

Ayushsunny commented 3 years ago

@Ayushsunny That's a good start, you might want to add / experiment more test coverage on all those versions now

Yes Sir, Thank you so much for your review

Ayushsunny commented 3 years ago

Sir I have also made a PR against my repo and I think I have updated all what we need https://github.com/Ayushsunny/DeepSpeech/pull/2

Ayushsunny commented 3 years ago

Sir @lissyx ?

lissyx commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Ayushsunny commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

lissyx commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

No, it is already NOT working on your own repo, please have a look at your Actions page, you have errors to fix.

Ayushsunny commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

No, it is already NOT working on your own repo, please have a look at your Actions page, you have errors to fix.

Sir, Everything is looking fine my side. I have added python version with matrix and changed versions of python that's it.

lissyx commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

No, it is already NOT working on your own repo, please have a look at your Actions page, you have errors to fix.

Sir, Everything is looking fine my side. I have added python version with matrix and changed versions of python that's it.

No, it is not:

Ayushsunny commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

No, it is already NOT working on your own repo, please have a look at your Actions page, you have errors to fix.

Sir, Everything is looking fine my side. I have added python version with matrix and changed versions of python that's it.

No, it is not:

* [Ayushsunny#2](https://github.com/Ayushsunny/DeepSpeech/pull/2) does not show any build and tests running

* https://github.com/Ayushsunny/DeepSpeech/actions/runs/793109099 shows failure: ` The workflow is not valid. .github/workflows/build-and-test.yml (Line: 169, Col: 27): A sequence was not expected .github/workflows/build-and-test.yml (Line: 710, Col: 27): A sequence was not expected`

Sir, even after updating as you've said, it's again throwing error something like that installer: Error - the package path specified was invalid: 'python.pkg'. here

lissyx commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

No, it is already NOT working on your own repo, please have a look at your Actions page, you have errors to fix.

Sir, Everything is looking fine my side. I have added python version with matrix and changed versions of python that's it.

No, it is not:

* [Ayushsunny#2](https://github.com/Ayushsunny/DeepSpeech/pull/2) does not show any build and tests running

* https://github.com/Ayushsunny/DeepSpeech/actions/runs/793109099 shows failure: ` The workflow is not valid. .github/workflows/build-and-test.yml (Line: 169, Col: 27): A sequence was not expected .github/workflows/build-and-test.yml (Line: 710, Col: 27): A sequence was not expected`

Sir, even after updating as you've said, it's again throwing error something like that installer: Error - the package path specified was invalid: 'python.pkg'. here

This is specific to macOS that does not use the same action. You need three-digits versions number for it.

Ayushsunny commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

No, it is already NOT working on your own repo, please have a look at your Actions page, you have errors to fix.

Sir, Everything is looking fine my side. I have added python version with matrix and changed versions of python that's it.

No, it is not:

* [Ayushsunny#2](https://github.com/Ayushsunny/DeepSpeech/pull/2) does not show any build and tests running

* https://github.com/Ayushsunny/DeepSpeech/actions/runs/793109099 shows failure: ` The workflow is not valid. .github/workflows/build-and-test.yml (Line: 169, Col: 27): A sequence was not expected .github/workflows/build-and-test.yml (Line: 710, Col: 27): A sequence was not expected`

Sir, even after updating as you've said, it's again throwing error something like that installer: Error - the package path specified was invalid: 'python.pkg'. here

This is specific to macOS that does not use the same action. You need three-digits versions number for it.

Hello Sir, I'm bothering you too much today see again it throws some new error which was not added by me https://github.com/Ayushsunny/DeepSpeech/runs/2492609550?check_suite_focus=true

lissyx commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

No, it is already NOT working on your own repo, please have a look at your Actions page, you have errors to fix.

Sir, Everything is looking fine my side. I have added python version with matrix and changed versions of python that's it.

No, it is not:

* [Ayushsunny#2](https://github.com/Ayushsunny/DeepSpeech/pull/2) does not show any build and tests running

* https://github.com/Ayushsunny/DeepSpeech/actions/runs/793109099 shows failure: ` The workflow is not valid. .github/workflows/build-and-test.yml (Line: 169, Col: 27): A sequence was not expected .github/workflows/build-and-test.yml (Line: 710, Col: 27): A sequence was not expected`

Sir, even after updating as you've said, it's again throwing error something like that installer: Error - the package path specified was invalid: 'python.pkg'. here

This is specific to macOS that does not use the same action. You need three-digits versions number for it.

Hello Sir, I'm bothering you too much today see again it throws some new error which was not added by me https://github.com/Ayushsunny/DeepSpeech/runs/2492609550?check_suite_focus=true

it is failing because trying to rebuild numpy

Please investigate what versions of numpy provides a wheel on pypi that works for python 3.8 and then update the env variables NUMPY_BUILD_VERSION / NUMPY_DEP_VERSION in https://github.com/mozilla/DeepSpeech/blob/master/.github/actions/numpy_vers/action.yml

Ayushsunny commented 3 years ago

Sir @lissyx ?

It looks good but I don't see any CI run ?

Sir, Can you approve the PR which was made against the Original Repo. then I think it will run

No, it is already NOT working on your own repo, please have a look at your Actions page, you have errors to fix.

Sir, Everything is looking fine my side. I have added python version with matrix and changed versions of python that's it.

No, it is not:

* [Ayushsunny#2](https://github.com/Ayushsunny/DeepSpeech/pull/2) does not show any build and tests running

* https://github.com/Ayushsunny/DeepSpeech/actions/runs/793109099 shows failure: ` The workflow is not valid. .github/workflows/build-and-test.yml (Line: 169, Col: 27): A sequence was not expected .github/workflows/build-and-test.yml (Line: 710, Col: 27): A sequence was not expected`

Sir, even after updating as you've said, it's again throwing error something like that installer: Error - the package path specified was invalid: 'python.pkg'. here

This is specific to macOS that does not use the same action. You need three-digits versions number for it.

Hello Sir, I'm bothering you too much today see again it throws some new error which was not added by me https://github.com/Ayushsunny/DeepSpeech/runs/2492609550?check_suite_focus=true

it is failing because trying to rebuild numpy

Please investigate what versions of numpy provides a wheel on pypi that works for python 3.8 and then update the env variables NUMPY_BUILD_VERSION / NUMPY_DEP_VERSION in https://github.com/mozilla/DeepSpeech/blob/master/.github/actions/numpy_vers/action.yml

Sir, I have tried to find out every solution for this but I am not able to.

are we going to add this version ?? Screenshot from 2021-05-04 18-08-20

lissyx commented 3 years ago

@Ayushsunny Do you understand the problems stated in https://github.com/mozilla/DeepSpeech/blob/master/.github/actions/numpy_vers/README.md and how we solve them ?

Ayushsunny commented 3 years ago

@lissyx , Yes Sir that we have to avoid having to rebuild numpy so we stick to versions where there is an existing upstream wheel file.

lissyx commented 3 years ago

Hello @lissyx Sir are you free to guide me now for this ? Please

I have already explained to you, I can't explain more if you don't tell precisely what you don't understand. You need to find the existing wheels for that platform that are ABI compatible and available for the matching python version

hmen97 commented 3 years ago

@lissyx I went through this thread, spent some time looking at github actions, came up with this pull request hmen97#2.

CTC decoder has been built for

The test python bindings step for each platform is running into an error or being skipped, logs below:

Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple, https://lissyx.github.io/deepspeech-python-wheels/ Processing /home/runner/work/DeepSpeech/DeepSpeech/tmp/deepspeech_tflite-0.10.0a3-cp37-cp37m-linux_aarch64.whl Collecting numpy>=1.19.3 (from deepspeech-tflite==0.10.0a3) Could not find a version that satisfies the requirement numpy>=1.19.3 (from deepspeech-tflite==0.10.0a3) (from versions: 1.17.0) No matching distribution found for numpy>=1.19.3 (from deepspeech-tflite==0.10.0a3)

Will it be solved if I add the required prebuilt numpy wheel from pypi(for each python version) in this project or does some other change have to be made while saving artifacts?

lissyx commented 3 years ago

@lissyx I went through this thread, spent some time looking at github actions, came up with this pull request hmen97#2.

This change does not look super good at a first glance, you bump the minimal versions on many linux versions for example, this is going to break backward compatibility

CTC decoder has been built for

* [3.6, 3.7, 3.8, 3.9] build-ctc-decoder-Linux

* [3.6.8, 3.7.9, 3.8.8, 3.9.2] build-ctc-decoder-Windows

* [3.6.8, 3.7.9, 3.8.8, 3.9.2] build-ctc-decoder-macos

The test python bindings step for each platform is running into an error or being skipped, logs below:

Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple, https://lissyx.github.io/deepspeech-python-wheels/ Processing /home/runner/work/DeepSpeech/DeepSpeech/tmp/deepspeech_tflite-0.10.0a3-cp37-cp37m-linux_aarch64.whl Collecting numpy>=1.19.3 (from deepspeech-tflite==0.10.0a3) Could not find a version that satisfies the requirement numpy>=1.19.3 (from deepspeech-tflite==0.10.0a3) (from versions: 1.17.0) No matching distribution found for numpy>=1.19.3 (from deepspeech-tflite==0.10.0a3)

I can't give any definitive answer only from that line, I need to know how you repro that ...

Will it be solved if I add the required prebuilt numpy wheel from pypi(for each python version) in this project or does some other change have to be made while saving artifacts?

Those prebuilt wheels were for Aarch64 only. Without context, I have no idea what you are doing.

hmen97 commented 3 years ago

This change does not look super good at a first glance, you bump the minimal versions on many linux versions for example, this is going to break backward compatibility

Used this repo for minimum recommended version of numpy. I can run more tests to find the minimum stable version for each platform&py-version combination.

I can't give any definitive answer only from that line, I need to know how you repro that ...

I just changed the numpy versions in .github/actions/numpy_vers/action.yml and in github/workflows/build-and-test.yml I added python versions in the build ctc decoder matrix strategy. On creating a pull request to my forked master branch, the checks are triggered. Basically was checking how everything works, or rather where and why it breaks for different versions.

Those prebuilt wheels were for Aarch64 only. Without context, I have no idea what you are doing

My bad the wheels are already present here, when I checked it was down yesterday probably because of the outage. But let's say we need a lower version of aarch64 numpy wheel, do we stick to what numpy has wheels for(1.19.2 onwards) or build our own like you did

lissyx commented 3 years ago

This change does not look super good at a first glance, you bump the minimal versions on many linux versions for example, this is going to break backward compatibility

Used this repo for minimum recommended version of numpy. I can run more tests to find the minimum stable version for each platform&py-version combination.

nice repo, is this official? it seems to only focus on arm/aarch64 though

it still does not explain why you push everything with a bare minimum of 1.19.x on linux for example

I can't give any definitive answer only from that line, I need to know how you repro that ...

I just changed the numpy versions in .github/actions/numpy_vers/action.yml and in github/workflows/build-and-test.yml I added python versions in the build ctc decoder matrix strategy. On creating a pull request to my forked master branch, the checks are triggered. Basically was checking how everything works, or rather where and why it breaks for different versions.

Please give me a link to the errors ?

Those prebuilt wheels were for Aarch64 only. Without context, I have no idea what you are doing

My bad the wheels are already present here, when I checked it was down yesterday probably because of the outage. But let's say we need a lower version of aarch64 numpy wheel, do we stick to what numpy has wheels for(1.19.2 onwards) or build our own like you did

aarch64 wheels is complicated, the packages built are there for automation purpose only ; it is up to the users to handle the situation, because pypi does not accept aarch64 wheels for linux

lissyx commented 3 years ago

So: https://github.com/hmen97/DeepSpeech/runs/2781014601?check_suite_focus=true#step:10:152

This is exactly what I was telling, you bumped linux minimal deps with no justifications, so the packages on https://github.com/lissyx/deepspeech-python-wheels built only for CI are not met.

lissyx commented 3 years ago

I can run more tests to find the minimum stable version for each platform&py-version combination.

At some point, I think what should be done is using this to replace the hard-coded list as much as possible, instead. And have ways to reproduce what you did ; because so far, I only see versions being changed with nothing to justify, and breaking backward compat.

lissyx commented 3 years ago

aarch64 wheels is complicated, the packages built are there for automation purpose only ; it is up to the users to handle the situation, because pypi does not accept aarch64 wheels for linux

Looks like now PyPi accepts aarch64 for wheels on linux. If you feel up to it, please:

hmen97 commented 3 years ago

This is exactly what I was telling, you bumped linux minimal deps with no justifications, so the packages on https://github.com/lissyx/deepspeech-python-wheels built only for CI are not met

Oh yeah, I just bumped that up to the min numpy version from the versions in specific platform & instruction set combination. Will revert that to what it was before.

Looks like now PyPi accepts aarch64 for wheels on linux. If you feel up to it, please file an issue to fix that add proper manylinux support to ensure we meet the requirements

Yeah I can do that, but I just need one clarification, the host system is a ubuntu 20.04 x86_64 , the arm7 and aarch64 are mounts. When building python bindings I can only see is the numpy build and dep version being passed as params, how does the right wheel get selected in case of varying architectures?

lissyx commented 3 years ago

Yeah I can do that, but I just need one clarification, the host system is a ubuntu 20.04 x86_64 , the arm7 and aarch64 are mounts. When building python bindings I can only see is the numpy build and dep version being passed as params, how does the right wheel get selected in case of varying architectures?

I'm not sure I get your question. Python wheel are cross-compiled, as libdeepspeech is

lissyx commented 3 years ago

it is built here for linux/armv7 for example: https://github.com/mozilla/DeepSpeech/blob/9e67724d39ff89f191f7052e653aaddac7c90b19/.github/workflows/build-and-test.yml#L2280-L2353

lissyx commented 3 years ago

And specifics are handled in https://github.com/mozilla/DeepSpeech/blob/9e67724d39ff89f191f7052e653aaddac7c90b19/.github/workflows/build-and-test.yml#L2343-L2349

lissyx commented 3 years ago

The action is here https://github.com/mozilla/DeepSpeech/blob/master/.github/actions/python-build/action.yml and it relies on makefile level abstraction of the cross-compilation of the bindings

lissyx commented 3 years ago

The build is done here https://github.com/mozilla/DeepSpeech/blob/master/native_client/python/Makefile#L13 and you can see we depend on a lot of env variables

they are being set in https://github.com/mozilla/DeepSpeech/blob/master/native_client/definitions.mk and for example you can see the ones for linux/armv7 in https://github.com/mozilla/DeepSpeech/blob/master/native_client/definitions.mk#L72-L92

lissyx commented 3 years ago

The build is done here https://github.com/mozilla/DeepSpeech/blob/master/native_client/python/Makefile#L13 and you can see we depend on a lot of env variables

they are being set in https://github.com/mozilla/DeepSpeech/blob/master/native_client/definitions.mk and for example you can see the ones for linux/armv7 in https://github.com/mozilla/DeepSpeech/blob/master/native_client/definitions.mk#L72-L92

And as you can see, this is where we set some PYTHON_ variables to properly cross-compile:

hmen97 commented 3 years ago

they are being set in https://github.com/mozilla/DeepSpeech/blob/master/native_client/definitions.mk and for example you can see the ones for linux/armv7 in https://github.com/mozilla/DeepSpeech/blob/master/native_client/definitions.mk#L72-L92

I followed the calls all the way to the native_client/python/Makefile , didn't notice the definitions.mk. Ok so I think I get it now, correct me if I'm wrong, you set the architecture to linux_aarch64 here but since PyPi didn't have any numpy wheel for aarch64, it gets the wheel from here only supporting 1.14.2 or 1.17.0 for linux aarch64 which also explains this

So based on the os and architecture, if I create another case for Linux:aarch64 here, and change linux_aarch64 to manylinux2014_aarch64, we will be using the PyPi supported wheels for numpy