abdeladim-s / pywhispercpp

Python bindings for whisper.cpp
https://abdeladim-s.github.io/pywhispercpp/
MIT License
180 stars 26 forks source link

Use git describe to get version and bump actions to newest versions too #79

Closed PiotrCzapla closed 3 weeks ago

PiotrCzapla commented 1 month ago

hopefully this will work and partially address #78

PiotrCzapla commented 1 month ago

Will this solve #78 character limit issue on windows as well ?

Sure it wont. I meant it will give a workaround for windows users, they will be able to use pre complied wheels, especially once we provide versions with 'cuda' and 'vulkan' (Intel) support.

PiotrCzapla commented 1 month ago

It seems pip don't like '-' in the version. Here are few recent version from torch pip list:

torch-2.5.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
torch-2.5.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
torch-2.6.0.dev20240914+cpu-cp310-cp310-linux_x86_64.whl
torch-2.6.0.dev20240914+cpu-cp311-cp311-linux_x86_64.whl
torch-2.6.0.dev20240914+cpu-cp312-cp312-linux_x86_64.whl
torch-2.6.0.dev20240914+cpu-cp39-cp39-linux_x86_64.whl

They store git sha in the version.py, like so:

from typing import Optional

__all__ = ['__version__', 'debug', 'cuda', 'git_version', 'hip']
__version__ = '2.6.0.dev20241006'
debug = False
cuda: Optional[str] = None
git_version = 'c9a3e50fbfa49981ec176eb1845fd82b3cbe08da'
hip: Optional[str] = None

But that is for the pulished artifacts, where they set the versioning numbers through CI using environment. When wheels are build locally they use something like this: 2.6.0a0+gitcbe08da Where 2.6.0a0 comes from the version.txt file that is being checked in to the repository, and the rest is git / hg commit sha sum.

I think we could do something similar here, I will give it a try shortly.

PiotrCzapla commented 1 month ago

There is PEP 440 for versions, in a gist a0 stands for first alpha release, + denotes local version. https://peps.python.org/pep-0440/ So I think using the same version numbersas pytorch for alpha releases make sense.

PiotrCzapla commented 1 month ago

There is something wrong with wheel build after upgrade of actions.

Not sure why this errors apart, but apart of that the version code seems ok.

abdeladim-s commented 1 month ago

The build was failing because there are some issues with cbuildhweel when cross compiling to other architectures like arm. I fixed that, but the build still failing because you bumped the version of upload-artifact to v4! This is a known issue with that action and has nothing to do with the Python package version itself, but with the name of the "artifact" file. I am still wondering why we would need to bump this action to v4 if v3 was working as expected ? Do you think it's worth it ?

PiotrCzapla commented 1 month ago

I am still wondering why we would need to bump this action to v4 if v3 was working as expected ? Do you think it's worth it ?

And I forgot to answer last time, sorry. Here goes the quote from the depreciation notice: Starting November 30, 2024, GitHub Actions customers will no longer be able to use v3 of actions/upload-artifact or actions/download-artifact.

And depreciation in this case does not mean warning but a failing workflow. So customers will be forced to upgrade in a month. I've though we could fix it no along with the changes to the build.

Additional benefit of v4 are:

So it isn't that bad, and I hoped that the name of the artifact will be changed along with the version number. I will have a look what is going on.

PiotrCzapla commented 1 month ago

The build was failing because there are some issues with cbuildhweel when cross compiling to other architectures like arm.

Thank you! Do we need to cross compile, can't github run arm? I see the arm host are paid even for opensource projects. :(

PiotrCzapla commented 1 month ago

This is a known issue with that action and has nothing to do with the Python package version itself, but with the name of the "artifact" file.

The names of the artifacts contain the version right? Won't it be enough to make the upload@v4 work?

abdeladim-s commented 1 month ago

And depreciation in this case does not mean warning but a failing workflow. So customers will be forced to upgrade in a month. I've though we could fix it no along with the changes to the build.

Yeah I see .. I didn't know that .. so we have no choice.

abdeladim-s commented 1 month ago

Thank you! Do we need to cross compile, can't github run arm? I see the arm host are paid even for opensource projects. :(

yes unfortunately :( as of now it's only available for the paid plans.

abdeladim-s commented 1 month ago

The names of the artifacts contain the version right? Won't it be enough to make the upload@v4 work?

No, actually what you did will just change the version of the Python package and not the uploaded artifact name, which is the source of the issue. Here is a working workaround, but the drawback of this is that we will have an artifact for every os rather than one artifact for all the wheels, but I think we have no choice now since v3 won't be supported any more.

In this case, do you think there still any benefit of running the git subprocess to get the last commit and maintain a separate version.txt file ? If we want to use git I think git describe is better, to get the version from the tags, rather than maintaining the version in a txt file!

PiotrCzapla commented 1 month ago

In this case, do you think there still any benefit of running the git subprocess to get the last commit and maintain a separate version.txt file ? If we want to use git I think git describe is better, to get the version from the tags, rather than maintaining the version in a txt file!

That was my initial thought, but I realized it adds complexity. First, shallow builds can be problematic - if you don’t fetch enough commits, you end up without a version number. Using pip install git+http might not help here either, as I believe it also does a shallow copy. And even if pip doesn’t, other developers will inevitably run into issues tryin to speed up a checkout, or simply downloading a tar.gz with source on platforms where git is not available. (Docker?) They will stare at the code, wondering why it doesn’t compile. If they don’t report it on GitHub, it’s still wasted time for them and a source of frustration.

Maintaining a version.txt file is straightforward, and it has its advantages. For example, if you want to release version 1.3.0, you can easily do so with a one-liner like this:

(VAR=1.4.0a0; git tag "$(sed 's/a0$//' version.txt)" && echo "$VAR" > version.txt && git add version.txt && git commit -m "Bump version to $VAR" && git push && git push --tags)

This gives you both the release and an alpha release of the next version, which is nice. Achieving the same with tags alone is less clean and not significantly simpler.

abdeladim-s commented 1 month ago

Yes, true, that's my thought as well when I asked. Running a git subprocess will just add complexity without any significant benefit. In this case, let's keep the txt file for the version, and thanks for the command.

Should we close this PR then, or do you want to modify it to remove the git subprocess from the get_version function?

PiotrCzapla commented 4 weeks ago

Should we close this PR then, or do you want to modify it to remove the git subprocess from the get_version function?

The current command works with the shallow checkouts, but not with tar.gz, I've fixed that so the local version is empty when git fails. I like the '+git' as it shows clearly what is being installed. Here is an output from uv

> uv pip install .
Using Python 3.11.10 environment at /opt/homebrew/Caskroom/mambaforge/base/envs/whisper
Resolved 9 packages in 1.31s
   Built pywhispercpp @ file:///workspace/pywhispercpp
Prepared 1 package in 2.99s
Uninstalled 1 package in 3ms
Installed 1 package in 3ms
 - pywhispercpp==1.3.0a0+git116106b (from file:///workspace/pywhispercpp)
 + pywhispercpp==1.3.0a0+git6ac5411 (from file:///workspace/pywhispercpp)
PiotrCzapla commented 4 weeks ago

Crosscompiling to arm64 is still failing, I'm not sure how to turn it off.

abdeladim-s commented 4 weeks ago

No worries, I will fix it. Would be great if you can rebase the branch to include only the commit for the version logic only, to clean a bit the commits history. You can also exclude bumping the github actions commit, I will take care of that.

abdeladim-s commented 3 weeks ago

Yes, it works on Linux too. Amazing :) I wonder if there is a way to copy the dlls for windows as well!

I'll merge this one for now and I'll then fix the CI. Thanks a lot @PiotrCzapla!

abdeladim-s commented 3 weeks ago

BTW, what happened to the repair wheel logic ? The CI tests seem to be failing! I check the Linux wheels artifacts and the libwhisper.so is no longer included in the lib folder ? Have you changed something in the last PR ?

PiotrCzapla commented 2 weeks ago

It works on macOS, so I've missed that, sorry. The change of @rpath to @loaderpath makes it installable in place, but it also makes the repairwheel a no op on macos. And it somehow breaks with on Ubuntu where we wheels start to grab lots of system libraries (libc, gcc etc.) I will send a pull request that fixes that shortly.