rocky / pycdio

Python interface to the libcdio - the CD Input and Control library
GNU General Public License v3.0

Provide Docker image for building pycdio wheels #7

Closed arcctgx closed 5 months ago

arcctgx commented 6 months ago

This PR adds a Dockerfile and some auxiliary files for building a Docker image which in turn can be used for building manylinux2014_x86_64 wheels that can be uploaded to PyPI.

I chose manylinux2014 as base image, because it seems to be the most commonly used image in https://quay.io/organization/pypa. I'm using the latest tagged version available at this time.

libcdio is installed from source, because the distribution package only provides libcdio-0.92 which is too old. I'm disabling most of the tools provided by libcdio because they are not relevant for building pycdio wheels.

The container will produce wheels for CPython versions 3.6 to 3.12 and for PyPy 3.7 to 3.10. This set is defined by the list of interpreters provided in the image, but it can be reduced if needed.
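
As a rough illustration of how that set could be reduced, here is a minimal Python sketch that filters the interpreter directories of a manylinux image and composes a `pip wheel` command for each. It is not the script from this PR: the helper names, the `/tmp/wheels` output directory and the version bounds are assumptions made for the example, and it relies on the image keeping one interpreter per directory under `/opt/python`.

```python
import re
from pathlib import PurePosixPath

# Assumption: manylinux images keep one interpreter per directory under
# /opt/python, named like "cp39-cp39" or "pp310-pypy310_pp73".
PYTHON_ROOT = PurePosixPath("/opt/python")


def parse_tag(dirname):
    """Return (implementation, (major, minor)), or None if unrecognised."""
    match = re.match(r"^(cp|pp)(\d)(\d+)-", dirname)
    if match is None:
        return None
    return match.group(1), (int(match.group(2)), int(match.group(3)))


def select_interpreters(dirnames, bounds):
    """Keep names whose version lies within bounds[impl], inclusive.

    The result is ordered by implementation, then by version.
    """
    selected = []
    for name in dirnames:
        parsed = parse_tag(name)
        if parsed is None:
            continue
        impl, version = parsed
        if impl in bounds and bounds[impl][0] <= version <= bounds[impl][1]:
            selected.append((impl, version, name))
    return [name for _, _, name in sorted(selected)]


def wheel_command(dirname, source_dir="/package", out_dir="/tmp/wheels"):
    """Compose (but do not run) a 'pip wheel' call for one interpreter."""
    python = PYTHON_ROOT / dirname / "bin" / "python"
    return [str(python), "-m", "pip", "wheel", source_dir,
            "--no-deps", "-w", out_dir]


if __name__ == "__main__":
    # Hypothetical bounds mirroring the versions mentioned above.
    bounds = {"cp": ((3, 6), (3, 12)), "pp": ((3, 7), (3, 10))}
    available = ["cp36-cp36m", "cp312-cp312", "pp37-pypy37_pp73"]
    for name in select_interpreters(available, bounds):
        print(" ".join(wheel_command(name)))
```

Narrowing the `bounds` mapping is then enough to drop interpreters without touching the image itself.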

There are a few open points:

  1. The solution is x86_64 and glibc specific. I'm not sure how to elegantly generalize it to other architectures or platforms (e.g. i686 or musllinux). That would likely require separate Docker images and perhaps a dedicated build script as well. On the other hand, nowadays the x86_64/glibc combo will probably cover the majority of real-life use cases.
  2. I disabled CDDB support and C++ bindings in libcdio when building from source. I'm not sure if they're relevant to pycdio or not (I don't think they are, but I may be wrong). Both can be enabled if they're important.
  3. I have not yet figured out how to run automated tests in the container for each created wheel. I guess that would be nice to have, but I hope it's not a blocker.
  4. The container image should be versioned somehow, but I'm not sure what the best practices are. Perhaps a simple VERSION file inside admin-tools/pycdio-builder would do?
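
For open point 3, one possible direction is an import smoke test per wheel. The Python sketch below only works out which interpreter matches a given wheel and which commands would be run. It is an untested idea rather than part of this PR: the function names are hypothetical, `cdio` is assumed to be an importable module of the installed package, and the interpreter layout under `/opt/python` is assumed to follow the wheel's Python and ABI tags.

```python
from pathlib import PurePosixPath


def interpreter_for_wheel(wheel_filename, python_root="/opt/python"):
    """Map a wheel file name to the interpreter that should test it."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {wheel_filename}")
    stem = wheel_filename[: -len(".whl")]
    # Wheel names end with <python tag>-<abi tag>-<platform tag>.
    python_tag, abi_tag, _platform_tag = stem.split("-")[-3:]
    # Assumption: the image names interpreter directories "<python>-<abi>".
    root = PurePosixPath(python_root)
    return str(root / f"{python_tag}-{abi_tag}" / "bin" / "python")


def smoke_test_commands(wheel_path, module="cdio"):
    """Compose (but do not run) install and import commands for one wheel."""
    python = interpreter_for_wheel(PurePosixPath(wheel_path).name)
    return [
        [python, "-m", "pip", "install", "--force-reinstall", wheel_path],
        [python, "-c", f"import {module}"],
    ]
```

Each command list could be passed to `subprocess.run(..., check=True)` inside the container, so that a wheel that fails to import stops the build.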

All in all, maybe it's not perfect, but it's a starting point nonetheless.

rocky commented 5 months ago

Hi - today I looked at this PR and tried it locally. Building the Docker image was easy enough, but it took me a while to understand which directory to be in when running docker run and what the purpose of /package and wheelhouse is (since the script and Dockerfile are under admin-tools).

Overall, I think this should be its own separate project managed by you. It feels like this is going to add a bit of maintenance overhead, and the setup seems fragile: Docker containers change, Python versions change. Also, this feels more like a packaging issue than a source code issue.

If you set this up as its own project, I will promote it as what to use for these kinds of wheels. You can include pycdio as a git submodule.

And then when I make a release, as a separate step, I'll switch to this project and run some code and then just add the wheels to the existing distribution.

Your thoughts?

arcctgx commented 5 months ago

Thanks for finding the time to review this. :slightly_smiling_face:

I understand you may not want to maintain the Dockerfile and the script. It is a non-trivial thing, and as you said it's definitely a packaging issue. I could set it up as a separate project as you suggest, so that you could use it when you want to build wheels. I'm not sure that it follows the usual practice, though. I have to give it some thought.

But it also just occurred to me that there might be a different solution that does not involve Docker. You already have a script make-dist.sh that builds wheels using pyenv. The process of building manylinux wheels consists of building a linux_x86_64 wheel, which the script already does, and then converting it to a manylinux wheel using the auditwheel tool. So maybe it would be simpler to add auditwheel step to make-dist.sh?
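
To make the idea concrete, the conversion step could look roughly like the Python sketch below, which wraps `auditwheel repair`. This is only an assumption about how such a step might be wired in, not a change to make-dist.sh: the function names and the `wheelhouse` output directory are made up for the example. One caveat is that auditwheel can only grant the manylinux2014 tag if the libraries the wheel links against satisfy that policy, which may not hold for wheels built directly on a recent host.

```python
import subprocess


def auditwheel_repair_command(wheel_path, plat="manylinux2014_x86_64",
                              out_dir="wheelhouse"):
    """Compose the auditwheel call that retags a linux_x86_64 wheel."""
    return ["auditwheel", "repair", "--plat", plat, "-w", out_dir, wheel_path]


def repair(wheel_path, **kwargs):
    """Run auditwheel and raise CalledProcessError if it rejects the wheel."""
    subprocess.run(auditwheel_repair_command(wheel_path, **kwargs), check=True)
```

Because `check=True` is used, a wheel that cannot meet the requested policy stops the process instead of being uploaded with the wrong tag.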

rocky commented 5 months ago

So maybe it would be simpler to add auditwheel step to make-dist.sh?

For me, it is not simpler. Simplest is not to add anything.

Right now, you advocate for this, and you are in a better position to keep this maintained.

By separating this into its own project, you can change the packaging using docker or a shell script, or TOML or whatever mechanism you want to use and change it as often as you want.

Yes, this project has some packaging code, but just a minimum amount of code, and this includes the source code.

As with many of my other projects, I rely on packagers and others who are more into packaging to extend this to other or fuller kinds of packaging formats. manylinux is analogous to .deb, .rpm, OSX bottles, cargo files, etc.

arcctgx commented 5 months ago

I'm not sure I understand your idea. Assuming I make a separate repository as you suggest, how do you see it being used and, more importantly, by whom?

rocky commented 5 months ago

more importantly, by whom?

People who want manylinux. Often these are called packagers. You would use it around release time, and as a service I'll upload it to PyPI or add it to the GitHub release.

And then when someone comes to you and says "hey, instead of manylinux, I'd like a manyosx or a manywindows version, would you consider expanding this?" you can be in the position of deciding whether to do it or not.

In the past, I have had such requests to add a fair bit of code to general software to make it handle somewhat specific targets. In each case, it didn't work out well. I think that is why, in general, there is a distinction between those who package and those who write the software. Most software comes with a little bit of packaging. And this software comes with source code, which is what those who want more elaborate forms of packaging can use.

arcctgx commented 5 months ago

People who want manylinux. Often these are called packagers

"People who want manylinux" are those who want to write pip install pycdio or pip install package-that-depends-on-pycdio on the command line, and expect it to just work without having to understand and solve cryptic errors related to missing libcdio headers, SWIG, or whatever else. These people are the end users, not packagers.

I'm not interested in providing this code in a separate repository. Its usefulness would be very limited, and it would not be aligned with established practices of distributing Python packages nowadays.

https://realpython.com/python-wheels/ explains this in detail - please have a look.

Thanks for your time.

rocky commented 5 months ago

Thanks for the information. Currently I have way too many projects that are publicly used and there are many other projects that right now I am more interested in and that people sponsoring my projects are interested in.

Therefore, this packaging feature will have to wait for someone who is willing not only to make the change but also to continue supporting the feature as packaging, OSes, and code change.