Open banesullivan opened 2 years ago
The issue is resource time. Currently, running the docker build on Circle-CI takes about 4 hours. On a local high-end machine, it takes just under an hour. I think we'd exhaust the free tier resources pretty quickly. The actual change would be pretty easy, though, since conceptually, we are just doing a "docker build".
Currently, there is a cron job that runs nightly on a local machine and auto-updates versions. When a wheel release occurs, there is a single manual "git push" that has to be done to the gh-pages branch. The CI is currently just used as a verification rather than the release process, since it has been brittle in the past due to excessive build times.
Wowsers, I just looked at that Dockerfile for the first time.... 🤯
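Could we break this up into separate Dockerfiles / Actions that run for each lib? For example:
- A docker file and Action dedicated to producing only GDAL wheels
- ... only the pyvips wheels
- etc
Then we have an Action that checks for releases and only builds the single lib when necessary?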
I see a lot of utility in abstracting this a bit to also help with building VTK python wheels
And creating a "stack" of Docker images with these TPLs that could live in this repo's container registry
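To make the "checks for releases" idea concrete, here is a minimal sketch of the comparison such an Action could run. The function name, the two dictionaries, and the version numbers are all hypothetical; how the latest versions would actually be fetched is left out.

```python
def libs_to_rebuild(built_versions, latest_versions):
    """Return the libraries whose latest release differs from the wheel already built.

    Both arguments are hypothetical {library name: version string} mappings.
    A library with no recorded build is always scheduled for a rebuild.
    """
    return sorted(
        name
        for name, latest in latest_versions.items()
        if built_versions.get(name) != latest
    )


if __name__ == "__main__":
    # Made-up version numbers, for illustration only.
    built = {"gdal": "1.0.0", "pyvips": "2.0.0"}
    latest = {"gdal": "1.0.1", "pyvips": "2.0.0"}
    print(libs_to_rebuild(built, latest))
```

Each name returned could then trigger only the Action for that lib, rather than the full build.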
Also, what we're doing here is basically what conda-forge provides, but on a small scale... I understand these wheels are mostly used in production environments where we might not want the weight of Anaconda, but have we ever had a chance to really evaluate using Miniconda and just contributing to the conda-forge feedstocks, so that we and the bigger community benefit?
We had resistance to using conda on some of our projects -- conda has a higher barrier to entry than pip. I view that the bigger community benefit is to provide pip-compatible wheels.
Could we break this up into separate Dockerfiles / Actions that run for each lib? For example:
- A docker file and Action dedicated to producing only GDAL wheels
- ... only the pyvips wheels
- etc
So far, every time I've contemplated breaking this into multiple wheels, I've felt that it will substantially increase the maintenance burden. I'd love to be proved wrong about that. A bunch of the packages need a newer version of glib and gobject-introspection. mapnik depends on all of gdal except the python package. Instead of the long litany of commands in one Dockerfile, I don't see how to end up without a cascade of partial Dockerfiles that is itself hard to think about.
I've contemplated making each library or package a separate file; then the individual packages end up being a list of simple-looking instructions: "make_boost.sh", "make_gdk-pixbuf.sh", etc. If each of those is on a RUN command by itself, it would be more obvious what is needed and Docker might cache appropriately.
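As a rough sketch of how the per-library scripts could be ordered and turned into one RUN line each, assuming Python 3.9+ for `graphlib`: the dependency map below is illustrative only. "mapnik needs gdal" comes from this thread; the other edges and the helper names are assumptions, not the real build graph.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each library is built after everything in its set.
# Only "mapnik needs gdal" is taken from the discussion; the rest is illustrative.
DEPENDENCIES = {
    "glib": set(),
    "gobject-introspection": {"glib"},
    "gdk-pixbuf": {"glib", "gobject-introspection"},
    "boost": set(),
    "gdal": set(),
    "mapnik": {"gdal", "boost"},
}


def build_order(deps):
    """Return library names ordered so that dependencies come first."""
    return list(TopologicalSorter(deps).static_order())


def run_lines(deps):
    """Emit one Dockerfile RUN line per library, in dependency order."""
    return [f"RUN ./make_{name}.sh" for name in build_order(deps)]


if __name__ == "__main__":
    print("\n".join(run_lines(DEPENDENCIES)))
```

With one RUN line per script, a change to a single script would invalidate Docker's layer cache only from that line onward, provided each script is copied into the image just before its own RUN line.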
I should add that I think conda's feedstock approach is actually better than pip wheels, but that pip has much higher acceptance than conda.
Is this a big ask? I can help/do this and try to abstract the build process into a few isolated Actions here