pyproj4 / pyproj

Python interface to PROJ (cartographic projections and coordinate transformations library)
https://pyproj4.github.io/pyproj
MIT License
1.05k stars 212 forks source link

Cache PROJ library build when building wheels #1345

Open djhoese opened 1 year ago

djhoese commented 1 year ago

As discussed in #1342 PROJ building takes about 10 minutes at the top of every wheel build workflow. This PR caches it with GitHub Actions to hopefully remove this requirement except for when PROJ (or its dependencies) are updated.

This workflow is taken from shapely's own CI.

djhoese commented 1 year ago

I swear I write enough bash to not have made the mistakes I made in this PR. I don't think my coffee kicked in yet.

codecov[bot] commented 1 year ago

Codecov Report

Merging #1345 (a4aba7e) into main (3cea713) will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1345   +/-   ##
=======================================
  Coverage   96.41%   96.41%           
=======================================
  Files          20       20           
  Lines        1812     1812           
=======================================
  Hits         1747     1747           
  Misses         65       65           
djhoese commented 1 year ago

Damn I was so sure I had this working. It even restored the directory...but then it didn't use it.

djhoese commented 1 year ago

Not sure why but throughout my testing the PROJ building has been improving in time from ~600s to ~480s. Maybe some download caching inside github? Not really sure.

Watching the current builds it seems like the cache is working.

djhoese commented 1 year ago

...and nevermind it needs libtiff and sqlite too. I was only caching the PROJ directory.

djhoese commented 1 year ago

@snowman2 FYI this works now, but has a lot of debug and ugly ways of doing things just as a proof of concept.

I'm very surprised at the variance in the build times both of PROJ itself and the pyproj wheel building. They seems to very by quite a bit. I've seen PROJ go between ~4 minutes to ~6 minutes through my testing.

One of the biggest surprises but one that still makes sense was that caching the libtiff and other PROJ deps in the cache directory meant I had to put it in LD_LIBRARY_PATH, but only on the linux system. Otherwise, auditwheel was failing to find them. One solution I thought of was to add the cache directory to LD_RUN_PATH versus depending on LD_LIBRARY_PATH so that I didn't need to carry aroundthis extra library path (it would be in the PROJ .so). My understanding is that auditwheel should bundle all the depended on libraries anyway so this shouldn't really matter. Does any of this concern you? Or do you have other things you'd like to see as I work on cleaning this code up?

snowman2 commented 1 year ago

Thanks @djhoese 👍 - I am looking forward to this update. I am busy this week, but should have a moment next week to take a look.

snowman2 commented 1 year ago

No concerns from my end. This change makes sense to me. Thank you for working on this :+1:

djhoese commented 1 year ago

I just wish it was a bigger win for build times than it is.