NYCPlanning / data-engineering

Primary repository for NYC DCP's Data Engineering team

Upgrade gdal to 3.9.x #843

Open fvankrieken opened 6 months ago

fvankrieken commented 6 months ago

And more importantly, get off of my fork so we have one less thing to maintain

I moved us to a fork of gdal because of issues around handling of null dates (see #513 and https://github.com/fvankrieken/gdal/commit/fe3bd65448a8e15bcdb870d2a15e374c9a6e1bf4). I've tested gdal 3.9.0 locally (converting a shapefile with null dates to pgdump), and it seems like this issue has been resolved. I don't see it mentioned explicitly in any release notes, but the fix for my other bug is in 3.8.4, so that's nice!

Might be worth waiting till 3.9.1 just to avoid a potentially buggy x.x.0 release
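
If we wanted to enforce the "skip x.x.0" idea rather than just remember it, a small version guard might look something like the sketch below. The names `MIN_GDAL_VERSION`, `parse_version`, and `meets_minimum` are hypothetical and not anything in this repo, and the 3.9.1 threshold is only the suggestion from this comment, not a decided requirement.

```python
import re

# Hypothetical threshold based on the comment above: skip 3.9.0, accept 3.9.1+.
MIN_GDAL_VERSION = (3, 9, 1)


def parse_version(version: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '3.9.1'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def meets_minimum(version: str, minimum: tuple[int, int, int] = MIN_GDAL_VERSION) -> bool:
    """Return True if the given version is at or above the minimum."""
    # Tuples compare element by element, so (3, 10, 0) > (3, 9, 1) as expected.
    return parse_version(version) >= minimum
```

In a build or test step, the version string could come from something like `gdal.VersionInfo("RELEASE_NAME")` in the Python bindings, assuming it returns a plain `major.minor.patch` string for the installed release.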

This would need

fvankrieken commented 6 months ago

Not entirely sure how rigorously we'd want to validate - I think in general we want to trust that gdal minor versions aren't going to break our pipelines. But worth at least making sure things run without error.

fvankrieken commented 2 months ago

Blocked by library. #1014 has a tentative approach to handle the issue, but in gdal 3.9.x the way we rename fields is no longer possible with the Python bindings.