heroku / heroku-geo-buildpack

Link to libkml for improved KML support and compatibility with Google Earth #2

Closed thclark closed 4 years ago

thclark commented 4 years ago

Hi @KevinBrolly great work!

About this new buildpack: thank you! If we can get this issue solved, it'll keep me on Heroku. (I'd hate to leave, but I have been really struggling to get GDAL installed with libkml linked, so I have been dockerizing and moving to Google Cloud, which is a nightmare to set up!)

WHAT I'D LOVE: for the version of GDAL vendored into this buildpack to be linked against Google's libkml library.

WHY: GDAL already has a default KML file parser, so why build it with an additional engine for parsing KML files? As the issue title says: for improved KML support and compatibility with Google Earth.

HOW: If it helps, here are the relevant parts of a Dockerfile that adds a version of GDAL with libkml included to the Heroku stack (it'll obviously be a bit different for you, since you'll build proj, geos, etc. yourself, but hopefully it's useful):

FROM heroku/heroku:18-build as build

# ...  stuff ...

# These libraries should always be done first. GDAL takes *ages* to build, so if done later in the Dockerfile, then any
# change that invalidates the cache will trigger an extremely long rebuild.

ARG GDAL_VERSION=v2.4.1

# Install proj, geos, libkml and build tools needed to compile gdal
RUN apt-get update -y && apt-get install -y --fix-missing --no-install-recommends \
        libkml-dev libproj-dev libgeos-dev \
        curl autoconf automake bash-completion

RUN ldconfig

# GDAL has a range of install options. Most of them specialized. Some add a lot of size to the slug, so be careful! Some options (may not be exhaustive):
#       python3-dev python3-numpy libboost-dev  libpng-dev libjpeg-dev libgif-dev \
#       libcharls-dev libopenjp2-7-dev libcairo2-dev \
#       liblzma-dev curl libcurl4-gnutls-dev libxml2-dev libexpat-dev libxerces-c-dev \
#       libnetcdf-dev libpoppler-dev libpoppler-private-dev \
#       libspatialite-dev swig libhdf4-alt-dev libhdf5-serial-dev \
#       libfreexl-dev unixodbc-dev libwebp-dev libepsilon-dev \
#       liblcms2-2 libpcre3-dev libcrypto++-dev libdap-dev libfyba-dev \
#       libmysqlclient-dev libogdi3.2-dev \
#       libcfitsio-dev openjdk-8-jdk libzstd1-dev \
#       libpq-dev libssl-dev

# Build GDAL
RUN mkdir gdal \
    && curl -L https://github.com/OSGeo/gdal/archive/${GDAL_VERSION}.tar.gz | tar xz -C gdal --strip-components=1 \
    && cd gdal/gdal \
    && ./configure \
        --without-libtool \
        --with-geos=yes \
        --with-libkml \
        --with-proj \
    && make \
    && make install \
    && cd ../.. \
    && rm -rf gdal

RUN ldconfig

AN OFFER: I'm not that familiar with Heroku buildpacks, but have, ahem, "become" familiar with building GDAL. If it'll save me porting everything to Google Cloud, I'll happily spend a day of dev time helping with this in any way I can. Just ping me.

CaseyFaist commented 4 years ago

😂 "become" familiar, I know a little about what that feels like

Your work looks more thorough than what I found after a quick google; we'd probably want to move the steps into formulas and trigger files but the steps will likely look similar. If you want to take a crack at formulize-ing it, go for it 💯

KevinBrolly commented 4 years ago

Hey @thclark I have added libkml support in this commit - https://github.com/heroku/heroku-geo-buildpack/commit/7330193bfe76d9ead89c675b1ddf1460d07db9b3

Could you give it a test and let me know how you get on? If you are already using this buildpack in an application the new binary should get pulled down the next time you do a build.

I have done it slightly differently from your PR (https://github.com/heroku/heroku-geo-buildpack/pull/4), as it is faster and easier to compile libkml and its dependencies to a custom prefix and ship that with the gdal binary than to apt-get during the build phase.

thclark commented 4 years ago

Getting there! Great work @KevinBrolly - thanks for your efforts on this.

This builds and releases, but unfortunately, the libraries aren't found at runtime... I have an error that looks like:

~ $ gdalinfo
gdalinfo: error while loading shared libraries: libkmlbase.so.1: cannot open shared object file: No such file or directory

I've just made my test app public. It deploys successfully to Heroku (using this buildpack and heroku/python), so you're welcome to use it to help.

I shelled in to that deployed app and found that the other libraries had been deployed by the buildpack, but not libkml. Perhaps there's a build cache somewhere that needs to be flushed?
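For anyone hitting the same loader error, a minimal sketch of how the missing libraries could be listed from a shell on the dyno. It assumes the usual glibc `ldd` output format, where an unresolved dependency is printed as `name => not found`; the helper name `missing_libs` is made up for illustration.

```shell
# Hypothetical helper: reads `ldd` output on stdin and prints the name of
# every shared library the dynamic loader could not resolve, one per line.
# Assumes glibc-style lines such as "libkmlbase.so.1 => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage from a one-off dyno shell (for example after `heroku run bash`):
#   ldd "$(command -v gdalinfo)" | missing_libs
```

An empty result suggests every dependency resolves. Any name printed is a library that either was not shipped or lives in a directory the loader does not search.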

KevinBrolly commented 4 years ago

Hey @thclark - Sorry, looks like I forgot to include the libkml files in the gdal package at runtime 🤦‍♂

Thanks for the test app, that's really helpful. I will let you know when libkml is properly deployed.

KevinBrolly commented 4 years ago

Hey @thclark I have updated the gdal packages for the heroku-18 stack (gdal takes an age to build!). I spun up a dyno with your test app and it looks good:

~ $ gdalinfo
Usage: gdalinfo [--help-general] [-json] [-mm] [-stats] [-hist] [-nogcp] [-nomd]
                [-norat] [-noct] [-nofl] [-checksum] [-proj4]
                [-listmdd] [-mdd domain|`all`]*
                [-sd subdataset] [-oo NAME=VALUE]* datasetname

Let me know if that is working for you now.
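The usage text above shows that the binary now loads. To also check that the libkml-backed driver is registered, something like the following sketch could be used. It assumes the driver is listed under the short name `LIBKML` in the output of `ogrinfo --formats`, as in GDAL's driver documentation; the helper name `has_libkml_driver` is hypothetical.

```shell
# Hypothetical helper: reads `ogrinfo --formats` output on stdin and succeeds
# (exit status 0) only if a driver with the short name LIBKML is listed.
# The plain KML driver does not match, because the name must start the entry.
has_libkml_driver() {
    grep -Eq '^[[:space:]]*LIBKML[[:space:]]'
}

# Usage on a dyno:
#   ogrinfo --formats | has_libkml_driver && echo "LIBKML driver available"
```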

thclark commented 4 years ago

Woohooo! This is great, working well. Thank you so much @KevinBrolly - this issue has been completely killing my project.

Perhaps the following will be useful to add to the README (maybe edited slightly ;) ):

Notes for switching over from using the GDAL that gets vendored with heroku/python

You have to completely remove the BUILD_WITH_GEO_LIBRARIES environment variable:

You have to flush your application build cache.

You have to monkey around with the Heroku CI cache:
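The commands that accompanied these steps are not preserved in this copy of the thread, so the following is only a sketch of what the three steps might look like with the Heroku CLI, not necessarily what was originally posted. It assumes the `heroku-builds` CLI plugin is installed (it provides `builds:cache:purge`); the app name `my-app` and the names `run`, `switch_to_geo_buildpack`, `DRY_RUN` and `HEROKU_BIN` are hypothetical.

```shell
# Hypothetical wrapper around the Heroku CLI. With DRY_RUN=1 the commands are
# printed instead of executed, which makes it easy to review them first.
HEROKU_BIN="${HEROKU_BIN:-heroku}"

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

switch_to_geo_buildpack() {
    app="$1"
    # 1. Remove the variable that makes heroku/python vendor its own GDAL.
    run "$HEROKU_BIN" config:unset BUILD_WITH_GEO_LIBRARIES --app "$app"
    # 2. Flush the application build cache (assumes the heroku-builds plugin:
    #    `heroku plugins:install heroku-builds`).
    run "$HEROKU_BIN" builds:cache:purge --app "$app"
    # 3. The Heroku CI cache has no command here: use the
    #    "Run Again Without Cache" button on the failed test run.
}

# Usage:
#   DRY_RUN=1 switch_to_geo_buildpack my-app   # print the commands only
#   switch_to_geo_buildpack my-app             # actually run them
```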

Phew! Job done! Thank you @KevinBrolly and @CaseyFaist, you're my heroes of the month, and I'll allow you the honour of closing this issue ;)

KevinBrolly commented 4 years ago

Thanks for the README suggestions, @thclark. They will be really useful for people switching over, so I will get them into the README now.

You are right that the Heroku CI cache functionality could be documented better.

Heroku CI does have a separate build cache for CI builds. The way "Run Again Without Cache" works is that there is a single cache for your test runs. If you click "Run Again Without Cache", then at the end of that run its build cache replaces the previous build cache.

This is why you have to wait for the tests to fail and then run again without cache. After that run, the new cache from that build is used and your tests pass. I will make a note to document this better.