postgis / docker-postgis

Docker image for PostGIS
https://hub.docker.com/r/postgis/postgis/
MIT License
1.39k stars 466 forks source link

Add pcre library to templates, pcre is a dependency for address_standardizer extension #220

Closed rahul-ve closed 3 years ago

rahul-ve commented 3 years ago

@ImreSamu This should fix #219
Not sure if there are any automated tests before merge but I tested the changes locally with builds for Alpine (13-3.1) and Debian (13-master)

Thanks Rahul

rahul-ve commented 3 years ago

looks like an upstream breaking change related to geos. upstream commit causing this issue I am not familiar with geos builds, any suggestions on how to fix this?

ImreSamu commented 3 years ago

I am not familiar with geos builds, any suggestions on how to fix this?

honestly, I don't know ..

imho:

some comments ..

1.) testing

if you can / please / not mandatory :

2.) check/minimize docker image size ..

the universal community wish .. we need to minimize the docker image sizes as we can ...

rahul-ve commented 3 years ago

Let me check with geos devs, hopefully it should be straight forward to use new build instructions.

1.) testing

will do

2.) check/minimize docker image size ..

  • please create some statistics about the new images sizes .. ( old size vs. new size )

Yes, I noticed the debian image is huge compared to alpine. When you say create some stats, is there an automated approach to this or do you want me to manually build and eyeball the numbers?

ImreSamu commented 3 years ago

Let me check with geos devs,

thanks ...
solving the "root case" .. always better :)

When you say create some stats, is there an automated approach to this or do you want me to manually build and eyeball the numbers?

no automated test yet ...

so as simple as you can ..
imho: just a simple comparisons based on docker images - local sizes ( old size MB vs. new size MB ) .. so the community can check the differences by own eyeball ..

now this is the current postgis 13-* sizes in my local machine ; the 13-master image size is 853MB

$ docker images | grep postgis
postgis/postgis                13-master         99147be4ecc1   3 days ago      853MB
postgis/postgis                13-3.1-alpine     066c268fadb1   3 days ago      292MB
postgis/postgis                13-3.1            3b74bd4c6e01   3 days ago      491MB

or just see this minimal check

rahul-ve commented 3 years ago

geos build is fixed by using cmake steps but proj upstream changes are also causing the build to fail. error tail is:

Using user-specified proj directory: /usr
checking proj_api.h usability... no
checking proj_api.h presence... no
checking for proj_api.h... no
checking proj.h usability... yes
checking proj.h presence... yes
checking for proj.h... yes
checking for proj.h... (cached) yes
checking for pj_get_release in -lproj... no
configure: error: could not find libproj - you may need to specify the directory of a PROJ installation using --with-projdir

I specified --with-projdir=/usr for postgis configure, still no dice. There is libproj-dev headers,, when I added that to first stage build, GDAL build was failing.

I am a bit out of my depth, maybe someone with bit more understanding of how these components interact can suggest way forward?

ImreSamu commented 3 years ago

how these components interact can suggest way forward?

imho:

$ git diff Dockerfile.master.template

diff --git a/Dockerfile.master.template b/Dockerfile.master.template
index 168a71b..17f3df7 100644
--- a/Dockerfile.master.template
+++ b/Dockerfile.master.template
@@ -76,9 +76,11 @@ RUN set -ex \
     && cd / \
     && rm -fr /usr/src/SFCGAL

-# proj4
-ENV PROJ_VERSION master
-ENV PROJ_GIT_HASH %%PROJ_GIT_HASH%%
+### ENV PROJ_VERSION master
+### ENV PROJ_GIT_HASH %%PROJ_GIT_HASH%%
+# proj - lock 7.2.1
+ENV PROJ_VERSION 7.2.1
+ENV PROJ_GIT_HASH 1212e9b818e4511cc9389b9bdb5daa0bec1a12bd

 RUN set -ex \
     && cd /usr/src \
@@ -101,8 +103,9 @@ RUN set -ex \
     && git clone https://github.com/libgeos/geos.git \
     && cd geos \
     && git checkout ${GEOS_GIT_HASH} \
-    && ./autogen.sh \
-    && ./configure --disable-static \
+    && mkdir build \
+    && cd build \
+    && cmake .. \
     && make -j$(nproc) \
     && make install \
     && cd / \
@@ -211,7 +214,7 @@ RUN set -ex \
     && cd \
     # postgis
     && cd /usr/src/ \
-    && git clone https://git.osgeo.org/gitea/postgis/postgis.git \
+    && git clone https://github.com/postgis/postgis.git \
     && cd postgis \
     && git checkout ${POSTGIS_GIT_HASH} \
     && ./autogen.sh \

// the https://github.com/postgis/postgis.git repo changes .. just a temporary .. the original repo is not worked for me ..

rahul-ve commented 3 years ago

@ImreSamu I will try to push the update soon... As a first time contributor to this repo, if I may suggest an improvement, why not separate out the process of moving to the head of all upstream dependencies? That way when trying to fix a specific issue disruption from upstream changes can be minimized.

i.e., something like:

phillipross commented 3 years ago

@ImreSamu I will try to push the update soon... As a first time contributor to this repo, if I may suggest an improvement, why not separate out the process of moving to the head of all upstream dependencies? That way when trying to fix a specific issue disruption from upstream changes can be minimized.

i.e., something like:

  • record last known upstream commits to work and use these for all dev activity
  • on a schedule try updating the build with the head commits and if succeeds, update the commit hashes to be used for other dev work. I believe this step can be completely automated to run, say, every week/fortnight/month. Given the rapid upstream dev cycle, I would expect this to fail more often than not. At least if it does, it can be fixed proactively.

I like this idea... or at least the goal of the idea. Without looking too closely, it seems like it would be ideal to factor out the logic for updating the heads to a separate update script. Maybe the "master" images become the most recent known to be somewhat working versions "alpha/beta" and what is known to be master now becomes "testing" or "unstable" or something like that.

I've enabled "discussions" on this repo. Maybe we can start a new discussion there to further discuss ideas around the suggestion from @rahul-ve

ImreSamu commented 3 years ago

I will try to push the update soon...

OK :+1:

side note:

2021/01/28: The PostGIS 3.1.1 has been released and after the debian repo will be updated ( apt.postgresql.org ) .. our current Docker build scripts will crash
so we need to run the make update ( ~ maybe this weekend / next week ) imho: in the next few days .. we need to discuss the temporary fix .. ( ~ locking the proj version ; or not updating the 'master' ; or ... )

As a first time contributor to this repo, if I may suggest an improvement,

@phillipross is the primary maintainer ; and I am also open .. so just create a proposal/discussion about your new ideas :+1:

rahul-ve commented 3 years ago

@ImreSamu tried fixing PROJ ver to 7.2.1 as you did and building the image, fails a regression test.

./core/out_geojson .. ok in 393 ms
 ./core/computed_columns .. ok in 2191 ms
 ./core/frechet .. ok in 38 ms
 ./core/geos38 .. ok in 85 ms
 ./core/geos39 .. ok in 45 ms
 ./core/fixedoverlay .. failed (diff expected obtained: /tmp/pgis_reg/test_102_diff)
 ./core/in_geojson ..2021-01-29 15:10:56.461 UTC [16279] ERROR:  Only lon/lat coordinate systems are supported in geography. at character 71
...

Run tests: 141
Failed: 1
make: *** [runtest.mk:10: check-regress] Error 1

I pushed the changes, may be you can check! diff with master:

diff --git a/Dockerfile.master.template b/Dockerfile.master.template
index 168a71b..5d1af87 100644
--- a/Dockerfile.master.template
+++ b/Dockerfile.master.template
@@ -56,7 +56,8 @@ RUN set -ex \
       make \
       pkg-config \
       protobuf-c-compiler \
-      xsltproc
+      xsltproc \
+      libpcre3-dev

 # sfcgal
 ENV SFCGAL_VERSION master
@@ -76,9 +77,15 @@ RUN set -ex \
     && cd / \
     && rm -fr /usr/src/SFCGAL

-# proj4
-ENV PROJ_VERSION master
-ENV PROJ_GIT_HASH %%PROJ_GIT_HASH%%
+# proj
+###ENV PROJ_VERSION master
+###ENV PROJ_GIT_HASH %%PROJ_GIT_HASH%%
+
+### Issue with proj v8 not being compatible
+### see https://github.com/postgis/docker-postgis/pull/220#issuecomment-765864268
+### Below is a temp fix
+ENV PROJ_VERSION 7.2.1
+ENV PROJ_GIT_HASH 1212e9b818e4511cc9389b9bdb5daa0bec1a12bd

 RUN set -ex \
     && cd /usr/src \
@@ -101,8 +108,9 @@ RUN set -ex \
     && git clone https://github.com/libgeos/geos.git \
     && cd geos \
     && git checkout ${GEOS_GIT_HASH} \
-    && ./autogen.sh \
-    && ./configure --disable-static \
+    && mkdir cmake-build \
+    && cd cmake-build \
+    && cmake -DCMAKE_BUILD_TYPE=Release .. \
     && make -j$(nproc) \
     && make install \
     && cd / \
@@ -133,7 +141,8 @@ RUN set -ex \
     && geos-config --version \
     && ogr2ogr --version \
     && proj \
-    && sfcgal-config --version
+    && sfcgal-config --version \
+    && pcre-config  --version

 FROM postgres:%%PG_MAJOR%%

@@ -162,6 +171,7 @@ RUN set -ex \
       libtiff5 \
       libxml2 \
       sqlite3 \
+      libpcre3-dev \
     && rm -rf /var/lib/apt/lists/*

 COPY --from=builder /usr/local /usr/local
@@ -174,7 +184,8 @@ RUN set -ex \
     && geos-config --version \
     && ogr2ogr --version \
     && proj \
-    && sfcgal-config --version
+    && sfcgal-config --version \
+    && pcre-config  --version

 # install postgis
 ENV POSTGIS_VERSION master
@@ -219,6 +230,7 @@ RUN set -ex \
 # https://anonscm.debian.org/cgit/pkg-grass/postgis.git/tree/debian/rules?h=jessie
     && ./configure \
 #       --with-gui \
+        --with-pcredir="$(pcre-config --prefix)" \
     && make -j$(nproc) \
     && make install \
 # regress check
@@ -256,6 +268,7 @@ RUN set -ex \
       libtiff-dev \
       libtool \
       libxml2-dev \
+      libpcre3-dev \
       make \
       pkg-config \
       postgresql-server-dev-$PG_MAJOR \
ImreSamu commented 3 years ago

tried fixing PROJ ver to 7.2.1 as you did and building the image, fails a regression test.

ouch ... the ~ last time it was OK for me ...

pragmatic proposal: Try also locking the GEOS master in the template ( like the PROJ )

 # geos
 ENV GEOS_VERSION master
-ENV GEOS_GIT_HASH b2e7efc30a979954bf7dfa8fb252473afcd509b8
+# lock Commits on Jan 19, 2021
+ENV GEOS_GIT_HASH 98641ab14e01a6b5a6339f49fa6f1bee4424c7d0

With GEOS Commits on Jan 19, 2021 - it is building for me. ( tested only with your latest /13-master/Dockerfile )

ImreSamu commented 3 years ago

imho: this GEOS commit crash the build .. ( "Fix OverlayNG line ordering" Jan 25, 2021 )

rahul-ve commented 3 years ago

@ImreSamu locking GEOS commit fixed the build issues. I added a test case for address_standardizer but make test fails on a prior test.

error:

...
server started 
CREATE DATABASE  
/docker-entrypoint-initdb.d/10_postgis.sh: 9: /docker-entrypoint-initdb.d/10_postgis.sh: Bad substitution 

/usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/10_postgis.sh
'postgis-basics' [7/7]...  
postgres failed to accept connections in a reasonable amount of time!
docker: Error response from daemon: Cannot link to a non running container: /postgis-container-20638-15716 AS /beautiful_tharp/postgis.
failed
make: *** [Makefile:110: test-13-3.1] Error 1

and the image sizes

postgis/postgis                                13-3.1-alpine                                  84f06c9eb4a9   About an hour ago   299MB
postgis/postgis                                13-3.1                                         3bbd734b9e82   About an hour ago   491MB
postgis/postgis                                13-master_new                                  21a571a7ce22   2 hours ago         840MB
postgis/postgis                                13-master_old                                  20d6cec3613d   3 weeks ago         853MB
ImreSamu commented 3 years ago

@rahul-ve

I added a test case for address_standardizer but make test fails

strange;

it is working for me (local , Docker version 20.10.2, build 2291f61 )
and your last commit - CI make test is OK

ImreSamu commented 3 years ago

@rahul-ve

thank you ! :+1:

I have checked the code .. and I have only a minor comments ..

I am expecting

and if my theory is true ; then the pcre should be (re) added in the alpine .. like the geos

diff --git a/Dockerfile.alpine.template b/Dockerfile.alpine.template
index c9c9051..f81b21e 100644
--- a/Dockerfile.alpine.template
+++ b/Dockerfile.alpine.template
@@ -65,6 +65,7 @@ RUN set -ex \
         geos \
         gdal \
         proj \
+        pcre \
         libstdc++ \
         protobuf-c \
 # clean

and the master should be ~ like this ..

diff --git a/Dockerfile.master.template b/Dockerfile.master.template
index 9157d10..7b7bb97 100644
--- a/Dockerfile.master.template
+++ b/Dockerfile.master.template
@@ -174,7 +174,7 @@ RUN set -ex \
       libtiff5 \
       libxml2 \
       sqlite3 \
-      libpcre3-dev \
+      libpcre3 \
     && rm -rf /var/lib/apt/lists/*

 COPY --from=builder /usr/local /usr/local
@@ -187,8 +187,7 @@ RUN set -ex \
     && geos-config --version \
     && ogr2ogr --version \
     && proj \
-    && sfcgal-config --version \
-    && pcre-config  --version
+    && sfcgal-config --version

 # install postgis
 ENV POSTGIS_VERSION master
@@ -212,6 +211,7 @@ RUN set -ex \
       libgmp-dev \
       libjson-c-dev \
       libmpfr-dev \
+      libpcre3-dev \
       libprotobuf-c-dev \
       libsqlite3-dev \
       libtiff-dev \
rahul-ve commented 3 years ago

@ImreSamu

the pcre (alpine) / libpcre3 (debian - master ) is needed for the "run time" so should be included in the final docker image and the pcre-dev / libpcre3-dev only for the building.

you are correct, libpcre3 is present in the 13-3.1 image. I am using this for testing and it has libpcre3. Added them to final build

ImreSamu commented 3 years ago

@rahul-ve : Thanks!

As I see almost finished .. please run a last make update and please refresh every Dockerfile ...

reason: The debian postgis ( in the apt.postgresql.org repo ) has been updated to 3.1.1+dfsg-1.pgdg+1

rahul-ve commented 3 years ago

@ImreSamu updated to postgis 3.1.1 image sizes:

postgis/postgis                                13-3.1-alpine                                  d8fd61da54a1   20 minutes ago      299MB
postgis/postgis                                13-master                                      62ad4902000f   39 minutes ago      814MB
postgis/postgis                                13-3.1                                         519f1d912beb   2 hours ago         491MB
ImreSamu commented 3 years ago

@phillipross :

I have been checked .. looks OK ( for me ) side node: I am not fully understand why the new master size is smaller .. but the test is passed ..

my summary:

alpine changes

my docker image stat:

alpine version newMB - oldMB diff MB
10-2.5-alpine 196 - 193 +3
10-3.1-alpine 205 - 202 +3
11-2.5-alpine 286 - 281 +5
11-3.1-alpine 295 - 291 +4
12-2.5-alpine 288 - 283 +5
12-3.1-alpine 297 - 293 +4
13-3.1-alpine 299 - 294 +5
9.5-2.5-alpine 194 - 191 +3
9.5-3.0-alpine 203 - 200 +3
9.6-2.5-alpine 195 - 192 +3
9.6-3.1-alpine 204 - 201 +3

master changes

version newMB - oldMB diff MB
12-master 814 - 853 -39
13-master 814 - 853 -39
phillipross commented 3 years ago

@rahul-ve @ImreSamu all this work you've been doing looks great! I also think the size increase in the case of the alpine results are more than acceptable. Thanks also for doing this analysis.

I'm going to do a little bit of testing in one of my environments where I can do some simple smoke tests. I'll provide some feedback here If anything unexpected comes up which I'm not able to resolve, but in general all of this looks fine.

The CI checks are passing at this point, so I guess my next question is whether or not there are any other remaining changes that should go into this PR before it's merged?

ImreSamu commented 3 years ago

@phillipross :

The CI checks are passing at this point, so I guess my next question is whether or not there are any other remaining changes that should go into this PR before it's merged?

imho: nothing "important" on my side .

important note: As I see the new parallel PR ( postgis 3.1.1 update ) has been created: https://github.com/postgis/docker-postgis/pull/225 on the other hand : this changes already included in this PR. ( requested by me ; and this was important for my local build test ) -->> so "Rebasing / git rebase " is expected in this PR

phillipross commented 3 years ago

@phillipross :

The CI checks are passing at this point, so I guess my next question is whether or not there are any other remaining changes that should go into this PR before it's merged?

imho: nothing "important" on my side .

important note: As I see the new parallel PR ( postgis 3.1.1 update ) has been created: #225 on the other hand : this changes already included in this PR. ( requested by me ; and this was important for my local build test ) -->> so "Rebasing / git rebase " is expected in this PR

Yep, I just finished merging PR #225, and am in the process of rebasing this PR. The CI is a little slow to validate because of the time involved in building the master tags, but I'll sort it out and finally merge this PR after it is rebased. Thanks for the vigilance and heads up!

rahul-ve commented 3 years ago

@phillipross, @ImreSamu Thank you