postgis / docker-postgis

Docker image for PostGIS
https://hub.docker.com/r/postgis/postgis/
MIT License

ARM-64 build #216

Open derdaele opened 3 years ago

derdaele commented 3 years ago

I'm currently running an Apple Silicon machine, which uses the ARM-64 ISA. The postgis image is currently only built for amd64.

Would it be possible to export ARM-64 builds?

The current workaround is to use community built images (e.g. https://github.com/Duvel/docker-postgis).

derdaele commented 3 years ago

Note: If I build the image from this repository, I can successfully build on ARM-64. This means that the current Dockerfile(s) are already ARM-64 friendly; it's just a matter of uploading the built image to Docker Hub.

phillipross commented 3 years ago

I believe there is already an open issue around this, but it hasn't seen any activity for a few months: https://github.com/postgis/docker-postgis/issues/144

The state of things has changed a bit since then. Most notably, Travis is no longer the primary CI for the Docker images. Due to changes in the Travis CI offering, the CI system was ported over to GitHub Actions. While GitHub Actions runners are x86 only, I believe the community has developed actions that use QEMU or other hypervisors to build ARM images on x86 platforms, but I have yet to look into it.

I believe the issue I referenced above mentions Raspberry Pi explicitly, but building out this repo to work on multiple ARM platforms (Windows, Apple) should probably be the goal. The docker-postgis repo was initially built on top of the official postgres repo, but I have yet to look at how the official postgres repo is doing their ARM builds either.

Probably the next logical step should be checking out the official postgres repo to see how they're doing ARM builds, and investigating how to make GitHub Actions build ARM images.

phillipross commented 3 years ago

@derdaele also, are you actually running docker on your Apple Silicon machine? If so, I'd like to know more about how this is accomplished. Thanks!

derdaele commented 3 years ago

Yes, I'm running the developer preview [1].

I tried to look into how the official postgres image is built, and it looks like it's using the bashbrew project. I'm not sure how easy it would be to integrate this here.

However I stumbled upon this [2] GitHub action that seems to be able to do multi-arch builds.

I'll give it a try and send a PR if it works.

[1] https://github.com/docker/roadmap/issues/142#issuecomment-742795298

[2] https://github.com/docker/setup-buildx-action

phillipross commented 3 years ago

Right, while we don't necessarily need to integrate with bashbrew, what we probably want to do is analyze things that would prevent us from integrating in the future. They have a set of patterns and guidelines for the official images (a bit lengthy, actually) that we try to keep in mind as we're making changes to this repo.

If the dev preview of docker desktop for M1 is capable of running buildx, then they're further along than I thought. Seeing the progress on this is very encouraging.

I believe this GitHub action from Docker was also the one I saw for doing the multiplatform builds. We don't necessarily have to use it, but its existence means building multiplatform should be possible without having to set up any custom GitHub runner. At the moment, there's only one GitHub Actions flow in this repo. It was directly ported from Travis, and it uses shell scripts that are invoked by GitHub Actions rather than things that are tightly bound to GitHub Actions. This prevents us from committing too heavily to the CI product, and allows folks to run the same processes in local Linux VMs (or to some extent, a macOS or Windows environment) without the need for some local GitHub Actions emulation framework.

It looks as if getting buildx into the mix is the next task in tackling multiplatform builds for this repo. If you'd like to take a stab at that, it would be much appreciated!
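
As an illustration of what "getting buildx into the mix" could look like from a script that is not tied to GitHub Actions, here is a minimal Python sketch that only assembles a `docker buildx build` command line. The context directory `./13-3.1` and the tag `example/postgis:13-3.1` are hypothetical placeholders, and the sketch assumes a buildx builder with QEMU emulation is already configured on the host.

```python
import shlex


def buildx_command(context_dir, tag, platforms, push=False):
    """Return the argv for a multi-platform `docker buildx build` invocation."""
    if not platforms:
        raise ValueError("at least one target platform is required")
    cmd = [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),  # e.g. linux/amd64,linux/arm64
        "--tag", tag,
    ]
    if push:
        # Multi-platform results are normally pushed to a registry.
        cmd.append("--push")
    cmd.append(context_dir)
    return cmd


if __name__ == "__main__":
    # Hypothetical context directory and tag, for illustration only.
    argv = buildx_command(
        "./13-3.1",
        "example/postgis:13-3.1",
        ["linux/amd64", "linux/arm64"],
    )
    # Dry run: print the command instead of executing it.
    print(shlex.join(argv))
```

Because the function returns a plain argument list, the same code could be called from a CI step or from a local VM and passed to `subprocess.run` once the printed command looks right.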

derdaele commented 3 years ago

I ran a few tests and results are actually not so great.

The ARM-64 build is only successful for the following combination:

| postgres version | postgis version | variant |
|------------------|-----------------|---------|
| 12               | 2.5             | default |
| 12               | 3               | default |
| 13               | 3               | default |

I identified two error sources:

However the workflow is somewhat cleaner than the current setup (cf https://github.com/derdaele/docker-postgis/blob/master/.github/workflows/main.yml) as it doesn't require maintaining parts of the current Makefile.

I'm happy to start a PR with this workflow and manually include in the matrix the combinations above to have some ARM build if you are good with that.

phillipross commented 3 years ago

@derdaele It's actually good news that ANY builds worked on a github actions runner! It indicates that building arm64 will be possible within the CI. Good work, and thanks!

I think it's already been identified that arm64 packages on debian aren't available for some postgres+postgis combinations, but there's not much that can be done about that since that is handled upstream. I think we can simply exclude them from the build on a per-version-combination basis.
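
Excluding builds "on a per-version-combination basis" could be sketched roughly as follows. This is only an illustration in Python: the allow-list contains the three combinations reported above as building on ARM-64, while the extra candidate `11 / 3.0 / alpine` and the dictionary layout are assumptions made for the example, not the repo's actual build matrix.

```python
from typing import NamedTuple


class Combo(NamedTuple):
    postgres: str
    postgis: str
    variant: str


# Combinations reported earlier in this thread as building on ARM-64.
ARM64_OK = {
    Combo("12", "2.5", "default"),
    Combo("12", "3", "default"),
    Combo("13", "3", "default"),
}


def platforms_for(combo):
    """amd64 is always built; arm64 is added only for known-good combinations."""
    platforms = ["linux/amd64"]
    if combo in ARM64_OK:
        platforms.append("linux/arm64")
    return platforms


def build_matrix(combos):
    """Expand combinations into matrix entries with a per-entry platform list."""
    return [
        {
            "postgres": c.postgres,
            "postgis": c.postgis,
            "variant": c.variant,
            "platforms": ",".join(platforms_for(c)),
        }
        for c in combos
    ]


if __name__ == "__main__":
    # The alpine entry is a hypothetical candidate that is not on the allow-list.
    candidates = [Combo("13", "3", "default"), Combo("11", "3.0", "alpine")]
    for entry in build_matrix(candidates):
        print(entry)
```

An allow-list keeps the default safe: a new combination is built for amd64 only until someone confirms that the ARM-64 build works.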

As for alpine, if memory serves, there were reports of successful builds on alpine on raspberry pis (issue #144) but they may be outdated. And I can't remember if this was for the "master branch" builds, release builds, or some other combination. There may be some more work necessary to get it working.

What would be ideal is a prescribed (scriptable) way to install and configure QEMU on an x86 Ubuntu environment, along with buildx and any other tooling needed to build and push ARM images to Docker Hub. This would allow us to script it on a VM, add it to the setup scripts for GitHub Actions, then write or modify GitHub Actions workflow(s) to build the images. Contributors would also be able to use the same scripts in a local environment to accomplish the same thing. This will especially help people who want to contribute PRs without breaking the CI builds too much.

So I guess my immediate question is: would you be willing to attempt to tackle a shell script method of accomplishing docker/buildx/qemu installation... basically duplicating what github actions does with the actions/setup and/or docker buildx setup actions? Integration with the Makefile or templates is not necessary, but just something that is runnable from a shell script.

phillipross commented 3 years ago

@derdaele unless you've already started messing with shell scripts, I'd say it's not actually necessary. There's a little more to it than I thought, so I won't ask you to bother with it.

What I'm seeing now is that the github actions runners might be setup by default with a docker environment that is not sufficient for doing multiplatform builds, so some experimentation is needed to find a good way to get the environment setup. The github action that docker provides to do this seems to accomplish this, but it would be more beneficial to have the setup encapsulated in a script that people can use in their own VMs outside of the github actions environment. I'm in the process of working all this out now, so I won't ask you to duplicate the effort.

When I do get these multiplatform images building, I'll push to a test repo and let you know so you can test on the M1 platform. I signed up for the Docker development program in hopes of getting access to their M1 builds, but I'm not guaranteed to get approved and get access, so I may not be able to test for myself.

derdaele commented 3 years ago

Hey @phillipross, I did not start working on the shell script already.

SGTM, I'd be happy to give a try to the test images.

kohenkatz commented 3 years ago

I have been trying to troubleshoot the ARM64 builds failing in buildx/QEMU for 11-3.0-alpine (which just got changed to 3.1, but I don't think that matters), and I discovered something very interesting - it works perfectly when building natively on Amazon Graviton (t4g.large) and on Raspberry Pi (4B 2G running a Raspberry Pi OS 64-bit beta from August 2020) but fails in buildx/QEMU (on x86_64) with the following error:

...
#11 1042. /usr/bin/perl -pi -e 's,\$libdir,/usr/src/postgis/regress/00-regress-install/lib,g' /usr/src/postgis/regress/00-regress-install/share/contrib/postgis/*.sql
#11 1043. #/usr/bin/make -C ../loader REGRESS=1 DESTDIR=/usr/src/postgis/regress/00-regress-install install
#11 1043. /usr/bin/make -C core check
#11 1043. make[1]: Entering directory '/usr/src/postgis/regress/core'
#11 1043. /usr/bin/perl ../run_test.pl --extension ../loader/Point ../loader/PointM ../loader/PointZ ../loader/MultiPoint ../loader/MultiPointM ../loader/MultiPointZ ../loader/Arc ../loader/ArcM ../loader/ArcZ ../loader/Polygon ../loader/PolygonM ../loader/PolygonZ ../loader/TSTPolygon ../loader/TSIPolygon ../loader/TSTIPolygon ../loader/PointWithSchema ../loader/NoTransPoint ../loader/NotReallyMultiPoint ../loader/MultiToSinglePoint ../loader/ReprojectPts ../loader/ReprojectPtsD ../loader/ReprojectPtsGeog ../loader/ReprojectPtsGeogD ../loader/Latin1 ../loader/Latin1-implicit ../loader/mfile ../dumper/literalsrid ../dumper/realtable ../dumper/nullsintable ../dumper/null3d affine bestsrid binary boundary chaikin filterm cluster concave_hull ctors curvetoline dump dumppoints empty estimatedextent forcecurve geography geometric_median hausdorff in_geohash in_gml in_kml in_encodedpolyline iscollection legacy long_xact lwgeom_regress measures minimum_bounding_circle normalize operators orientation out_geometry out_geography polygonize polyhedralsurface postgis_type_name quantize_coordinates regress regress_bdpoly regress_buffer_params regress_gist_index_nd regress_index regress_index_nulls regress_management regress_selectivity regress_lrs regress_ogc regress_ogc_cover regress_ogc_prep regress_proj relate remove_repeated_points removepoint reverse setpoint simplify simplifyvw size snaptogrid split sql-mm-serialize sql-mm-circularstring sql-mm-compoundcurve sql-mm-curvepoly sql-mm-general sql-mm-multicurve sql-mm-multisurface swapordinates summary temporal temporal_knn tickets twkb typmod wkb wkt wmsservers offsetcurve relatematch isvaliddetail sharedpaths snap node unaryunion clean relate_bnr delaunaytriangles clipbybox2d subdivide voronoi regress_brin_index regress_brin_index_3d regress_brin_index_geography minimum_clearance oriented_envelope point_coordinates out_geojson frechet geos38 in_geojson regress_spgist_index_2d regress_spgist_index_3d regress_spgist_index_nd 
mvt mvt_jsonb geobuf
#11 1044. PATH is /usr/local/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#11 1044. Checking for shp2pgsql ... found
#11 1044. Checking for pgsql2shp ... found
#11 1044. TMPDIR is /tmp/pgis_reg
#11 1044. Creating database 'postgis_reg'
#11 1046. Preparing db 'postgis_reg' using: CREATE EXTENSION postgis
#11 1053. PostgreSQL 11.10 on aarch64-unknown-linux-musl, compiled by gcc (Alpine 9.3.0) 9.3.0, 64-bit
#11 1053.   Postgis 3.0.3 - r0 - 2020-12-21 06:58:19
#11 1053.   scripts 3.0.3 0
#11 1053.   GEOS: 3.8.1-CAPI-1.13.3
#11 1053.   PROJ: 7.0.1
#11 1053.
#11 1053. Running tests
#11 1053.
#11 1053.  ../loader/Point ...2020-12-21 07:15:40.054 UTC [23208] PANIC:  stuck spinlock detected at LWLockWaitListLock, lwlock.c:833
#11 1189. 2020-12-21 07:15:40.054 UTC [23208] STATEMENT:  COMMIT;
#11 1189. qemu: uncaught target signal 6 (Aborted) - core dumped
#11 1189. 2020-12-21 07:15:40.062 UTC [22616] LOG:  server process (PID 23208) was terminated by signal 6: Aborted
#11 1189. 2020-12-21 07:15:40.062 UTC [22616] DETAIL:  Failed process was running: COMMIT;
#11 1189. 2020-12-21 07:15:40.063 UTC [22616] LOG:  terminating any other active server processes
#11 1189. 2020-12-21 07:15:40.065 UTC [22628] WARNING:  terminating connection because of crash of another server process
#11 1189. 2020-12-21 07:15:40.065 UTC [22628] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
#11 1189. 2020-12-21 07:15:40.065 UTC [22628] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
#11 1189. 2020-12-21 07:15:40.065 UTC [23144] WARNING:  terminating connection because of crash of another server process
#11 1189. 2020-12-21 07:15:40.065 UTC [23144] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
#11 1189. 2020-12-21 07:15:40.065 UTC [23144] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
#11 1189.  failed ( wkt test: running shp2pgsql output: /tmp/pgis_reg/loader.err)
#11 1189. 2020-12-21 07:15:40.066 UTC [23144] LOG:  could not send data to client: Broken pipe
#11 1189. 2020-12-21 07:15:40.073 UTC [22616] LOG:  all server processes terminated; reinitializing
#11 1189. 2020-12-21 07:15:40.098 UTC [23214] LOG:  database system was interrupted; last known up at 2020-12-21 07:13:17 UTC
#11 1189. 2020-12-21 07:15:40.128 UTC [23216] FATAL:  the database system is in recovery mode
#11 1189. psql: error: FATAL:  the database system is in recovery mode
#11 1189. Can't return outside a subroutine at ../run_test.pl line 519.
#11 1189.  failed (PostGIS object count pre-test () != post-test (8500))
#11 1189. make[1]: *** [Makefile:231: check] Error 22
#11 1189. make[1]: Leaving directory '/usr/src/postgis/regress/core'
#11 1189. make: *** [Makefile:47: check-regress] Error 2

I wonder what the significance of this is, but I don't really know enough about the PostGIS build process to have a good guess yet.

phillipross commented 3 years ago

I'm seeing the same thing, and I believe it has something to do with QEMU. Thanks for the extra data point, though. Thus far I've been trying with an M1 running Ubuntu 20.04.01 in Parallels, and I wasn't sure if it was an issue with Parallels (it's a preview build) or something else.

As it stands, I'm able to build, test, and run natively, but trying to do cross-platform with QEMU continues breaking at lower levels that I'm not so familiar with. I've thought about trying to build QEMU from source and seeing if that makes a difference.

notmartinnot commented 3 years ago

Any updates on this issue?

danilo-znamerovszkij commented 3 years ago

https://github.com/docker/for-mac/issues/5122

A comment from stephen-turner:

This is a qemu bug, which is the upstream component we use for running Intel (amd64) containers on M1 (arm64) chips, and is unfortunately not something we control. In general we recommend running arm64 containers on M1 chips because (even ignoring any crashes) they will always be faster and use less memory.

Please encourage the author of this container to supply an arm64 or multi-arch image, not just an Intel one. Now that M1 is a mainstream platform, we think that most container authors will be keen to do this.

kohenkatz commented 3 years ago

Unfortunately, saying "it's a QEMU bug so use a multi-arch image" without providing any details is not useful, considering that docker buildx requires using QEMU to do the multi-arch build in the first place.

phillipross commented 3 years ago

The comment for https://github.com/docker/for-mac/issues/5122 by @stephen-turner assumes the context only concerns running the containers, but the more pressing issue we're discussing here is actually building the container images themselves. The qemu bug is what is preventing the multiarch images from easily being built within the existing CI/CD framework being used by this repo.

This repo currently uses GitHub Actions to build the containers and deploy them to Docker Hub. GitHub Actions currently only offers amd64 platforms for builds, so building the arm64 images requires QEMU. It's possible to configure GitHub Actions to use remote runners on alternate platforms (such as AWS instances or self-hosted environments), but there's a lot of complexity involved in doing that in a robust way.

I was hoping the QEMU bug(s) would be resolved soonish, or that GitHub Actions would begin offering ARM-based runners, but it seems like it's taking more time than I'd hoped 😕

MonsieurMan commented 3 years ago

Alternative workaround

This applies until a solution is found for building an arm64 image with GitHub Actions.

For those like me who do not like to depend upon a community image, another workaround is to simply build the image locally. Either clone this repo, or copy the Dockerfile along with initdb-postgis.sh and update-postgis.sh for the version you're interested in, and then run docker build.
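
The local-build workaround can be sketched as a small Python helper that checks the three files named above are present and then assembles a plain `docker build` command. The directory `./13-3.1` and the tag `postgis-local:13-3.1` are hypothetical placeholders; the script only prints the command rather than running it.

```python
import shlex
from pathlib import Path

# Files the workaround above says to copy for the chosen version.
REQUIRED_FILES = ("Dockerfile", "initdb-postgis.sh", "update-postgis.sh")


def missing_files(version_dir):
    """Return the required files that are absent from version_dir."""
    base = Path(version_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


def local_build_command(version_dir, tag):
    """Return the argv for a native `docker build` of one version directory."""
    missing = missing_files(version_dir)
    if missing:
        raise FileNotFoundError(f"missing in {version_dir}: {', '.join(missing)}")
    return ["docker", "build", "--tag", tag, str(version_dir)]


if __name__ == "__main__":
    # Hypothetical directory and tag, for illustration only.
    version_dir = "./13-3.1"
    missing = missing_files(version_dir)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print(shlex.join(local_build_command(version_dir, "postgis-local:13-3.1")))
```

On an Apple Silicon machine this produces a native arm64 image, since `docker build` targets the host architecture by default.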

SystemOfaDrow commented 3 years ago

It doesn't look like there have been any updates on the GitHub Actions side for a while, so I wanted to bump this. I was having the same issues as other people with Apple Silicon until I downloaded the files for the version I needed and built my own container image.

phillipross commented 3 years ago

Unfortunately it's still the case that getting the ARM images going in an automated fashion isn't possible with GitHub Actions. Running the ARM images for PostgreSQL on Intel platforms still doesn't work very well, and GitHub Actions only has runners for the Intel platform at the moment. When they roll out the ARM runners, the major hurdles will be cleared.

For the time being, folks wanting to run on apple silicon will need to build them locally. We've been holding off on documenting this since it was not clear when the ARM runners would be made available, but it's been 6 or 8 months now that folks have been adopting apple silicon setups so it's probably time to get going on the documentation 😃

irbrad commented 3 years ago

Unfortunately it's still the case that getting the ARM images going in an automated fashion isn't possible with GitHub Actions. Running the ARM images for PostgreSQL on Intel platforms still doesn't work very well, and GitHub Actions only has runners for the Intel platform at the moment. When they roll out the ARM runners, the major hurdles will be cleared.

Would a self-hosted runner be an option in the meantime?

Documentation is always a good thing to keep up to date.

phillipross commented 3 years ago

Would a self-hosted runner be an option in the meantime?

I'm not sure how it could be done securely in an automated fashion. The runner would need to be hosted in a trusted environment and ideally on a reliable host/service.

Documentation is always a good thing to keep up to date.

Absolutely! In this case, the documentation would be targeted specifically at folks who would like to build the images on Apple Silicon. And it would only be temporary, until environments for building native ARM images become available. I guess it could be considered writing documentation for a temporary workaround.

cbaker6 commented 3 years ago

Actions allows you to build cross platform via https://github.com/docker/build-push-action

You can use my example file for reference: https://github.com/netreconlab/parse-hipaa/blob/main/.github/workflows/docker-publish.yml

I recently used it to build: https://registry.hub.docker.com/repository/docker/netreconlab/parse-hipaa/tags?page=1&ordering=last_updated

phillipross commented 3 years ago

Thanks @cbaker6 but the problem is that the cross platform build uses qemu which currently doesn't function properly with postgres. I personally haven't tested it in a few months, but it may be time to test again to see if newer qemu updates might have solved the problem.

sadams commented 3 years ago

@phillipross any specific issues with qemu? FWIW we have been building our own images as described by @cbaker6 for a few weeks and haven’t noticed anything wrong, but maybe we aren’t using a specific feature of postgis which manifests the problems…

kohenkatz commented 3 years ago

@sadams build errors like the one from December 2020 that I posted earlier in this thread: under buildx/QEMU, the PostGIS regression tests abort with "PANIC:  stuck spinlock detected at LWLockWaitListLock, lwlock.c:833". I have not had time to try again since then.

cbaker6 commented 3 years ago

@phillipross @sadams @kohenkatz looking at my hipaa-postgres image, it appears many of the QEMU issues may have been resolved. With postgis 3.1.4, I can build for all the platforms that the postgres docker image builds for (including linux/arm64/v8), except linux/s390x, linux/arm/v5, linux/arm/v6, and linux/arm/v7. Note that the hipaa-postgres image typically depends on postgis (this repo), but I modified my docker file to build postgis directly, just like the docker file in this repo.

The architectures that aren't building all seem to hit the same issue, E: Unable to locate package postgresql-13-postgis-3, which can be seen here. My assumption is that the apt repositories for those architectures either don't have postgis available or the packages aren't built the same way as for the others. I didn't try to build the non-working architectures with postgis <=3.1.3, as they also didn't have pgaudit available, which my image needs.

Here's the yml file I used to build in actions.

phillipross commented 3 years ago

@cbaker6 Thanks, this is encouraging! I'll swing back around and try testing with latest versions.

niccolomineo commented 3 years ago

I used to be able to run postgis only with the gangstead/postgis:13-3.1-arm image. I am testing the postgis/postgis:13-3.1 image these days and, so far, so good. I'll report any issues that come up.

edit: nope, still crashing with Django tests.

odidev commented 2 years ago

Hi Team,

I am building the mdillon/postgis image for both AMD64 and ARM64 platforms. I have modified the Makefile and the .github/workflows/main.yml file to release the docker image for some of the latest versions of mdillon/postgis for both platforms using buildx. However, building and deploying the docker image takes a lot of time, and as a result the build is failing for the master versions.

Dockerhub image link: https://hub.docker.com/repository/docker/odidev/postgis/tags?page=1&ordering=last_updated

Changes required: https://github.com/odidev/docker-postgis/commit/836936afba75a8589a3bf92b45ad4e772ba50af3

Do you have any plans for releasing arm64 images? It will be very helpful if an arm64 image is available. If interested, I will raise a PR.

phillipross commented 2 years ago

@odidev the demand for arm64 images is relatively high, so we do have plans to eventually release when/if qemu issues are worked out or github begins providing arm runners for github actions. If you can provide a PR, that might be helpful, at least as a starting point. Thanks!

alanivey commented 2 years ago

If it helps anyone else until this is resolved, we are building multi-arch images every week from the upstream Debian version and posting them to GHCR: https://github.com/baosystems/docker-postgis/pkgs/container/postgis . The following are available:

You can see at https://github.com/baosystems/docker-postgis/blob/multiarch/.github/workflows/multiarch.yml that all we're doing is using the latest master branch code from this project and building the Dockerfile that installs PostGIS from Debian packages.

wingback commented 2 years ago

Hey, is a self-hosted ARM runner acceptable for solving this issue? Check this Github Actions: Self-hosted runners on ARM architectures

phillipross commented 2 years ago

Self-hosted may be the way to go, but we had been hoping to avoid the extra overhead in complexity/maintenance and wait for GitHub Actions to roll out ARM builders. It's now becoming evident that it's taking longer for this feature to become available than we thought. The security implications that self-hosted runners bring into the mix aren't exactly trivial, let alone finding solid infrastructure to host the runners on. I don't think cloud providers' free tiers would be enough to host the runners, so we'd have to piggyback on someone else's accounts or something. Some discussions with OSGeo folks might be in order.

b1rdex commented 2 years ago

Doesn't GitHub already have everything we need to do the build? See https://github.com/postgis/docker-postgis/issues/216#issuecomment-981824739 and https://github.com/baosystems/docker-postgis/blob/multiarch/.github/workflows/main.yml?rgh-link-date=2021-11-29T16%3A57%3A44Z particularly.

phillipross commented 2 years ago

Doesn't GitHub already have everything we need to do the build? See #216 (comment) and https://github.com/baosystems/docker-postgis/blob/multiarch/.github/workflows/main.yml?rgh-link-date=2021-11-29T16%3A57%3A44Z particularly.

The provided actions rely on buildx's ability to utilize qemu to emulate arm64 on the x86_64 platform. This works for many things, but there are some images in which it hangs or errors out for various reasons. The baosystems repo is a scaled down version of our repo. Here we provide older supported combinations of postgres with postgis as well as alpine.

We have begun working on an alternate build system which could be used to isolate out older images and mix and match build platforms on a per postgres version, postgis version, and underlying os. Said system would then feed the correct platform parameters to the buildx action and bypass combinations that are problematic. No ETA on this though 😬

xmath279 commented 2 years ago

Self-hosted may be the way to go, but we had been hoping to avoid the extra overhead in complexity/maintenance and wait for GitHub Actions to roll out ARM builders. It's now becoming evident that it's taking longer for this feature to become available than we thought. The security implications that self-hosted runners bring into the mix aren't exactly trivial, let alone finding solid infrastructure to host the runners on. I don't think cloud providers' free tiers would be enough to host the runners, so we'd have to piggyback on someone else's accounts or something. Some discussions with OSGeo folks might be in order.

I don't know if it can help, but I'm pretty sure Oracle Cloud free Ampere VM would work fine for that.

Ampere A1 Compute instances (Arm processor): All tenancies get the first 3,000 OCPU hours and 18,000 GB hours per month for free for VM instances using the VM.Standard.A1.Flex shape, which has an Arm processor. For Always Free tenancies, this is equivalent to 4 OCPUs and 24 GB of memory.
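
As a quick sanity check of the quoted allowance, the arithmetic below (a Python sketch that assumes a 31-day month, the worst case for an always-on runner) shows that 3,000 OCPU hours and 18,000 GB hours per month cover roughly 4 OCPUs and 24 GB running continuously.

```python
# Longest possible month, in hours (assumption: 31 days, always-on instance).
HOURS_IN_LONGEST_MONTH = 31 * 24


def always_on_capacity(free_hours_per_month, hours=HOURS_IN_LONGEST_MONTH):
    """Units (OCPUs or GB) that can run for the whole month within the quota."""
    return free_hours_per_month / hours


if __name__ == "__main__":
    ocpus = always_on_capacity(3000)       # quoted free OCPU hours per month
    memory_gb = always_on_capacity(18000)  # quoted free GB hours per month
    print(f"always-on OCPUs: {ocpus:.2f}")
    print(f"always-on memory (GB): {memory_gb:.2f}")
```

Both results land just above the "4 OCPUs and 24 GB" figure in the quote, so a single always-on runner of that size would stay within the free allowance as quoted.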

phillipross commented 2 years ago

@xmath279 that's actually worth looking into, thanks!

kuzmich321 commented 2 years ago

It's almost 2022. Any update on this?

UPD: @odidev You're the lifesaver

UPD v2.0: Ever since we started using postgis, my test runs had been taking about 20 minutes. Strangely, none of my co-workers ever experienced it, because no one but me had an M1. After digging a bit, it turned out that our poor db transaction handling was the issue; after fixing those transactions, my tests went back to their normal execution time of 2-3 minutes or so.

bencooper222 commented 2 years ago

For what it's worth, I've had no trouble building any of the images using Docker's built-in QEMU emulation. The alpine images take many, many hours to build (emulation is slow!) but it does work.

Something I've had to do to make it work is remove all mentions of "quiet" logging; otherwise CircleCI (and I imagine any CI platform) will exit after 10 minutes of no output when it's really just a ~2 hour apt-get command.
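A minimal sketch of that kind of cleanup, assuming the quiet logging comes from standalone `-q`/`-qq`/`--quiet` flags on `apt-get` lines in a Dockerfile. The helper name `strip_quiet_flags` is made up for illustration, and other tools in the build may use different flags that this does not touch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop standalone quiet flags from apt-get lines so a
# long-running install keeps printing output and CI does not time out.
# Reads the named file(s), or stdin if none are given; writes to stdout.
strip_quiet_flags() {
  # Only lines mentioning apt-get are edited. A flag is removed when it is a
  # separate token (-q, -qq, --quiet); combined flags such as -qqy are left
  # alone, as is a second quiet flag immediately following another one.
  sed -E '/apt-get/ s/ +(-q+|--quiet)( |$)/\2/g' "$@"
}

# Example run on a single Dockerfile-style line.
printf 'RUN apt-get -qq update && apt-get install -qq -y curl\n' | strip_quiet_flags
```

Writing the result to stdout rather than editing in place makes it easy to diff against the original Dockerfile before committing to the change.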

marianhlavac commented 2 years ago

It's already 2022. Any update on this?

RobSchilderr commented 2 years ago

Keep me updated

dgaitan commented 2 years ago

April 20. Any updates on this?

Komzpa commented 2 years ago

Hi @marianhlavac @RobSchilderr @dgaitan and the other people liking posts on this thread,

PostGIS is a volunteer open source project. You are commenting on a ticket on a github repo that contains the code used to build and push the images. We can accept a patch from you that will implement the build process for ARM-64 on github actions.

If this blocks your work somehow and you're looking to get it implemented faster than a volunteer open source developer with a shiny new macbook would, then you can take a look at https://postgis.net/support/ for the list of PostGIS support vendors that can implement this on a commercial basis.

n0rdlicht commented 2 years ago

@Komzpa happy to donate a self-hosted runner, either personally or via @cividi, or to set up a fund via OpenCollective/GitHub Sponsors... However, the discussion above seemed fairly ideological and doesn't provide any guidance on what a compliant self-hosted runner would need to fulfil to be deemed worthy. Could you elaborate on what "security implications" need to be solved/met for this to be an option?

E.g. we could easily provide a git-ops based, argo-cd/argo-workflow managed host, run by the community in a kubernetes cluster with the provider of your choice, e.g. Exoscale, Scaleway, ungleich.ch, AWS, Azure, Google, ...

phillipross commented 2 years ago

@Komzpa thanks for adding context and clarifying

@n0rdlicht Thanks for offering to help move this forward. All options are open for discussion, community involvement/assistance is greatly appreciated! In prior comments I've personally made regarding "security implications" I was more or less referring to ANY security implications pertaining to distributed CI workflows to any cloud providers (github, aws, dockerhub, et al). It might be more constructive to discuss these topics via mailing list or discussion forum rather than a github issue, as I believe many people may need to provide input on various concerns.

Would it be possible for you to provide a proposal for how we might be able to leverage the various methods/frameworks/services/solutions you are mentioning? It might be a matter of educating people on what gitops, argo, and kubernetes clusters are or where they fit in, but the information would be truly appreciated. And a proposal outlining how funding options might work would probably also help. Thanks to all!

Komzpa commented 2 years ago

The simplest way to solve this is for someone to step up on the postgis-devel mailing list (https://lists.osgeo.org/mailman/listinfo/postgis-devel) and volunteer to become maintainer of the arm build here. If the Project Steering Committee is ok with your candidacy, we'll just grant you enough access rights to do it any way you see fit.

marianhlavac commented 2 years ago

For those who just arrived at this issue: I took the effort to read the thread again carefully, and from what I understand, this is the current situation (correct me if I'm wrong):

But taking all these points into consideration, there are numerous community images that successfully build the latest versions of PostGIS for ARM on GHA.

So generally it seems that the main issue here is that there are incompatible version combinations and it seems that this project aims to provide all of them built for ARM arch, which leads me to question if we really need all version combinations to be built for ARM.

What prevents us from defining which version combinations cannot be built using QEMU, filtering them out of the build step, and stating in the README that not all versions are provided for all architectures? I think most people evidently want the latest versions and are satisfied with the currently available community images (otherwise this issue would be much more heated), which are obviously possible to build, since the images exist.

phillipross commented 2 years ago

@marianhlavac Thank you for taking the time to re-summarize, and I'd say your list of bullet points articulates the primary points/concerns.

A proposal that @ImreSamu, myself, and others have been discussing involves reworking the build system to have a finer-grained configuration mechanism similar to the latest versions of the docker-library postgres images. This involves bash scripts using the jq utility to parse a versions.json file containing declarative configuration, from which all the various dockerfiles, init scripts, test scripts, etc. are built from templates. This would allow explicitly specifying the build platforms for each image, so that only the version+variant combinations known to work would be enabled.
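To make the idea concrete, here is a small sketch of how per-image platforms might be read out of such a file with jq. The JSON layout, the `platforms` field name, the entries, and both function names are assumptions made up for this example; they are not the schema used by docker-library or by this repo. It needs `jq` to be installed.

```shell
#!/usr/bin/env bash
# Hypothetical versions.json: layout and entries are illustrative only.
versions_json="$(mktemp)"
cat > "$versions_json" <<'EOF'
{
  "14-3.2": {
    "debian": { "platforms": ["linux/amd64", "linux/arm64"] },
    "alpine": { "platforms": ["linux/amd64"] }
  }
}
EOF

# Print the comma-separated platform list for one version and variant,
# in the form accepted by `docker buildx build --platform`.
platforms_from_json() {
  jq -r --arg v "$1" --arg variant "$2" \
    '.[$v][$variant].platforms | join(",")' "$versions_json"
}

# List every version/variant pair that is enabled for a given platform.
images_for_platform() {
  jq -r --arg p "$1" '
    to_entries[]
    | .key as $version
    | .value | to_entries[]
    | select(any(.value.platforms[]; . == $p))
    | "\($version)/\(.key)"
  ' "$versions_json"
}
```

With this sample data, `platforms_from_json 14-3.2 debian` yields both platforms, while `images_for_platform linux/arm64` lists only `14-3.2/debian`, so the alpine variant is simply never handed to the arm64 build.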

mylaluna commented 2 years ago

For the record, I tried to install PostGIS manually in GitHub Actions on top of a general postgresql 14.4 docker image, which can be either arm or amd. The following is the action:

    - name: Install PostGIS
      run: |
        sudo apt-get update && sudo apt-get install -y --no-install-recommends postgresql-14-postgis-3=3.2.1+dfsg-1.pgdg110+1 postgresql-14-postgis-3-scripts
        sudo rm -rf /var/lib/apt/lists/*

This installation succeeded on my M1 Mac but failed on GitHub Actions.

The error is:

    E: Unable to locate package postgresql-14-postgis-3
    E: Unable to locate package postgresql-14-postgis-3-scripts

It seems to me that apt-get update on the GitHub Ubuntu runner is not able to reach the repository that contains the PostGIS package.

phillipross commented 2 years ago

@mylaluna this is an interesting datapoint. When you test on the M1, are you using Docker Desktop for MacOS? or using a hypervisor like Parallels, QEMU, or Multipass?

mylaluna commented 2 years ago

@mylaluna this is an interesting datapoint. When you test on the M1, are you using Docker Desktop for MacOS? or using a hypervisor like Parallels, QEMU, or Multipass?

Docker Desktop.