Closed davidbarratt closed 7 months ago
Just finished setting up an AWS EC2 t4g instance - one of their latest and cheapest instance types - only to run into the same problem.
We'd be happy to accept a PR for this, I don't have an ARM Mac so I can't test this unfortunately but we should be able to set it up in the CI env.
All Pelias Docker images extend from the baseimage so once that's configured to do either multiarch or a second baseimage is generated for ARM then we can configure the CI to build them and begin testing the application components.
I suspect that major software vendors such as nodejs/elasticsearch with large teams will already support ARM but things like libpostal might require some work.
Tim Cook himself said it would take some time for ARM to be supported everywhere so we'll need some help from the Pelias community to get it working and tested!
Hey folks,
As a Macbook M1 owner, I wouldn't mind spending a little bit of my personal time making Pelias work on ARM :)
There's going to be a bunch of aspects to this work, and I'll try to list a general overview here. A big thanks to @jlowe000 for his work apparently getting all of Pelias to work on ARM, and his accompanying blog post https://redthunder.blog/2021/07/04/daysofarm-12-of-x/. Definitely points us in the right direction.
As mentioned in https://github.com/pelias/docker/pull/263, we'll want to upgrade the default Elasticsearch image to 7.10.2 or newer to get ARM compatibility. It sounds like this alone might be enough to get Pelias to at least work on an M1 Mac. Lots of things will still use Rosetta 2 emulation and be slow and battery draining, but it's a start.
amd64
We hardcode amd64 download links for various dependencies across the project, for example the Polylines importer Dockerfile:
https://github.com/pelias/polylines/blob/cb0b382af7c5bd8cfe3b607c6b35f6f7417b24bc/Dockerfile#L8
I'd love to hear what best practices there are for this. I think buildx (more on that below) will provide an ARCH environment variable, can we use that?
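One way to avoid a hardcoded amd64 link is sketched below, with some assumptions. buildx exposes the target architecture as TARGETARCH (a Dockerfile needs an `ARG TARGETARCH` line before a RUN step can read it), and `uname -m` is used here as a fallback outside of a buildx build. The function names and the example.com URL layout are hypothetical and only illustrate the idea.

```shell
#!/bin/sh
# normalize_arch prints a Docker-style architecture name (amd64, arm64, arm)
# for either a TARGETARCH value or a raw `uname -m` string.
normalize_arch() {
  case "$1" in
    x86_64|amd64)      echo "amd64" ;;
    aarch64|arm64)     echo "arm64" ;;
    armv7l|armv7|arm)  echo "arm" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# asset_url builds a download link for the given architecture.
# The URL layout is hypothetical, for illustration only.
asset_url() {
  arch="$(normalize_arch "$1")" || return 1
  echo "https://example.com/releases/tool-linux-${arch}.tar.gz"
}

# Prefer TARGETARCH when buildx provides it, otherwise ask the running machine.
asset_url "${TARGETARCH:-$(uname -m)}"
```

Failing loudly on an unknown architecture seems safer than silently downloading an amd64 binary that only runs under emulation.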
pbf2json
We'll either need to add arm binaries to anything that uses pbf2json (interpolation, openstreetmap), or start compiling from source during Docker builds on arm.
buildx
If we want Docker images to be built for the project with arm support by default then we might want to look at Docker multi-arch builds, presumably using buildx. I haven't had a ton of luck with this yet, but I'm sure it can be done.
Let me know if I'm missing anything!
Hi @orangejulius, I've been able to get a "version" of this up and going.
I wouldn't say it's been fully tested but it's been very stable for the queries and workloads that I've been working with. The repos are checked in and forked from the pelias versions. The last time that I did a fetch was working on the issue with the interpolation.
A couple of things that I had to update.
There was a comment here that pelias on arm64 worked fine - https://github.com/isaacs/nave/pull/111#issuecomment-900422483. I'm not sure whether they were referring to their repo or pelias.
There are definitely some areas of concern that I hadn't had time to look into deeply. The main one is the Valhalla stack.
Regarding pbf2json, we are already building pbf2json.linux-arm and bundling it in the npm module:
path.join(__dirname, 'build', util.format( 'pbf2json.%s-%s', os.platform(), os.arch() ) )
/tmp ❯ ls -lah node_modules/pbf2json/build
total 22792
drwxr-xr-x 6 peter wheel 192B Nov 4 10:02 .
drwxr-xr-x 7 peter wheel 224B Nov 4 10:02 ..
-rwxr-xr-x 1 peter wheel 5.8M Oct 26 1985 pbf2json.darwin-x64
-rwxr-xr-x 1 peter wheel 1.7M Oct 26 1985 pbf2json.linux-arm
-rwxr-xr-x 1 peter wheel 1.9M Oct 26 1985 pbf2json.linux-x64
-rwxr-xr-x 1 peter wheel 1.8M Oct 26 1985 pbf2json.win32-x64
Do we need pbf2json.darwin-arm? I'm assuming not, since it will run in a Linux Docker container. If we do, please open an issue on that repo and I'll have a look at how much work it is.
I'm not 100% sure. But I think I had issues with the prebuilt arm version as I am running on 64bit arm.
It was simple enough https://github.com/pelias/pbf2json/pull/107
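A possible explanation for the 64-bit trouble, offered as a sketch rather than a confirmed diagnosis: the lookup builds the file name from Node's os.platform() and os.arch(), and on a 64-bit ARM Linux host os.arch() reports arm64, not arm. The helper names below are hypothetical; they mimic that naming in shell so the resolved file name can be compared with the build directory listing.

```shell
#!/bin/sh
# node_arch maps a `uname -m` string to the name Node's os.arch() reports.
node_arch() {
  case "$1" in
    x86_64)        echo "x64" ;;
    aarch64|arm64) echo "arm64" ;;
    armv6l|armv7l) echo "arm" ;;
    *)             echo "$1" ;;
  esac
}

# node_platform maps a `uname -s` string to the name Node's os.platform() reports.
node_platform() {
  case "$1" in
    Linux)  echo "linux" ;;
    Darwin) echo "darwin" ;;
    *)      echo "$1" ;;
  esac
}

# pbf2json_binary prints the file name the 'pbf2json.%s-%s' lookup resolves to.
pbf2json_binary() {
  printf 'pbf2json.%s-%s\n' "$(node_platform "$1")" "$(node_arch "$2")"
}

pbf2json_binary Linux aarch64   # pbf2json.linux-arm64 (absent from the listing)
pbf2json_binary Linux armv7l    # pbf2json.linux-arm
```

If that reading is right, a 32-bit pbf2json.linux-arm build would not be picked up on arm64 hosts, and a dedicated linux-arm64 binary would be needed.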
Add ARM builds for all Docker images.
Does the scope of this issue include derivatives/auxiliaries like pelias/libpostal-service?
Yes, I think it's all-or-nothing, partial support isn't particularly helpful.
We would be happy to accept contributions, I doubt this will be picked up by the core team as we don't have a need for it.
@missinglink could you point me to the CI that builds and pushes to dockerhub? I have some experience running multi-arch builds from raw dockerfile (buildx build) or from docker compose definition (buildx bake), particularly using QEMU emulator on GitHub Actions CI
Each repo has its own CI script such as this:
https://github.com/pelias/api/blob/master/.github/workflows/push.yml#L42
Rather than repeat everything per-repo it just executes this script:
https://github.com/pelias/ci-tools/blob/master/build-docker-images.sh
Thanks, so each repo would need:
  steps:
    - uses: actions/checkout@v2
+   - uses: docker/setup-qemu-action@v1
+   - uses: docker/setup-buildx-action@v1
    - name: Build Docker images
    ...
and the script would need:
- docker build -t $tag .
- docker push $tag
+ docker buildx build --push --platform=linux/amd64,linux/arm64,linux/arm/v7 -t $tag .
Do you expect build failures in some repos? getting dependencies from official apt repositories will generally be available for arm64 (not so sure about arm v7 but probably also ok), and so adding this platform should work ootb 🤔
Is there a complete list of repos somewhere that would need a PR? 36 of them? You can also move the workflow itself to pelias/ci-tools and call it from the child repos (ref. reusable workflows).
Thanks @ddelange that was super helpful.
I tested that out with https://github.com/pelias/api/tree/arm-build and it worked just fine.
It is very likely there will be some build failures on some repos, so we'll have to start sorting through that now.
Awesome, let me know if I can provide any further support!
I would be happy to contribute and get this through: would you prefer a core contributor to take care of it, or should I just open 36 PRs? If so, would the PRs use the feature branch from ci-tools until further notice, like your PoC @orangejulius? Or what is your preferred order of things?
A second option is to pin to a specific commit of ci-tools. That way the PRs could be tested, but you'd be left having to do 36 PRs every time something changes in ci-tools, something I guess you wanted to avoid.
A third option: the PRs could also leave the link untouched (pointing to master), as my diff above is a non-breaking change. The potentially breaking change would then only come once the ci-tools feature branch merges into master.
A fourth option: temporarily point to the feature branch, test the PRs, and once green, point back to master and merge. It only goes live once the ci-tools feature branch merges into master.
I tend towards number 4 but I'm curious about your thoughts!
Opening 34 PRs under option 1 or 4. Probably, https://github.com/pelias/docker-baseimage/pull/26 needs to merge (and release?) before most of them can be tested.
There are some repos that don't use GitHub Actions; I won't touch them for now (they'll break once the ci-tools feature branch merges):
Hi @ddelange, we are planning to discuss this in a team meeting today, can you please hold off any more PRs until we chat about what we'd like to do.
Hi @missinglink,
These should be all (except 3, see comment above) -- feel free to close them if you want to approach it differently!
I didn't have time to look through them yet, were there any PRs where the CI failed?
I think the PRs need CI approval from an org member, potentially even for each commit I push depending on your org settings
Realised the CI is not being triggered on my PRs because they're coming from a fork, so it's not technically a push to the repo.
Adding the pull_request trigger on top of the push trigger will work. It should error for me in the pushing-to-dockerhub phase, because PRs coming from a fork won't have access to the base repo's GitHub secrets.
There is also the (dangerous) pull_request_target trigger, where you'd need to restrict the permissions carefully.
For PRs not from a fork, the push event will run in parallel to the pull_request triggered job, using double the minutes. To avoid that, a branch filter can be added to the push trigger so it only runs on master commits.
The triggering setup could be:
on:
  pull_request:
  push:
    branches: master
  # optional
  release:
    types: [released, prereleased] # triggers workflow using the release tag as ref
  workflow_dispatch: # allows running workflow manually from the Actions tab
edit: and I think it doesn't make sense for me to push this snippet, because I'm pretty sure the modified event triggers won't go into effect when coming from a fork :')
Hi @missinglink 👋
Any updates thus far?
Agh sorry I forgot to write back, the pull requests are a pain to deal with since there's so many repos, so we'd like to avoid having to do them multiple times.
As they stand they are targeting ci-tools/buildx (a branch of ci-tools) and I'm guessing we'd have to do them all again to switch them back to master?
I think this is the right direction to go but I'm still a little concerned that maybe one or two repos might end up being more difficult (likely anything to do with libpostal).
So what I think is a good solution is to merge a ci-tools PR first which contains the change but also has the ability to disable ARM support via env var.
That would allow us to roll it out across the board and also have the ability to disable it for any repos where there are issues.
It seems to make sense to have multi-arch builds enabled by default and optionally disabled via an env var.
Does that sound like a plan?
Looking at the code again it might make sense to have a default "platforms" string and then allow it to be overwritten by an env var as this gives a lot more granular control.
Question: is there any functional difference between docker build and docker buildx build --platform=linux/amd64?
the env var idea with multi-arch enabled by default sounds like a good plan! I recently took a similar approach here
I'm guessing we'd have to do them all again to switch them back to master
yeah, the rollout here is a bit iffy. I would tend to option 4 but maybe your env var suggestion opens the door to even more possibilities? 🤔
Question: is there any functional difference between docker build and docker buildx build --platform=linux/amd64?
Short answer: no! buildx does couple the build with pushing the multi-arch manifest (so separating build and push instead of build --push would be a pain; for that you'd need regctl I think, but I haven't tried), but here that's no problem.
In CI there is no need for separation of build and push so we can move forward on this path, where we migrate to buildx and include the required build dependencies without any negative impact.
Of the two options, being enabling multi-arch by default or multi-arch by config, I'm confident that multi-arch is the preferred option so would advocate it being enabled by default and disabled/adapted via config.
So the immediate next step is opening a PR on ci-tools which defines a variable with default --platform flags and makes that variable overridable via the :- bash convention.
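As a rough sketch of that convention: `${VAR:-default}` expands to the default when the variable is unset or empty, so a repo can narrow the platform list without any change to the shared script. DOCKER_PLATFORMS and build_command are hypothetical names, the default list is illustrative, and the function only prints the command instead of running it.

```shell
#!/bin/sh
# build_command prints the buildx invocation for an image tag.
# DOCKER_PLATFORMS is a hypothetical variable name; when it is unset or empty,
# the illustrative multi-arch default after ':-' is used instead.
build_command() {
  platforms="${DOCKER_PLATFORMS:-linux/amd64,linux/arm64}"
  echo "docker buildx build --push --platform=${platforms} -t $1 ."
}

# Default: multi-arch build.
( unset DOCKER_PLATFORMS; build_command pelias/api:master )

# Per-repo override, e.g. for an image that cannot build on ARM yet.
( DOCKER_PLATFORMS=linux/amd64; build_command pelias/libpostal-service:master )
```

The subshells keep each assignment from leaking into later calls. In CI the override would more likely come from the workflow's environment settings.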
Once that is merged to master we can go ahead and merge these PRs once they point to the master branch and we're done for now, with the option of testing and reconfiguring in the future with minimal effort
cc/ @orangejulius
I'm off to bed now but I can open that PR tomorrow
Hi @missinglink 👋 was there any decision about the order in which to roll this out?
I think it doesn't make sense for me to push this snippet, because I'm pretty sure the modified event triggers won't go into effect when coming from a fork :')
Does it make sense to keep all my PRs open?
Or should I close them and leave it up to the maintainers when and in which order to get this done?
Hey, is there any plan to enable multiarch soon? I'm trying to decide whether I should build the images myself, or can I help somehow to get the multiarch build merged?
ARM support would be greatly appreciated!
Hi all,
I had a look over the outstanding ARM tickets today and there's a path forward with https://github.com/pelias/ci-tools/pull/11 but also a significant amount of testing and possibly some dev work involved to fully support ARM and have it stable enough to use in a production environment.
It's unfortunately not as simple as just using buildx, as the docker images contain a bunch of different tools and software out of our control which all need to be multiarch capable and compiled accordingly.
Due to the amount of time required to set up and maintain ARM builds, I'm sorry to say Julian & I likely won't be able to take this on any time soon. I recently got a new M2 Mac so I can slowly chip away at these issues.
If you're in a position to sponsor some developer time to work on this, please email us hello@geocode.earth
Hi all, I merged https://github.com/pelias/ci-tools/pull/11 today, this is the first step to getting multi-arch builds working.
From now on, when PRs are merged in the Pelias repos it will produce a multiarch docker image which will run on both amd64 and arm64 :tada:
The exceptions are the following repos, which I believe are still not capable of running on ARM because they depend on libpostal and there are downstream issues blocking it. Once those are resolved we should be able to get full ARM support.
Nice! And what about all my closed PRs linked above? Mainly the
- uses: docker/setup-qemu-action@v1
- uses: docker/setup-buildx-action@v1
addition which makes docker buildx build --platform work?
@ddelange those don't seem to be required, we handle it all here: https://github.com/pelias/ci-tools/blob/master/build-docker-images.sh#L61-L68
I've updated the pelias/baseimage to be multiarch, so as of today any derivative images can also be multiarch.
As a test I've managed to produce multiarch builds of both pelias/elasticsearch and pelias/polylines.
The latter required a conditional statement in the Dockerfile to detect which file to download, an example of this can be found in https://github.com/pelias/polylines/pull/273
I think the final hurdle is going to be libpostal, there was some progress in supporting the Mac M1 chipset but it caused a regression and IIRC was removed.
It might be as simple as detecting the architecture and using the --disable-sse2 build flag.
For libpostal that flag should indeed work; we are producing (OpenBLAS accelerated) x86_64/arm64 images for libpostal here using TARGETARCH, which is exposed by default by buildx.
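A minimal sketch of that detection, assuming the Dockerfile declares `ARG TARGETARCH` and sets a DATADIR variable; configure_flags is a hypothetical helper, and the data path in the usage lines is a placeholder. It uses `case` rather than `[ "$TARGETARCH" == "arm64" ]`, since `==` inside `[` is a bash extension that is not guaranteed to work when the RUN step executes under /bin/sh.

```shell
#!/bin/sh
# configure_flags prints the ./configure arguments for a target architecture.
# SSE2 is an x86 instruction set extension, so it is disabled for ARM targets.
configure_flags() {
  case "$1" in
    arm64|arm) echo "--datadir=$2 --disable-sse2" ;;
    *)         echo "--datadir=$2" ;;
  esac
}

# Inside a Dockerfile RUN step this could be used roughly as:
#   ./bootstrap.sh && ./configure $(configure_flags "$TARGETARCH" "$DATADIR") && make
# (the unquoted substitution relies on DATADIR containing no spaces)
configure_flags arm64 /usr/share/libpostal   # --datadir=/usr/share/libpostal --disable-sse2
configure_flags amd64 /usr/share/libpostal   # --datadir=/usr/share/libpostal
```

Computing the flags once also avoids the `A && B || C` pattern, where a failing ARM configure would silently fall through to the x86 variant.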
@ddelange those don't seem to be required, we handle it all here: https://github.com/pelias/ci-tools/blob/master/build-docker-images.sh#L61-L68
that's good to know, seems github is now bundling qemu and buildx in the default runner. nice!
I just published pelias/elasticsearch:7.17.15 which supports multiarch.
We can optionally rebuild older versions for multiarch if requested in the future.
I managed to complete a portland-metro build on my Macbook M2 this morning with a couple of small changes to the docker-compose.yml file 🎉
It seems that Docker is now able to emulate AMD64 on ARM64 using QEMU (not Rosetta unfortunately), so a bunch of the images seem to 'just work', although I bet they are much slower under emulation.
In order to fully support ARM64 (for Graviton servers, Raspberry Pi etc.) we will need native ARM64 support for all the images. Some just need to be rebuilt in CI, others need some adjustments to work.
For reference, this command is handy, after pulling down the latest images (to grab the multiarch versions I've been building) you can run this to list what architecture docker pulled down for you:
docker image inspect --format "{{.Architecture}} {{.RepoTags}}" $(docker image ls -q -f 'reference=pelias/*')
If you're on an ARM64 machine you should see some of them prefixed with arm64.
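To spot the images that are still running under emulation, the output of that command can be filtered. This is only a sketch: non_native_images is a hypothetical helper, and the amd64 sample line is made up to show the "<arch> [<tags>]" shape.

```shell
#!/bin/sh
# non_native_images reads "<arch> [<tags>]" lines on stdin and prints only
# those whose architecture differs from the one given as the first argument.
non_native_images() {
  awk -v want="$1" '$1 != want { print }'
}

# With Docker available, the inspect command above could be piped in:
#   docker image inspect --format "{{.Architecture}} {{.RepoTags}}" \
#     $(docker image ls -q -f 'reference=pelias/*') | non_native_images arm64

# Sample input; the amd64 entry is invented for illustration.
printf '%s\n' \
  'arm64 [pelias/api:master]' \
  'amd64 [example/emulated:latest]' |
  non_native_images arm64
# prints: amd64 [example/emulated:latest]
```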
I also wrote a one-liner to check out available architectures and sizes on the registry before pull: https://stackoverflow.com/a/73108928/5511061
@ddelange I'm trying the workflow you have to disable SSE as per your Dockerfile but it results in an error, maybe a more recent commit of libpostal broke it?
Does the master branch of libpostal work for your build?
#17 140.4 gcc: error: unrecognized command-line option ‘-mfpmath=sse’
#17 140.4 gcc: error: unrecognized command-line option ‘-mfpmath=sse’
#17 140.4 gcc: error: unrecognized command-line option ‘-msse2’
#17 140.4 gcc: error: unrecognized command-line option ‘-msse2’
#17 140.4 make[2]: *** [Makefile:3956: libpostal-strndup.o] Error 1
#17 140.4 make[2]: *** Waiting for unfinished jobs....
#17 140.4 make[2]: *** [Makefile:3970: libpostal-main.o] Error 1
#17 140.4 gcc: error: unrecognized command-line option ‘-mfpmath=sse’
#17 140.4 gcc: error: unrecognized command-line option ‘-mfpmath=sse’
#17 140.4 gcc: error: unrecognized command-line option ‘-msse2’
#17 140.4 gcc: error: unrecognized command-line option ‘-msse2’
#17 140.4 make[2]: *** [Makefile:3998: libpostal-file_utils.o] Error 1
#17 140.4 make[2]: *** [Makefile:3984: libpostal-json_encode.o] Error 1
#17 140.4 make[2]: Leaving directory '/code/libpostal/src'
#17 140.4 make[1]: *** [Makefile:464: all-recursive] Error 1
#17 140.4 make[1]: Leaving directory '/code/libpostal'
#17 140.4 make: *** [Makefile:373: all] Error 2
#17 ERROR: process "/dev/.buildkit_qemu_emulator /bin/sh -c ./bootstrap.sh && ([ \"$TARGETARCH\" == \"arm64\" ] && ./configure --datadir=\"$DATADIR\" --disable-sse2 || ./configure --datadir=\"$DATADIR\") && make -j4 && make install && ldconfig" did not complete successfully: exit code: 2
All green on our side... fwiw running on an Ubuntu jammy image with the latest build-essential installed.
ok cool, it took a couple of days dev work but now all images are available on ARM 🎉
I'm going to close this PR, please open individual bug reports if you have any issues.
arm64 [pelias/fuzzy-tester:master]
arm64 [pelias/transit:master]
arm64 [pelias/pip-service:master]
arm64 [pelias/placeholder:master]
arm64 [pelias/csv-importer:master]
arm64 [pelias/whosonfirst:master]
arm64 [pelias/elasticsearch:7.16.1]
arm64 [pelias/openaddresses:master]
arm64 [pelias/openstreetmap:master]
arm64 [pelias/api:master]
arm64 [pelias/interpolation:master]
arm64 [pelias/libpostal-service:latest]
arm64 [pelias/schema:master]
arm64 [pelias/elasticsearch:7.17.15]
arm64 [pelias/polylines:master]
awesome work! :boom:
Use-cases
I would like to run Pelias on a Raspberry Pi, but that would require ARM support. It doesn't look like all of the docker images support arm64. :/
Attempted Solutions
Proposal
Add ARM builds for all Docker images. :)
References