pelias / polylines

Pelias import pipeline for polyline (road network) data.
MIT License

Add explicit `gcc` dependency in Dockerfile #262

Closed by orangejulius 2 years ago

orangejulius commented 2 years ago

In https://github.com/pelias/docker-baseimage/pull/23 we're slimming down the Docker baseimage used by all other Pelias images, and hopefully can remove the compiler toolchain altogether.

The Polylines Docker images do need gcc (though not quite a full compiler toolchain), but they didn't follow the convention in our other Dockerfiles of having an apt-get step to install it.

This PR adds such a step, and is a little clever in installing gcc only temporarily, and only for the go get step that requires it.

The polylines Docker image is already quite large (950MB uncompressed) since it includes Node.js, Go, and package dependencies for both. Skipping the installation of gcc cuts out 120MB of that.

Until https://github.com/pelias/docker-baseimage/pull/23 is merged, this change won't really have any impact on the size or operation of this Docker image.

missinglink commented 2 years ago

This should be fine, although I'm not convinced the cleanup is removing everything that was installed.

I believe `apt remove` leaves behind some files that `--purge` would delete, maybe only config files? I think the apt cache is also not being removed.

My preference for this sort of thing would be a multi-stage docker build where the binary is generated in an isolated stage and then just copied into the next stage.

eg: https://github.com/missinglink/pbf/blob/master/Dockerfile#L14

missinglink commented 2 years ago

We could also drop the Go dependency in the final Docker image, since the Go runtime is compiled into the binary?

orangejulius commented 2 years ago

Oh yeah, a multistage build would work really well here. I'll take a look at that :)

missinglink commented 2 years ago

I don't remember exactly what requires CGO in pbf but it's probably the SQLite binding, LevelDB binding etc.

In that case you'll need to have the headers installed at compile time, so something like libsqlite, sqlite-dev or whatever it's called on Ubuntu.

They are dynamically linked by default, so the runtime image will need a compatible version of the shared library (.so on Linux, .dylib on Mac). These often have a similarly named apt package, which can be much smaller since it doesn't include the headers and other development files.

missinglink commented 2 years ago

If you have any issues with shared libs you can run ldd on the compiled binary to list all dynamically linked libs.

orangejulius commented 2 years ago

Cool, I think shared libraries will be fine. I confirmed by looking at htop that it is something sqlite3 related that's building with gcc, and I don't think we use that functionality in the polylines importer. I'll test out a pelias prepare polylines but it should work, right?

missinglink commented 2 years ago

I think it'll be fine since the baseimage will have libsqlite.so in it already anyway.

I'm actually not sure off the top of my head what would happen if that file didn't exist in the runtime docker image.

I'd assume it wouldn't even start the binary rather than erroring only on SQLite related functionality.
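For context on that assumption: on Linux, the libraries a dynamically linked binary needs are recorded in its ELF dynamic section, and the loader resolves them before the program's own code runs, so a missing one normally prevents startup entirely. A minimal Go sketch for listing those entries with the standard `debug/elf` package follows; the helper name `neededLibs` is hypothetical and this is only an illustration, not something used in the polylines image.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// neededLibs (hypothetical helper) returns the shared libraries an ELF binary
// declares as required, i.e. what the dynamic loader must find at startup.
func neededLibs(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: needed <path-to-elf-binary>")
		os.Exit(2)
	}
	libs, err := neededLibs(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read ELF file:", err)
		os.Exit(1)
	}
	// An empty list suggests the binary is statically linked.
	for _, lib := range libs {
		fmt.Println(lib)
	}
}
```

Unlike `ldd`, this only reads the file and does not try to resolve the libraries, so it can be run in the build stage to see what the runtime stage will need.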

orangejulius commented 2 years ago

Yup, it all worked out fine: https://github.com/pelias/polylines/pull/263