pelias / openaddresses

Pelias import pipeline for OpenAddresses.
MIT License

Improve Dockerfile to reduce image size #514

Closed tdurieux closed 1 year ago

tdurieux commented 1 year ago

Hi there,

I've made a small improvement to the Dockerfile that I think could help optimize the image size.

Summary of the changes:

Impact on the image size:

I hope you will find these changes useful. Let me know if you have any questions or concerns.

Thanks,

missinglink commented 1 year ago

Thanks!

orangejulius commented 1 year ago

Hi @tdurieux, thanks for this nice little change. Do you know whether there are configuration values that tell apt never to install recommended packages, and likewise tell npm not to use the cache at all?

If so, we can use those settings in our base Dockerfile, and all of Pelias will benefit from it. Otherwise we might want to consider including these changes in most of our other Dockerfiles.

tdurieux commented 1 year ago

Hi @orangejulius,

As far as I know, it is not possible to configure npm and apt to behave this way. We always have to remove the cache manually and ask apt not to install the optional packages, just as we also have to manually call rm -rf /var/lib/apt/lists/*.

It is unfortunate. I would be happy to look at your other Dockerfiles and see if we can reduce their size too.

missinglink commented 1 year ago

Looks like we may be able to add an /etc/apt/apt.conf file to the baseimage 🤔 https://askubuntu.com/a/179089
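As a rough illustration of that idea, apt reads drop-in files from /etc/apt/apt.conf.d, and the `APT::Install-Recommends` and `APT::Install-Suggests` options control whether recommended and suggested packages are installed. The sketch below is only an assumption about how a base image could set this up: the helper name `write_no_recommends_conf` and the file name `99no-recommends` are hypothetical, not something the Pelias base image is known to use.

```shell
#!/bin/sh
# Hypothetical helper: write an apt drop-in that disables recommended and
# suggested packages for every later `apt-get install` in the image.
write_no_recommends_conf() {
  conf_dir="$1"   # e.g. /etc/apt/apt.conf.d inside the base image
  mkdir -p "$conf_dir"
  printf '%s\n' \
    'APT::Install-Recommends "false";' \
    'APT::Install-Suggests "false";' \
    > "$conf_dir/99no-recommends"   # file name is an assumption
}

# In a base image this would be called as root, for example:
#   write_no_recommends_conf /etc/apt/apt.conf.d
```

With such a file in place, derived images would not need to pass --no-install-recommends on each install command, although /var/lib/apt/lists would still have to be removed explicitly.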

missinglink commented 1 year ago

I'm still not clear on what exactly 'recommends' are.

In the context of a fresh OS install I can imagine some things being recommended to build a functional distro, such as the terminal GUI app. But when installing specific packages, I don't really understand what 'recommends' are. Is that stuff like man pages, or something else?

missinglink commented 1 year ago

Recommends: This declares a strong, but not absolute, dependency.

The Recommends field should list packages that would be found together with this one in all but unusual installations.

It seems that marking a dependency as 'recommends' is a choice made by the packager, so the effect varies between packages. In some cases, skipping the recommended packages might cause the binaries to lack certain features.

https://unix.stackexchange.com/a/77076
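One way to see what a given package recommends is to read the Recommends field from its metadata, which `apt-cache show` prints on Debian-based systems. The following is a minimal sketch under that assumption; `list_recommends` is a hypothetical helper and `<package>` is a placeholder, since the thread does not name the packages involved.

```shell
#!/bin/sh
# Hypothetical helper: read package metadata on stdin and print each entry
# of its "Recommends:" field on a separate line. Prints nothing when the
# package declares no recommended packages.
list_recommends() {
  sed -n 's/^Recommends: *//p' | tr ',' '\n' | sed 's/^ *//; s/ *$//'
}

# Usage on a Debian-based image:
#   apt-cache show <package> | list_recommends
```

If the helper prints nothing for every package being installed, then --no-install-recommends would not change what apt installs for them.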

missinglink commented 1 year ago

@tdurieux do you happen to know how much disk space each of the two changes saved? I'm curious how much difference the apt change made exactly.

missinglink commented 1 year ago

I'm guessing that, since the two packages don't have any 'recommends', this is a no-op 🤔

[Screenshot, 2023-01-11 at 15:51:03]

tdurieux commented 1 year ago

I did not check; I will do that tomorrow. I will also try to identify which packages are not installed.

Since openaddresses has a "lot" of node dependencies, it is possible that the saving comes only from npm cache clean -f. By default, npm first stores the packages and their dependencies in the local npm cache folder (.npm) and then copies them to the node_modules folder. That cache is not useful in a Docker image.

I realized that we could also use npm install --omit=dev to skip the devDependencies and save a bit more space. I will do some tests tomorrow.
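The two npm steps discussed here could be combined roughly as follows. This is a sketch, not the change made in the pull request: `install_production_deps` is a hypothetical helper name, `--omit=dev` assumes npm 7 or later, and `--force` is the long form of the `-f` flag mentioned above.

```shell
#!/bin/sh
# Hypothetical helper for a single Dockerfile RUN step.
# 1. Install only production dependencies (skips devDependencies).
# 2. Clear npm's download cache so it is not stored in the image layer.
install_production_deps() {
  npm install --omit=dev && npm cache clean --force
}

# Usage, from the directory that contains package.json:
#   install_production_deps
```

Both commands need to run in the same image layer for the cache removal to reduce the image size, because files deleted in a later layer still occupy space in the earlier one.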