sclorg / s2i-base-container

OpenShift base images
Apache License 2.0
[Discuss] Common s2i-base-npm layer for builder images (opinions welcome) #115

Closed hhorak closed 7 years ago

hhorak commented 7 years ago

I'd like to discuss, in a somewhat wider context, a request to include npm in the python container image:

https://github.com/sclorg/s2i-python-container/pull/184

Currently we have one s2i-base image (~365MB) that includes:

and it is used by languages (php, python, nodejs, ruby, perl) and webstack (passenger, varnish, nginx, httpd) images where we want to put some files to the image, and we plan to do similarly for databases (to allow extending images using a standard way).

The devel packages make sense for the language images (to be able to compile binary extensions), but the webstack images IMO don't need them and carry them needlessly.

With the npm request we are talking about an additional 40+MB. I think not only python, but also ruby and php developers like to install front-end JavaScript using npm, so having npm in the s2i base would mean sharing those 40MB in a common layer. On the other hand, there might be images where npm is not necessary (e.g. perl's CPAN does not include many packages with npm install calls, at least in the docs).

In the end we might benefit from more s2i-base layers, to use space better and avoid installing unnecessary packages into containers. It's worth mentioning that the size of the images, and how large a part is shared across images, matters in the container world, since in a large-scale environment we can easily save many GB of network transfer.

So, one concrete proposal with approx size of images is this:

| image name | size | content | used by images |
|---|---|---|---|
| s2i-core | ~183MB | rhel7 base + s2i settings | webstack, dbs |
| s2i-base | ~366MB | s2i-core + devel libraries | perl |
| s2i-base-npm | ~423MB | s2i-base + npm | python, php, ruby |

If we don't find a use case for having s2i base variant with devel libraries but without npm, we might end up with just one new layer:

| image name | size | content | used by images |
|---|---|---|---|
| s2i-core | ~183MB | rhel7 base + s2i settings | webstack, dbs |
| s2i-base | ~423MB | s2i-core + libraries + npm | perl, python, php, ruby |
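
One rough way to compare the two layouts is to add up how many MB a node has to pull for a given set of base images, counting shared layers only once. The sketch below uses the approximate sizes from the two proposals above. The function and dictionary names are hypothetical, and it ignores whatever each language image adds on top of its base, so treat the numbers as an illustration rather than a measurement.

```python
# Approximate cumulative image sizes in MB, taken from the two proposals above.
THREE_LAYER = {"s2i-core": 183, "s2i-base": 366, "s2i-base-npm": 423}
TWO_LAYER = {"s2i-core": 183, "s2i-base": 423}

# Parent of each base image (None means it sits directly on the rhel7 base).
THREE_PARENTS = {"s2i-core": None, "s2i-base": "s2i-core", "s2i-base-npm": "s2i-base"}
TWO_PARENTS = {"s2i-core": None, "s2i-base": "s2i-core"}


def pulled_mb(sizes, parents, bases_in_use):
    """Return the MB a node downloads for the given base images.

    Every image in a parent chain is counted once, and only for the
    part it adds on top of its parent (hypothetical helper).
    """
    needed = set()
    for name in bases_in_use:
        # Walk up the chain so that all ancestors are included.
        while name is not None:
            needed.add(name)
            name = parents[name]
    total = 0
    for name in needed:
        parent = parents[name]
        total += sizes[name] - (sizes[parent] if parent else 0)
    return total


perl_only_three = pulled_mb(THREE_LAYER, THREE_PARENTS, ["s2i-base"])
perl_only_two = pulled_mb(TWO_LAYER, TWO_PARENTS, ["s2i-base"])
mixed_three = pulled_mb(THREE_LAYER, THREE_PARENTS, ["s2i-base", "s2i-base-npm", "s2i-core"])
mixed_two = pulled_mb(TWO_LAYER, TWO_PARENTS, ["s2i-base", "s2i-core"])
```

Under these assumptions, a node that only runs perl pulls about 366MB with three layers and about 423MB with two, while a node that also runs python, php or ruby pulls about 423MB either way. The extra layer therefore only pays off on nodes that never need npm.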

We also need to account for the additional maintenance cost of having more than one variant of the s2i-base image, but with the current tooling that is not a big issue IMO.

The version of npm (and nodejs) inside s2i-base-npm would be updated over time, so we would not provide any guarantees for it, and only npm would be supported there, not the rest of the nodejs packages.

A slightly more complex solution for this npm issue would be to use non-builder images for runtime and s2i-base-npm only for building, as documented at https://github.com/openshift/source-to-image/blob/master/docs/runtime_image.md. That would probably mean having two variants of the image for every ecosystem (e.g. for python, one with npm+pip and a second with the runtime only). Although this might use space best, I'd say it would make things even more complicated.

hhorak commented 7 years ago

@bparees @jsvgoncalves @GrahamDumpleton FYI. If you happen to have any experience with breaking down container images into more layers like this, I'd be glad if you could share it.

jsvgoncalves commented 7 years ago

@hhorak So we are currently discussing if we want to have npm on s2i-base or on a new layer, s2i-base-npm? I personally don't have experience with perl for web development, I imagine it's also likely that they can use npm, though.

I would probably avoid using the non-builder images, as that does indeed look like it would complicate things more.

torsava commented 7 years ago

@jsvgoncalves A comment by Petr Pisar from a downstream discussion:

There are only 28 "npm install" strings on CPAN http://grep.cpan.me/?q=npm+install and most of them are documentation, not code. There is also a Perl implementation of an Artifactory client http://search.cpan.org/~syagi/Artifactory-Client/ (not used by anything on CPAN), and a simple npm wrapper for uploading tarballs to the NPM registry http://search.cpan.org/~nplatonov/Dist-Zilla-Plugin-Web-0.0.10/lib/Dist/Zilla/Plugin/Web/NPM/Publish.pm.

So I think my guess about Perl being unrelated to NPM is quite correct.

So for the Perl image an additional npm layer would probably be mostly useless. However, having 3 layers instead of 2 will incur a higher maintenance cost.

jsvgoncalves commented 7 years ago

@torsava let me just point out that the same query on PyPI returns results of the same order of magnitude (https://pypi.python.org/pypi?%3Aaction=search&term=%22npm+install%22). So I don't see the relevance of such a query, or you would probably end up with the same conclusion for all other languages.

Again, my RFC concerns web apps that our clients will potentially develop, which have more complex ecosystems. As opposed to modules/packages developed for the specific language, as those will obviously fall into their own package distribution mechanisms and their deployment on OpenShift is something I have some difficulties picturing as a standard use case.

torsava commented 7 years ago

@jsvgoncalves Your PyPI query only searches through the name, summary, keywords and description of packages, whereas http://grep.cpan.me/ greps actual code. Are you saying that, in your opinion, even the Perl s2i image would benefit from the shared npm layer?

jsvgoncalves commented 7 years ago

@torsava my point was that looking for occurrences of npm install in the packages is totally outside the scope of the problem at hand. I advocate including npm in the images that are used for web development, and perl has its own share of that too. I don't have concrete numbers on anything relating to the images from sclorg, though.

hhorak commented 7 years ago

@ppisar What do you think, would you be against having npm in the perl image even though it might not be used in many cases?

ppisar commented 7 years ago

I'm not against. In my opinion, the s2i-base is already so big that adding another 16 per cent does not matter.
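
The 16 per cent figure can be checked against the sizes quoted in the proposal (~366MB for s2i-base without npm, ~423MB with it). A minimal check, assuming those two approximate numbers and using hypothetical variable names:

```python
# Approximate sizes in MB quoted earlier in the thread.
base_mb = 366       # s2i-core + devel libraries
with_npm_mb = 423   # the same image with npm added

# Relative growth of the image caused by adding npm, in per cent.
growth_pct = (with_npm_mb - base_mb) / base_mb * 100
print(round(growth_pct))  # prints 16
```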

hhorak commented 7 years ago

So, it looks like we might be fine with having just the s2i base variant with devel libraries and npm, plus the s2i-core that would only include the s2i settings:

| image name | size | content | used by images |
|---|---|---|---|
| s2i-core | ~183MB | rhel7 base + s2i settings | webstack, databases |
| s2i-base | ~423MB | s2i-core + libraries + npm | perl, python, php, ruby, nodejs |

andrewklau commented 7 years ago

Was there any feedback about having a builder and runtime image?

Having a single builder image would mean the runtime images could become significantly smaller which is very useful considering that the runtime to builder ratio should be high.

I believe Travis has a setup similar to a base builder image which handles most use cases, this would really promote the use case for dedicated build clusters.

A good example would be Golang, if it were ever supported: pretty much all of its build components can be stripped from the runtime image.

torsava commented 7 years ago

@andrewklau So far we're only supporting interpreted languages, where I'd say the builder/runtime size doesn't vary significantly in most cases. Until Golang or similar is added, the cost/benefit ratio does not appear favourable to me.

omron93 commented 7 years ago

PR to implement s2i-core and s2i-base images - https://github.com/sclorg/s2i-base-container/pull/121

pkubatrh commented 7 years ago

Since #121 has been merged already I think this can be closed.