jupyterhub / repo2docker

Turn repositories into Jupyter-enabled Docker images
https://repo2docker.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Explore CNCF v3 buildpacks #720

Open yuvipanda opened 5 years ago

yuvipanda commented 5 years ago

From @jchesterpivotal in https://github.com/jupyter/repo2docker/issues/707#issuecomment-505904267

By way of warning, what follows is hilariously biased: I've worked on two generations of buildpack technology, on and off, over the past 5 years. Pride makes me defensive.

As it was related to me by a Red Hatter I asked, s2i was created largely because the previous generations of buildpack lifecycles from Heroku (v2a) and Cloud Foundry (v2b) were optimised to a rootfs+tarball target (Heroku's term is "slug", Cloud Foundry's is "droplet"). That was considered unsuitable for OpenShift v3, which was an image-centric architecture.

Whereas Heroku and Cloud Foundry would meet you at code and hide the underlying container infrastructure, OpenShift would meet you at the image, so the latter (this is a personal opinion) had a business need for something like buildpacks to close the convenience gap.

But s2i never really found a home outside of OpenShift, while buildpacks have flourished in two massive, independent but genetically-related ecosystems.

Critically, the emergence of the v2 registry API enables features (particularly layer rebasing) that were previously impossible. In addition, Google's Container Tools team developed and maintains the google/go-containerregistry library, which allows us to perform construction and rebasing operations with or without the Docker daemon. The design of CNBs takes full advantage of both of these advances.
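To make "layer rebasing" concrete, here is a toy sketch of the idea, assuming an image can be modelled as an ordered list of layer digests with the base image's layers first. It is not the go-containerregistry API; the `Image` class, the `rebase` function and the digests are hypothetical placeholders for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Image:
    """Toy image: ordered layer digests, base layers first (hypothetical model)."""
    base_layers: Tuple[str, ...]
    app_layers: Tuple[str, ...]

    @property
    def layers(self) -> Tuple[str, ...]:
        return self.base_layers + self.app_layers


def rebase(image: Image, new_base_layers: Tuple[str, ...]) -> Image:
    """Swap the base layers and keep the app layers untouched.

    No build step runs here; only the list of layer references changes.
    """
    return Image(base_layers=new_base_layers, app_layers=image.app_layers)


# Placeholder digests, not real ones.
old = Image(
    base_layers=("sha256:os-v1",),
    app_layers=("sha256:conda-env", "sha256:app-code"),
)
patched = rebase(old, ("sha256:os-v2",))
print(patched.layers)
```

Because nothing is rebuilt, the cost is closer to editing metadata than to running a build, which is the kind of speed-up described in the next paragraph. The sketch assumes the new base layers are compatible with the existing app layers.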

By way of speed improvements: We have observed some Java rebuilds drop from minutes to milliseconds. We expect large-cluster rollouts to drop from dozens of hours to potentially minutes.

Edit: I should add, your reasons for moving off s2i would apply to v2a and v2b buildpack lifecycles as well. One of the motivating problems faced by both Pivotal and Heroku has been exactly this sort of combinatorial explosion; CNBs are designed to make it possible to more easily compose buildpacks developed independently of one another.

I've bolded the bits that I think are most relevant to us. It would be great if someone could take a look at https://buildpacks.io to see if we can base repo2docker off v3 of buildpacks. http://words.yuvi.in/post/why-not-s2i/ contains reasons why we moved off s2i (which is similar to v2 of buildpacks).

A useful test case would be to try to make:

  1. A buildpack for environment.yml
  2. A buildpack for install.R
  3. A buildpack for postBuild
  4. A buildpack for apt.txt

And then see how easy / hard it is to have a repo with any combination of these 4 files produce one single image. My rudimentary math skills tell me that there are 2^4 - 1 = 15 possible non-empty combinations here (each file is either present or absent), and we shouldn't have to write more than 4 buildpacks...
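A quick way to check the count is to enumerate the subsets. The sketch below lists every non-empty subset of the four files; the helper name `non_empty_subsets` is made up for this example. For comparison, 4! = 24 would count the orderings of all four files rather than which files are present.

```python
from itertools import combinations

CONFIG_FILES = ["environment.yml", "install.R", "postBuild", "apt.txt"]


def non_empty_subsets(files):
    """Return every non-empty subset of `files` as a list of tuples."""
    return [
        subset
        for size in range(1, len(files) + 1)
        for subset in combinations(files, size)
    ]


subsets = non_empty_subsets(CONFIG_FILES)
print(len(subsets))  # 15
```

Each of these 15 subsets is a repository layout that the same four buildpacks would need to handle without any per-combination code.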

betatim commented 5 years ago

Is there a list somewhere of existing buildpacks and their implementation? I had a quick look at https://buildpacks.io/docs/ but couldn't find that list.

yuvipanda commented 5 years ago

From buildpacks.slack.com I found https://github.com/cloudfoundry/conda-cnb

consideRatio commented 5 years ago

Wow this was exciting stuff @yuvipanda ! :heart: :tada:

zmackie commented 5 years ago

https://github.com/buildpack/pack is probably the best way to get started playing around locally. (Disclosure: I'm the "anchor" (tech lead) on the Pivotal buildpacks team.) CNBs are, for now, generally consumed through a builder image (what that is, is not especially important for the purpose of this comment). You can see the "blessed" builder images by running:

# pack suggest-builders
Suggested builders:
    Cloud Foundry:     cloudfoundry/cnb:bionic         Ubuntu bionic base image with buildpacks for Java, NodeJS and Golang
    Cloud Foundry:     cloudfoundry/cnb:cflinuxfs3     cflinuxfs3 base image with buildpacks for Java, NodeJS, Python, Golang, PHP, HTTPD and NGINX
    Heroku:            heroku/buildpacks:18            heroku-18 base image with buildpacks for Ruby, Java, Node.js, Python, Golang, & PHP

If you inspect our cflinuxfs3 builder image, for example, you can get an idea of the CNBs that we've published that are "ready", although we have a bunch more in flight:

# pack inspect-builder cloudfoundry/cnb:cflinuxfs3
Inspecting builder: cloudfoundry/cnb:cflinuxfs3

Remote
------

Description: cflinuxfs3 base image with buildpacks for Java, NodeJS, Python, Golang, PHP, HTTPD and NGINX

Stack: org.cloudfoundry.stacks.cflinuxfs3

Lifecycle Version: 0.4.0

Run Images:
  cloudfoundry/run:full-cnb

Buildpacks:
  ID                                                VERSION           LATEST
  org.cloudfoundry.node-engine                      0.0.46            true
  org.cloudfoundry.npm                              0.0.29            true
  org.cloudfoundry.yarn                             0.0.24            true
  org.cloudfoundry.python                           0.0.20            true
  org.cloudfoundry.pip                              0.0.20            true
  org.cloudfoundry.pipenv                           0.0.14            true
  org.cloudfoundry.conda                            0.0.13            true
  org.cloudfoundry.go-compiler                      0.0.20            true
  org.cloudfoundry.go-mod                           0.0.18            true
  org.cloudfoundry.dep                              0.0.17            true
  org.cloudfoundry.php-dist                         0.0.28            true
  org.cloudfoundry.php-composer                     0.0.16            true
  org.cloudfoundry.httpd                            0.0.18            true
  org.cloudfoundry.nginx                            0.0.20            true
  org.cloudfoundry.php-web                          0.0.19            true
  org.cloudfoundry.openjdk                          1.0.0-RC03        true
  org.cloudfoundry.buildsystem                      1.0.0-RC03        true
  org.cloudfoundry.jvmapplication                   1.0.0-RC03        true
  org.cloudfoundry.azureapplicationinsights         1.0.0-RC03        true
  org.cloudfoundry.debug                            1.0.0-RC03        true
  org.cloudfoundry.googlestackdriver                1.0.0-RC03        true
  org.cloudfoundry.jmx                              1.0.0-RC03        true
  org.cloudfoundry.procfile                         1.0.0-RC03        true
  org.cloudfoundry.dotnet-core-conf                 0.0.20            true
  org.cloudfoundry.archiveexpanding                 1.0.0-RC03        true
  org.cloudfoundry.tomcat                           1.0.0-RC03        true
  org.cloudfoundry.jdbc                             1.0.0-RC03        true
  org.cloudfoundry.springautoreconfiguration        1.0.0-RC03        true
  org.cloudfoundry.springboot                       1.0.0-RC03        true
  org.cloudfoundry.distzip                          1.0.0-RC03        true

zmackie commented 5 years ago

I should further add that our slack https://slack.buildpacks.io/ is a great place to chat about all things buildpacks and https://hub.docker.com/r/cloudfoundry/cnb is a decent overview with links to a bunch of the CNBs.

zmackie commented 5 years ago

I'll dig in a bit more because I think this is a really cool project:

A buildpack for environment.yml

https://github.com/cloudfoundry/conda-cnb

A buildpack for install.R

We have not written an R CNB, but it's somewhere in our backlog. A Julia CNB is slightly higher in priority.

A buildpack for postBuild

We don't have this written, but I don't think it would be massively hard to implement.
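As a rough illustration of how small the detection half of such a buildpack could be, here is a sketch in Python. It assumes the detect-phase convention, as I understand the buildpack spec, that exit code 0 means "pass" and 100 means "fail", and that detection runs with the repository as the working directory. The `detect` function is hypothetical, and it only looks for `postBuild` in the repository root.

```python
import os
import tempfile

# Detect-phase exit codes, as I understand the buildpack spec (assumption).
DETECT_PASS = 0
DETECT_FAIL = 100


def detect(app_dir: str) -> int:
    """Pass detection only when the repository ships a postBuild file."""
    if os.path.isfile(os.path.join(app_dir, "postBuild")):
        return DETECT_PASS
    return DETECT_FAIL


# Exercise the function against a throwaway directory.
with tempfile.TemporaryDirectory() as repo:
    without_file = detect(repo)
    open(os.path.join(repo, "postBuild"), "w").close()
    with_file = detect(repo)

print(without_file, with_file)  # 100 0
```

A real `bin/detect` would end with something like `sys.exit(detect(os.getcwd()))`, and the build phase would then run the script; neither is shown here.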

A buildpack for apt.txt

This will possibly be handled by https://github.com/buildpack/rfcs/pull/23

ltalirz commented 4 years ago

In case it's useful, here is a link to a recent post on the topic by Jose Diaz-Gonzalez, the lead developer of dokku, including some notes on how CNB tech differs between cloudfoundry, heroku and herokuish: https://dokku.github.io/technology/comparing-buildpack-v3-to-herokuish

manics commented 4 years ago

Is this something that could be used for new repo2docker buildpacks, e.g. Java? https://github.com/jupyter/repo2docker/issues/780

samj1912 commented 3 years ago

Hello :wave: I am a maintainer on the Cloud Native Buildpacks project. Happy to answer questions and provide support if y'all decide to go forward with CNB support :)

Here are some links that might be useful in this context. You can also write CNBs in Python (if that reduces the integration barrier here). Some example buildpacks written in Python:

https://github.com/samj1912/proc-descriptor-buildpack/blob/main/main.py
https://github.com/samj1912/runtime-env-descriptor-buildpack/blob/main/main.py
https://github.com/samj1912/label-descriptor-buildpack/blob/main/main.py

You can take these small pieces and compose them into a "meta-buildpack" if you want to, which lets you alias the combination of the above buildpacks in a simple-to-use wrapper: https://github.com/samj1912/project-descriptor-buildpack (this is mostly just a shell with an order file: https://github.com/samj1912/project-descriptor-buildpack/blob/0978e93ab8e417a63baae9f9092c646b71685518/buildpack.toml#L14). Note that all buildpacks here are optional, so you could have 2**3 - 1 = 7 valid combinations, which are handled automatically.
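The 2**3 - 1 figure can be reproduced with a small model of how an order group with optional entries resolves. This is a sketch under my reading of the detection rules (a failed optional buildpack is skipped, a failed required one fails the group, and a group where nothing passes also fails), not the lifecycle's actual code. The IDs in `GROUP` and the `resolve_group` function are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional

# Hypothetical IDs standing in for the three descriptor buildpacks above.
GROUP = [
    {"id": "proc-descriptor", "optional": True},
    {"id": "runtime-env-descriptor", "optional": True},
    {"id": "label-descriptor", "optional": True},
]


def resolve_group(group, detected: Dict[str, bool]) -> Optional[List[str]]:
    """Return the IDs that would run, or None if the group fails detection."""
    selected = []
    for entry in group:
        if detected.get(entry["id"], False):
            selected.append(entry["id"])
        elif not entry["optional"]:
            return None  # a required buildpack failed detection
    return selected or None  # nothing passed, so the group fails too


# Try every pass/fail outcome for the three buildpacks.
ids = [entry["id"] for entry in GROUP]
valid = []
for outcome in product([False, True], repeat=len(ids)):
    resolved = resolve_group(GROUP, dict(zip(ids, outcome)))
    if resolved is not None:
        valid.append(resolved)

print(len(valid))  # 7
```

The same shape would apply to four optional repo2docker-style buildpacks, where one order file would cover all 15 non-empty combinations.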

Link to CNB Python bindings library that was used to create these - https://github.com/samj1912/python-libcnb

Link to the official golang CNB bindings - https://github.com/buildpacks/libcnb

Our katacoda tutorials to help you get off the ground quickly (fully set up with CNB tools and an interactive walkthrough on creating a simple buildpack in bash):

https://katacoda.com/buildpacks


EDIT - Here is a simple postBuild buildpack - https://github.com/samj1912/postbuild-buildpack/blob/main/main.py

meeseeksmachine commented 3 years ago

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/building-a-the-littlest-binderhub/9824/8