bollwyvl commented 9 years ago

Based on this discussion about IHaskell, there is a need for a reproducible way for a kernel author to make a new kernel available on try.jupyter.org.

Full Proposal

Not sure if this is the best way to go, but let's try doing fork/link for updates?

tl;dr:

- use try.jupyter.org or a new kernels.jupyter.org as a kernel registry/marketing site
  - mockup
- streamline the PR process for adding new kernels to try.jupyter.org and/or kernels.jupyter.org
  - metadata
  - Docker
  - automated tests
  - docker-compose

Thoughts?

cc: @rgbkrk @Carreau @gibiansky @dsblank

dsblank commented 9 years ago

@bollwyvl I think this makes a great deal of sense, and it is good to start thinking about this sooner rather than later. I have no experience with docker, so I'll be a good person to experiment on :) I hope that the learning curve won't be too much of a barrier for kernel producers...

I also think that there need to be some guides for users trying these out... the landscape of kernels could vary widely:

how come I don't have magic X?
what kernels share what items (libraries, magics, etc)?
where to get help for this kernel?
how to get command completions? help? shell?

Definitely need some quality control, too. And a way for users to easily give feedback, make comments, raise issues, etc. It is possible that kernels won't pass some quality control measures, or that they bitrot over time and have to be removed; or added over time because they get better.

bollwyvl commented 9 years ago

@dsblank as a prolific kernel implementer, you are definitely a target for the Kernel Developer user story. Hopefully I address some of your concerns below: let me know if you think more text in one of the areas would help tell the story...

I have no experience with docker, so I'll be a good person to experiment on :) I hope that the learning curve won't be too much of a barrier for kernel producers...

Right. The Dockerfile/docker-compose.yml should be extremely formulaic once we've got a few of them together. If you don't even want to do Docker locally, we can abuse the CI environment to get builds out. But there isn't another viable way right now, in my mind, to solve this problem at scale!

The actual Dockerfile shouldn't be much more complicated than the sample. Basically:

handle apt-get dependencies to get your target language/package manager
use the package manager (or git) to install your kernel and desired ride-along packages to make it awesome
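
A minimal sketch of how formulaic those two steps could be, written as a Python helper that renders the Dockerfile text. The function name `render_dockerfile`, the base image, and the package names are all placeholders made up for illustration; none of them exist in try.jupyter.org today.

```python
from typing import List


def render_dockerfile(base_image: str,
                      apt_packages: List[str],
                      install_commands: List[str]) -> str:
    """Render a two-step kernel Dockerfile as text."""
    lines = [f"FROM {base_image}"]
    if apt_packages:
        # Step 1: apt-get dependencies for the target language/package manager.
        lines.append(
            "RUN apt-get update && apt-get install -y "
            + " ".join(apt_packages)
        )
    # Step 2: use the package manager (or git) to install the kernel and extras.
    for command in install_commands:
        lines.append(f"RUN {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile(
        base_image="example/kernel-base",  # hypothetical base image
        apt_packages=["example-lang"],     # placeholder package name
        install_commands=["example-pm install example-kernel"],  # placeholder
    ))
```

A kernel author would then only supply the package list and the install commands, and the rest stays the same from kernel to kernel.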

I think you will have to wrap your head around the volume concept, as we'll use the hell out of that... unless we use ZMQ between kernel/dashboard... which is very interesting. But all of this learning would be done in the machine-assisted context of a PR, which is the most productive kind of collaboration we have :)

how come I don't have magic X?

what kernels share what items (libraries, magics, etc)?

how to get command completions? help? shell?

Yes, the features a kernel offers can/should be captured with tests. I think one would have to implement start, for example, but everything else is just gravy.... as long as the test is simpler than the implementation :)

where to get help for this kernel?

There is/was already a place to hook into the help menu in the machine-focused kernelspec for this, but I don't know if it is actually implemented. But I think several kinds of doc are appropriate, both at the kernel and language level: issues, mailing lists, etc.

Definitely need some quality control, too.

Automated tests are our only hope, IMHO.

And a way for users to easily give feedback, make comments, raise issues, etc.

Again, provide as metadata, then figure out the right way to make this available during the editing process. Because of the transient environment, we'll need some other way (save as anonymous gist?) for users to store the notebook in question where the issue arose.

It is possible that kernels won't pass some quality control measures, or that they bitrot over time and have to be removed; or added over time because they get better.

This is the rub, for sure. I think if a kernel doesn't build, it gets taken out of the registry, or is relegated to some second-page location for "experimental" kernels.
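
A rough sketch of the second option, assuming the registry only knows each kernel's latest build status. The function `place_kernels`, the tier names "featured" and "experimental", and the sample kernel names are illustrative assumptions.

```python
from typing import Dict, List


def place_kernels(build_results: Dict[str, bool]) -> Dict[str, List[str]]:
    """Split kernels into listing tiers based on whether their build passed."""
    listing: Dict[str, List[str]] = {"featured": [], "experimental": []}
    for name, built in sorted(build_results.items()):
        tier = "featured" if built else "experimental"
        listing[tier].append(name)
    return listing


if __name__ == "__main__":
    # Hypothetical build results, keyed by kernel name.
    print(place_kernels({"kernel-a": True, "kernel-b": False, "kernel-c": True}))
```

Removing a broken kernel outright would simply mean dropping the "experimental" tier from what gets published.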

Docker gives us a lot of this: once a container is built, with all its dependencies, etc. it should work until the end of time, and if nobody changes its supporting files, it won't be rebuilt. The issue arises in our kernel baseline image, if there is such a thing... updating that should trigger a new build of everything.

Triggering a new build for a specific container when the kernel/target language changes is probably still best done with a one-liner PR, for example incrementing an environment variable in the Dockerfile, or better yet, a single change to the kernel.json or try.kernel.json.
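
As a sketch of how a one-line metadata change could drive a rebuild, the build tooling could fingerprint the metadata and compare it with the fingerprint of the last build. The `argv`, `display_name`, and `language` keys are standard kernel.json fields; the `build_revision` key and both function names are assumptions invented for this example.

```python
import hashlib
import json


def metadata_fingerprint(metadata: dict) -> str:
    """Hash the metadata in a canonical form so key order does not matter."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rebuild(metadata: dict, last_built_fingerprint: str) -> bool:
    """Rebuild only when the metadata differs from the last built version."""
    return metadata_fingerprint(metadata) != last_built_fingerprint


if __name__ == "__main__":
    metadata = {
        "argv": ["example-kernel", "-f", "{connection_file}"],
        "display_name": "Example",
        "language": "example",
        "build_revision": 1,  # hypothetical field bumped by a one-liner PR
    }
    built = metadata_fingerprint(metadata)
    metadata["build_revision"] = 2
    print(needs_rebuild(metadata, built))
```

With something like this, a PR that touches nothing but the revision number is enough to mark that one container as stale.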

rgbkrk commented 9 years ago

This is a really wonderful set of specs. We ended up talking about this today and had a few pertinent points:

Kernels Site

The kernel mockup is AMAZING. Really great work. Would you mind making a PR to make that show up at jupyter.org/kernels?

Docker containers

Exposing kernels in Docker containers standalone gets into some hairy bits, namely that

Additional orchestration would be needed to hook multiple containers together for each user. Coupling that with our spawn pools and everything else would make that a much harder problem than our single container setup (not that it's impossible, just that engineering effort has to go towards tmpnb)
files would not be the same across kernels
separate docker images would actually require more space since it would include each distro of linux

I'd prefer to keep it all in one Dockerfile.

keeps it as one flow that is a useful reference item for installing kernels
size of the image is shared amongst all the users, across one base layer

Docker compose

It's about time we make the installation setup for tmpnb a bit easier. Putting a compose file in docker-demo-images would really help, as would putting one in tmpnb.

Carreau commented 9 years ago

The kernel mockup is AMAZING. Really great work. Would you mind making a PR to make that show up at jupyter.org/kernels?

+1

bollwyvl commented 9 years ago

We ended up talking about this today and had a few pertinent points:

Yeah, keep meaning to try to get to the meetings... the :baby: has different ideas though :)

Would you mind making a PR to make that show up at jupyter.org/kernels?

I'll get on it!

I'd prefer to keep it all in one Dockerfile.

Okay, sounds fine for now... though I think the issues will be worth resolving eventually to support an "app store"-like experience for kernels. I don't yet have the chops to do the needed engineering on tmpnb myself!

Sort of related, I have been dinking around with alpine linux, and found this recipe for using miniconda with alpine... without any optimization, installing up to ipython, numpy and pandas yields:

│           ├─d67fc2aeacb4 Virtual Size: 237.6 MB Tags: minikernel_minikernel:latest

That's a pretty nice baseline, and could possibly be smaller... I think the miniconda install script hangs around!

Putting a compose file in docker-demo-images would really help, as would putting one in tmpnb.

Yeah, I love compose: a docker-compose.yml is the best documentation ever... for someone who reads Dockerfiles! In making more use of it, I see two issues:

there is no registry for "compositions" like there is for images... they recently added the ability to extend a service (haven't used it yet), but I imagine a composition-level API will show up eventually...
there is no Travis-CI-grade CI service out there that supports compose for open source. CircleCI is close, but limits the number of containers... which might meet our needs if we are going with the monolith

rgbkrk commented 9 years ago

The difference of 5.0 MB -> 125 MB is 120 MB. Relative to what miniconda, the scipy + PyData stack, R and Julia add on (~2.7 GB), that 120 MB is nothing. I'm not real worried about getting set up on alpine linux, as you'll find out you're missing what you need for a normal development and analytic environment (git, certificates, build tools, etc.). I've set miniconda up with alpine linux before, but it was not the most pleasant environment to do any work in, whether inside the notebook or the terminal.
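
For scale, the same arithmetic as a quick Python check. It assumes 1 GB = 1000 MB and treats the ~2.7 GB figure as exact, so the result is only a rough proportion.

```python
SMALL_BASE_MB = 5.0      # smaller base image, from the figures above
LARGER_BASE_MB = 125.0   # larger base image, from the figures above
STACK_MB = 2.7 * 1000    # ~2.7 GB of stack, assuming 1 GB = 1000 MB


def base_overhead_percent(small_mb: float, large_mb: float, stack_mb: float) -> float:
    """Extra base-image size as a percentage of the stack installed on top."""
    return (large_mb - small_mb) / stack_mb * 100


if __name__ == "__main__":
    percent = base_overhead_percent(SMALL_BASE_MB, LARGER_BASE_MB, STACK_MB)
    print(f"{percent:.1f}%")  # about 4.4%
```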

Yeah, keep meaning to try to get to the meetings... the :baby: has different ideas though :)

Hey, that's ok. Babies are only so little for so long. :wink:

I would like to get to the point of an app store like experience for kernels and extensions too. This is a good start and we can keep on planning.

jupyter / try.jupyter.org

Provide modular approach to accepting, displaying, launching new kernels #7
