codecheckers / discussion

General discussions and questions

CODECHECK infrastructure #2

Open nuest opened 4 years ago

nuest commented 4 years ago

The assistant is a first step in streamlining code checks. A further step would be online infrastructure that codecheckers can use. Let's note ideas here on what this infrastructure could do, what its benefits are, what limitations exist, etc.

nuest commented 4 years ago

Ideas for Integration with a BinderHub

A BinderHub could allow codecheckers to do everything online:


Running the assistant and automated processes:

Maybe we can call a docker exec via a new API endpoint in BinderHub? The exec would run a binary (to be implemented) that reads a codecheck.yml to find the files to check, does the check, and then saves the result (with a "signature") back to the codecheck.yml, adding to any existing checks. We would manipulate the user container/pod from the outside, and since we would not tell the user about the pod, they cannot manipulate it either. But it would be exactly the same environment (i.e. a JupyterHub session/container) that they get when they start the binder for interactive use. We would actually run the analysis in a pod rather than hack it into the r2d (repo2docker) build process. Since BinderHub started the JupyterHub session, we should know the pod and container, so we can (hopefully) also save the container to a file.
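To make the idea of the check binary more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `manifest`/`file` and `checks` keys, the function names `run_check` and `verify_check`, and the use of an HMAC as the "signature". It also does not run the analysis itself; it only records SHA-256 hashes of the listed files. It works on the already parsed content of codecheck.yml (a dict), so reading and writing the YAML file is left out.

```python
import hashlib
import hmac
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_check(config: dict, root: Path, secret: bytes) -> dict:
    """Hash each manifest file and append a signed result to config['checks'].

    The 'manifest', 'file' and 'checks' keys are hypothetical names.
    """
    files = {}
    for entry in config.get("manifest", []):
        path = root / entry["file"]
        # A missing file is recorded as None instead of aborting the check.
        files[entry["file"]] = sha256_of(path) if path.is_file() else None

    result = {
        "files": files,
        "complete": all(digest is not None for digest in files.values()),
    }
    # Sign a canonical JSON form of the result; the secret stays outside the pod.
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()

    # Append to earlier checks rather than overwriting them.
    config.setdefault("checks", []).append(
        {"result": result, "signature": signature}
    )
    return config


def verify_check(check: dict, secret: bytes) -> bool:
    """Recompute the HMAC of a stored result and compare it to the signature."""
    payload = json.dumps(check["result"], sort_keys=True).encode("utf-8")
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, check["signature"])
```

A shared-secret HMAC only shows that the result was written by something holding the secret; a public-key signature would probably be a better fit if third parties should be able to verify checks.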


Questions

How does a user "explore" the result of a check?

Do we need JupyterHub? https://github.com/pangeo-data/pangeo-stacks/pull/10 adds a verify script which is now called as part of build.py, but that is more about verifying that the image works OK than about actually running the analysis... BUT docker run -i -t ${IMAGE_NAME} "binder/verify" seems like something that we would also want to run.
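If we want to run such a verify script without a user in the loop, something like the following sketch might do. The function names are made up, and it assumes the image has a binder/verify script on its path, as in the pangeo-stacks example. It drops -i -t because no terminal is attached in an automated run, and adds --rm so the container is removed afterwards.

```python
import subprocess


def verify_command(image: str, script: str = "binder/verify") -> list[str]:
    """Build a non-interactive docker run call for the verify script."""
    # No -i/-t: an automated run has no terminal attached.
    return ["docker", "run", "--rm", image, script]


def run_verify(image: str) -> bool:
    """Run the verify script; True if the container exits with status 0."""
    completed = subprocess.run(verify_command(image), check=False)
    return completed.returncode == 0
```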

Can we assume a check happens automatically after a binder is started, or is it triggered manually (potentially allowing manipulation of content)? > It should probably start right after the successful build and launch of the pod/container by JupyterHub (?)

Do we need a special UI? > Instead of getting the notebook, you just get a UI that runs the code configured in codecheck.yml within the container, then triggers the check, and puts the "Docker image + check result + badge" somewhere safe.

Alternative: enhance with a Jupyter extension, see ideas/discussion at https://github.com/jupyterhub/binderhub/issues/579 and https://github.com/jupyterhub/binderhub/issues/674

Does BinderHub keep a list of the redirects it makes for users to JupyterHub (so it could execute stuff in user containers)?
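Whatever BinderHub itself records, the JupyterHub REST API can list users and their servers, which might be enough to find the right container. A rough sketch, assuming an API token with sufficient permissions and assuming the returned user objects carry `name` and `server` fields; the function names are made up:

```python
import json
import urllib.request


def users_request(hub_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the JupyterHub /hub/api/users endpoint."""
    url = hub_url.rstrip("/") + "/hub/api/users"
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def running_servers(users: list[dict]) -> dict[str, str]:
    """Map user name to server URL for users that have a running server."""
    return {u["name"]: u["server"] for u in users if u.get("server")}


def list_running_servers(hub_url: str, token: str) -> dict[str, str]:
    """Query the hub and return the users with a running server."""
    with urllib.request.urlopen(users_request(hub_url, token)) as response:
        return running_servers(json.load(response))
```

This only gives the server URL per user; mapping that to a concrete pod or container would still have to go through the spawner or the Kubernetes API.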

How can we export the image to a file and make it available for download? Would that work via a new BinderHub API endpoint?
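On the host side, the export could be as simple as wrapping docker save, which writes an image to a tar archive. A minimal sketch with made-up function names; how a BinderHub endpoint would trigger this and serve the file is exactly the open question above, and the sketch assumes direct access to a Docker daemon that has the image.

```python
import subprocess
from pathlib import Path


def save_command(image: str, target: Path) -> list[str]:
    """Build the docker save call that writes an image to a tar archive."""
    return ["docker", "save", "--output", str(target), image]


def export_image(image: str, target: Path) -> Path:
    """Export the image to target; raises CalledProcessError on failure."""
    subprocess.run(save_command(image, target), check=True)
    return target
```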

Is it feasible to run both binder and a binder-fork in the same k8s cluster, using the same JupyterHub?


JupyterHub API: https://jupyterhub.readthedocs.io/en/stable/api/

Turing-Way book chapter about BinderHub: https://github.com/alan-turing-institute/the-turing-way/pull/557/files?short_path=bfcf303#diff-bfcf303fc9ba83d09c678a89644c2565


We should try to extend this figure to make CODECHECK's process clear, from https://binderhub.readthedocs.io/en/latest/overview.html

(image: figure from the BinderHub overview page)

nuest commented 4 years ago

We could also have a bot to streamline organisation of the review process. JOSS's whedon or buffy could be a basis: https://github.com/openjournals/buffy