Jooho / pachyderm-operator-manifests

Q: How would one go about automating this example/test? #5

Open jiridanek opened 11 months ago

jiridanek commented 11 months ago

@jstourac Having gone through this pachyderm demo for edge detection by hand, I am now wondering how one would automate this.

I imagine the first step might be to add an automatic pause to the notebook that waits for the Pachyderm pipeline pod to spawn.

The notebook pod does not have permission to talk to the Kubernetes API:

! oc get pods -n pachyderm
Error from server (Forbidden): pods is forbidden: User "system:serviceaccount:rhods-notebooks:jupyter-nb-htpasswd-2dcluster-2dadmin-2duser" cannot list resource "pods" in API group "" in the namespace "pachyderm"

So I guess one should use pachctl to query for it:

# Check if a new repo `edges` that has the changed image after pipeline created
!pachctl list repo

(or something like that, in a loop).
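A rough sketch of such a loop, in Python so it can live in a notebook cell. It assumes `pachctl list repo` prints a table whose first row is a header and whose first column is the repo name; the helper names `parse_repo_names`, `list_repos` and `wait_for_repo` are made up for illustration. A repo appearing only shows the pipeline was created, not that its job has finished.

```python
import subprocess
import time


def parse_repo_names(table):
    """Extract repo names from table-style output (assumed: header row, name in column 1)."""
    lines = [line for line in table.splitlines() if line.strip()]
    # Skip the header row; the first whitespace-separated field is the repo name.
    return [line.split()[0] for line in lines[1:]]


def list_repos():
    """Run `pachctl list repo` and return the repo names it reports."""
    result = subprocess.run(
        ["pachctl", "list", "repo"], check=True, capture_output=True, text=True
    )
    return parse_repo_names(result.stdout)


def wait_for_repo(name, lister=list_repos, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll until `name` shows up in the repo list, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        if name in lister():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"repo {name!r} did not appear within {timeout}s")
        sleep(interval)
```

In the notebook this would be called as `wait_for_repo("edges")` right after the cell that creates the pipeline.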

Also, looking at the docs, I haven't found a way to, for example, create an Open Data Hub project and start a Jupyter Notebook server using the oc command.

I did find the API to subsequently interact with Jupyter Notebook once it is running, https://github.com/jupyter/jupyter/wiki/Jupyter-Notebook-Server-API#Notebook-and-file-contents-API and https://jupyter-server.readthedocs.io/en/latest/developers/rest-api.html. Relevant SO question at https://stackoverflow.com/questions/54475896/interact-with-jupyter-notebooks-via-api.

Sadly, it does not appear I can do something like "evaluate every cell in this notebook one by one". The SO answer talks to the kernel directly...
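Whatever ends up talking to the kernel would first need the cell sources. A minimal sketch of that part, assuming a notebook in nbformat v4 layout (a top-level `cells` list, each cell with `cell_type` and `source`) and using only the standard library. The function names are hypothetical. The second helper pulls out the `!` shell lines, since those are the `oc` and `pachctl` calls in this demo.

```python
import json


def code_cells(notebook_path):
    """Yield the source of each code cell in an .ipynb file, in document order."""
    with open(notebook_path, encoding="utf-8") as f:
        nb = json.load(f)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source either as one string or as a list of lines.
        yield source if isinstance(source, str) else "".join(source)


def shell_commands(notebook_path):
    """Return the `!command` lines found in code cells, without the leading `!`."""
    commands = []
    for source in code_cells(notebook_path):
        for line in source.splitlines():
            stripped = line.strip()
            if stripped.startswith("!"):
                commands.append(stripped[1:].strip())
    return commands
```

Each string from `code_cells(...)` could then be submitted to a kernel one at a time, and the list from `shell_commands(...)` could be run outside Jupyter entirely.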

There are various tools that can evaluate a jupyter notebook file (and render it as html, etc.), but they apparently talk to a local jupyter notebook. Maybe there is an option to have them talk to a remotely running one?

The nbclient from above looks promising.

jiridanek commented 11 months ago

Actually, it should be enough to install Pachyderm and talk to it with pachctl from inside the notebook image. I don't need to do this through a Jupyter notebook; in a test I can just spawn a pod from the image and exec commands in it.

This would greatly simplify what needs to be automated.
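A sketch of what that test harness might look like, driving `oc` from Python. Everything specific here is an assumption: the pod name, image and namespace are placeholders, the image is assumed to ship a `sleep` that accepts `infinity` and to have `pachctl` on the path, and `pod_commands` / `run_in_pod` are made-up helper names. Pointing `pachctl` at the right pachd is not covered.

```python
import subprocess


def pod_commands(pod, image, namespace, commands):
    """Build the oc argv lists: start pod, wait for Ready, exec each command, delete pod."""
    ns = ["-n", namespace]
    steps = [
        # Keep the container alive so commands can be exec'd into it (assumes `sleep infinity` works).
        ["oc", "run", pod, *ns, f"--image={image}", "--restart=Never",
         "--command", "--", "sleep", "infinity"],
        ["oc", "wait", *ns, "--for=condition=Ready", f"pod/{pod}", "--timeout=300s"],
    ]
    steps += [["oc", "exec", *ns, pod, "--", *cmd] for cmd in commands]
    steps.append(["oc", "delete", "pod", *ns, pod])
    return steps


def run_in_pod(pod, image, namespace, commands, runner=subprocess.run):
    """Run the steps in order; the final delete runs even if an earlier step fails."""
    *work, cleanup = pod_commands(pod, image, namespace, commands)
    try:
        for argv in work:
            runner(argv, check=True)
    finally:
        runner(cleanup, check=False)
```

For example, `run_in_pod("pach-test", "<notebook image>", "<namespace>", [["pachctl", "list", "repo"]])` would fail the test if any `oc` call returns a non-zero exit code. The `runner` parameter is there only so the sequence can be checked without a cluster.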