nerc-project / operations

Issues related to the operation of the NERC OpenShift environment

Redirect links to stopped notebooks #648

Open larsks opened 1 month ago

larsks commented 1 month ago

Hi @larsks! We (@Isaiah Stapleton @Jonathan Appavoo @meera) have a question that we hope you can help us with. We are trying to solve the following problem: when a student opens a link to a stopped JupyterLab instance, such as https://test-image-1-ope-rhods-testing-1fef2f.apps.shift.nerc.mghpcc.org/notebook/ope-rhods-testing-1fef2f/test-image-1/lab, they get an "Application is not available" page. Is there a way to redirect them instead to a page where they can restart the instance? In our case, that would be this page: https://rhods-dashboard-redhat-ods-applications.apps.shift.nerc.mghpcc.org/projects/ope-rhods-testing-1fef2f

larsks commented 1 month ago

@DanNiESh

There's not going to be an easy way to do this.

The notebook address (test-image-1-ope-rhods-testing-1fef2f.apps.shift.nerc.mghpcc.org) is managed by an OpenShift Route, and Routes have very simple behavior:

I don't think there's a simple way to modify this behavior. The best option would be to modify the code responsible for the JupyterLab/OpenShift integration such that it would create a simple "redirector" application when a notebook is stopped.
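As a rough illustration of what such a "redirector" application could look like, here is a minimal sketch using only the Python standard library. It answers every GET or HEAD request with a 302 pointing at a target URL. The names `make_redirect_handler` and `serve` are hypothetical, and the port is an assumption: it would have to match whatever target port the notebook's Service uses. TLS and any authentication proxy in front of the notebook are not handled here.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_redirect_handler(target_url):
    """Build a handler class that redirects every request to target_url."""

    class RedirectHandler(BaseHTTPRequestHandler):
        def _redirect(self):
            # 302 (temporary) so browsers do not cache the redirect once the
            # real notebook is running again.
            self.send_response(302)
            self.send_header("Location", target_url)
            self.send_header("Content-Length", "0")
            self.end_headers()

        do_GET = _redirect
        do_HEAD = _redirect

        def log_message(self, format, *args):
            # Silence per-request logging for this sketch.
            pass

    return RedirectHandler


def serve(target_url, port=8888):
    """Serve redirects until interrupted. The port is an assumption."""
    server = ThreadingHTTPServer(("", port), make_redirect_handler(target_url))
    server.serve_forever()
```

For the example in this issue, the redirector would be started with something like `serve("https://rhods-dashboard-redhat-ods-applications.apps.shift.nerc.mghpcc.org/projects/ope-rhods-testing-1fef2f")`.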

If we wanted to achieve the desired behavior ourselves, we might be able to write a controller that would:

  1. Get a list of notebooks in the namespace
  2. For each notebook, see if any pods are running
  3. If there are no pods running, create a redirector pod that matches the associated service selector
  4. Repeat forever:
    1. Watch for pod create/delete events
    2. When a pod is created, figure out what notebook it belongs to and delete the redirector pod
    3. When a pod is deleted, see if there are any pods left for that notebook, and if not, create the redirector pod
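The decision logic in the steps above can be sketched without touching the cluster at all. The following is a minimal, hypothetical model, not a working controller: `Pod`, `plan_actions`, and `apply_event` are made-up names, and the mapping from a pod to its notebook is assumed to be already known (in a real controller it would come from labels or owner references). The event type strings `ADDED` and `DELETED` follow the names used by Kubernetes watch events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    notebook: str               # notebook this pod belongs to (assumed known)
    is_redirector: bool = False


def plan_actions(notebooks, pods):
    """Return (to_create, to_delete).

    to_create: notebooks with no running pod and no redirector yet.
    to_delete: names of redirector pods whose notebook is running again.
    """
    running = {p.notebook for p in pods if not p.is_redirector}
    redirectors = {}
    for p in pods:
        if p.is_redirector:
            redirectors.setdefault(p.notebook, []).append(p.name)

    to_create = sorted(
        nb for nb in notebooks if nb not in running and nb not in redirectors
    )
    to_delete = sorted(
        name
        for nb, names in redirectors.items()
        if nb in running
        for name in names
    )
    return to_create, to_delete


def apply_event(pods, event_type, pod):
    """Apply one watch event to the locally tracked pod list."""
    if event_type == "ADDED":
        return [p for p in pods if p.name != pod.name] + [pod]
    if event_type == "DELETED":
        return [p for p in pods if p.name != pod.name]
    return pods
```

A real controller would call `plan_actions` once at startup (steps 1 to 3), then after every event applied with `apply_event` (step 4), and turn the result into actual pod create and delete requests against the API server.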

DanNiESh commented 1 month ago

I also reached out in the #forum-openshift-ai Slack channel to see if we can file a feature request with the RHOAI team.

DanNiESh commented 1 month ago

Update: The RHOAI team has created an issue for us: https://issues.redhat.com/browse/RHOAIRFE-296. They also suggested that we try this: https://docs.openshift.com/container-platform/4.16/networking/ingress-operator.html#nw-customize-ingress-error-pages_configuring-ingress

larsks commented 1 month ago

I don't think that customizing the ingress operator's error pages will be a great option, because it is a cluster-wide configuration. I guess we could have the custom error page include links to some documentation.

joachimweyl commented 2 days ago

@DanNiESh can you give an update on the status of this issue?