cylc / cylc-flow

Cylc: a workflow engine for cycling systems.
https://cylc.github.io
GNU General Public License v3.0

cylc review: port to Cylc 8 #5937

Open oliver-sanders opened 7 months ago

oliver-sanders commented 7 months ago

The cylc review utility provided us with a database-driven, browser-based monitoring tool.

It did not support interactivity or live updates, but it proved useful in a number of cases, especially: providing read-only access to other people's workflows at scale (thanks to its efficient backend), debugging (thanks to linkable line numbers in log files), and reviewing historical workflow data.
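
To illustrate the database-driven, read-only model, here is a minimal sketch that opens a SQLite file in read-only mode and lists task states. The table and column names (`task_states`, `name`, `cycle`, `status`) and the `read_task_states` helper are assumptions made for illustration and may not match the actual workflow database schema.

```python
import sqlite3
from pathlib import Path


def read_task_states(db_path):
    """Return (cycle, name, status) rows from a read-only connection."""
    # mode=ro makes SQLite refuse any write through this connection,
    # so a viewer cannot modify the workflow's database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # Hypothetical schema: one row per task instance.
        return conn.execute(
            "SELECT cycle, name, status FROM task_states ORDER BY cycle, name"
        ).fetchall()
    finally:
        conn.close()
```

Because the connection is opened read-only, a viewer of this kind cannot alter another user's workflow data even if the file permissions would allow it.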

The plan was to replace this with the cylc-ui, and the cylc review source code was removed from cylc-flow master. Unfortunately, we have not yet been able to bring the required features into cylc-ui to satisfy these use cases, leaving us with a gap in functionality.

This issue proposes porting the cylc review utility to Python 3 / Cylc 8 to give cylc-ui the time required to fill in these functionality gaps.

Must:

Should:

Questions:

hjoliver commented 7 months ago

Another question (I'm expecting a "no" answer, but maybe worth asking):

Is it feasible to incorporate cylc review into the UIS, in the sense of presenting a sort of "cylc review view" that is not completely integrated into the new UI as such, but is at least served by the UIS? On the upside, we could drop Question 2 (server framework).

oliver-sanders commented 7 months ago

That is absolutely possible but may leak some of the limitations of the UIS into Review, needs some thought.

One of the big problems that cylc-review solves very well is large-scale anonymous access for users who don't necessarily have accounts on the system they are inspecting. E.g. we may have a large number of users monitoring a production workflow. We wouldn't want to dump that load onto the server that the production folks are using, so we would have to set up another server under another user account, at which point you're halfway to an Apache deployment anyway.

oliver-sanders commented 5 months ago

Investigation:

Find out what Python 3 server frameworks can run under Apache:

Ideally we would find a modern WSGI framework and an Apache support module. If this exists, try it out with a simple example to find out how well it works.
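
As a starting point for that experiment, a minimal sketch of a framework-free WSGI application follows. It assumes the interface meant here is WSGI (PEP 3333) and that Apache would host it through a module such as mod_wsgi, which looks for a callable named `application` by default. The `render_index` and `call_app` helpers and the response text are purely illustrative.

```python
from wsgiref.util import setup_testing_defaults


def render_index(path):
    # Hypothetical helper: a real port would render workflow data here.
    return "cylc review placeholder: " + path


def application(environ, start_response):
    # Standard WSGI entry point (PEP 3333).
    body = render_index(environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path):
    # Drive the app in-process, without a server, for a quick check.
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = application(environ, start_response)
    return captured["status"], b"".join(chunks).decode("utf-8")
```

For local experimentation the same callable can be served with `wsgiref.simple_server.make_server` before trying it under Apache.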

oliver-sanders commented 3 months ago

Is it feasible to incorporate cylc review into the UIS

that is absolutely possible but may leak some of the limitations of the UIS into Review, needs some thought.

The UIS doesn't fit the Cylc Review model (single-user vs multi-user, distributed vs central). However, there are also Jupyter Hub services. These run under the Hub, not the UIS, so they should be both centralised and multi-user, fitting the Cylc Review model nicely.

I'm not sure how these services are accessed, so this may require a little research. Hopefully a hub service could provide a public endpoint that does not require authentication. If so, this would be a very nice solution that we could bundle with the Cylc Hub. The Hub user would require read access to the relevant portions of the filesystem for this to work.

Note, the Jupyter Hub service approach is also of interest to https://github.com/cylc/cylc-admin/issues/72

oliver-sanders commented 1 month ago

Running Cylc Review under Jupyter Hub (JH)

Is it feasible to incorporate cylc review into the UIS in the sense of presenting a sort of "cylc review view" that is not completely integrated into the new UI as such, but is at least served by the UIS.

UIS no, but Hub, yes. After a bit of poking:

Jupyter Hub (JH) Services

Long story short, JH services give you a proxy, but not a server:

So we could potentially run cylc review behind JH; however, it's technically the same thing as running it standalone. There are still two advantages to this approach:

  1. Both cylc-ui and cylc-review can be served from the same host:port (one fewer port to open).
  2. Both can be configured via the same jupyter_config.py file. In theory cylc-review could even be enabled by default in the Cylc UI Server configuration.

POC Service (Python 2 cylc-review)

Jupyter Configuration:

c.JupyterHub.services = [
    {
        'name': 'cylc-review',
        'command': ['/path/to/cylc-review-launcher'],
        'url': 'http://0.0.0.0:8042/',
    }
]

Launcher script (Python 2):

#!/usr/bin/python

import sys

# load the Cylc 7 library code
sys.path.insert(0, '/path/to/cylc-7/lib')
import cylc.review
from cylc.ws import _ws_init

# hack the log path
cylc.review.LOG_ROOT_TMPL = '~/.cylc/cylc-review'

# hack the service namespace to allow it to run under the JUPYTERHUB root URL
cylc.review.CylcReviewService.NS = 'services/cylc'

# start review in standalone mode
_ws_init(cylc.review.CylcReviewService, 8042, service_root_mode=True)

Launch Jupyter Hub as normal, then navigate to <hub-url>/services/cylc-review.

Note, a Python launcher script is only required to hack the Cylc Review code in order to allow it to be served behind the Jupyter Hub proxy. We should be able to achieve this with JH config alone.
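
One way this might work, sketched under assumptions: JupyterHub exports environment variables such as `JUPYTERHUB_SERVICE_PREFIX` and `JUPYTERHUB_SERVICE_URL` to the services it manages, so a ported cylc-review could read its URL prefix and bind address from them instead of being patched by a launcher. The `service_settings` function and its fallback values are hypothetical, not existing Cylc code.

```python
import os
from urllib.parse import urlparse


def service_settings(environ=None):
    """Derive mount prefix, host and port from JupyterHub service env vars."""
    environ = os.environ if environ is None else environ
    # e.g. "/services/cylc-review/"; the "/" fallback is an assumption
    # for running standalone, outside the Hub.
    prefix = environ.get("JUPYTERHUB_SERVICE_PREFIX", "/")
    # Local URL the Hub's proxy forwards to; the default here is illustrative.
    url = urlparse(
        environ.get("JUPYTERHUB_SERVICE_URL", "http://127.0.0.1:8042/")
    )
    return {
        "prefix": "/" + prefix.strip("/"),
        "host": url.hostname,
        "port": url.port,
    }
```

With the POC configuration above, this would yield the prefix `/services/cylc-review` and port `8042`, which is the information the launcher currently injects by hand.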

Conclusions