SilverLineFramework / orchestrator

Runtime Orchestrator
https://docs.google.com/presentation/d/1HJaQPFMV_sUyMLoiXciZn9KVTCNXCgQ5LeNxbp_Vf2U/edit?usp=sharing
BSD 3-Clause "New" or "Revised" License

Adjustments to runtime and module API responses for web dashboard #90

Open hi-liang opened 2 years ago

hi-liang commented 2 years ago

Hoping to adjust the response data for the following endpoints so they can be rendered in the web dashboard:

api/runtimes/

This lists all runtimes, so ideally this should include all at-a-glance info columns in a table.

The following columns are shown in the CLI that are not included in this endpoint.

Modules or children field: This currently lists the uuid, name, filename for each module. The CLI does not show any module information, but a short list could possibly be rendered in the runtime list table. For instance, this could be shown as:

modulename1 (), modulename2 (), ... (5 more)
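A minimal sketch of how the dashboard could build that truncated list on the client side. The function name `summarize_modules` and the `limit` parameter are hypothetical, and the sketch only uses module names, since the text inside the parentheses is not specified above.

```python
def summarize_modules(names, limit=2):
    """Render a short, comma-separated preview of module names.

    `names` is a list of module name strings; `limit` is the (assumed)
    number of names to show before collapsing the rest into a counter.
    """
    shown = ", ".join(names[:limit])
    hidden = len(names) - limit
    if hidden > 0:
        # Collapse everything past `limit` into an "(N more)" suffix.
        return f"{shown}, ... ({hidden} more)"
    return shown


print(summarize_modules(["mod-a", "mod-b", "mod-c", "mod-d"]))
```

Keeping the truncation in the dashboard would leave the API response unchanged, at the cost of always transferring the full children list.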

Filtering: The API could support filtering runtimes by module UUID with a query parameter, e.g. api/runtimes/?module=<moduleuuid>

api/runtimes/{uuid}

This endpoint has more detail than the listing of all runtimes, but does not include the children, i.e. the list of modules. That list is needed to render all modules on this runtime. Optionally, this could be achieved with a separate, filtered api/modules request (see below).

api/modules/

Any additional columns that might be useful for each module, such as resource usage or alive status?

Filtering: Similar to the runtimes endpoint above, if the api/modules/ endpoint supported filtering queries, it would be easy to list all modules on a given runtime or all modules sharing the same code, e.g. api/modules/?runtime=<runtimeuuid>, api/modules/?name=<namestring>
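To make the proposed query parameters concrete, here is a rough, framework-free sketch of the filtering logic. It is not the orchestrator's implementation: the dictionary keys (`uuid`, `name`, `runtime`), the `ALLOWED_FILTERS` set, and the function name are all assumptions for illustration.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical whitelist of fields that may be used as filters.
ALLOWED_FILTERS = {"runtime", "name"}


def filter_modules(modules, url):
    """Return the modules that match every allowed query parameter in `url`."""
    query = parse_qs(urlparse(url).query)
    result = modules
    for field, values in query.items():
        if field not in ALLOWED_FILTERS:
            continue  # Ignore unknown parameters instead of failing.
        result = [m for m in result if m.get(field) in values]
    return result


modules = [
    {"uuid": "m1", "name": "foo", "runtime": "r1"},
    {"uuid": "m2", "name": "foo", "runtime": "r2"},
    {"uuid": "m3", "name": "bar", "runtime": "r1"},
]
print(filter_modules(modules, "api/modules/?runtime=r1"))
```

The same approach would cover the runtimes listing by swapping the whitelist, for example allowing a `module` parameter there.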

api/modules/{uuid}

Since there is only a single parent for a module instance, is it useful or relevant in a detailed module view to see all other instances of the same program running elsewhere?

hi-liang commented 2 years ago

Currently, requests for non-existent runtimes or modules return HTTP 200 with an empty JSON object.

It would be helpful to receive an HTTP 404 instead.
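A small sketch of the requested behavior, independent of any web framework. The in-memory `store`, the function name `get_entry`, and the error body are assumptions; the only point is that a missing UUID maps to 404 rather than 200 with `{}`.

```python
import json
from http import HTTPStatus


def get_entry(store, uuid):
    """Look up a runtime or module by UUID and return (status, JSON body)."""
    entry = store.get(uuid)
    if entry is None:
        # Missing entries are reported explicitly instead of as an empty object.
        return HTTPStatus.NOT_FOUND, json.dumps({"detail": "Not found."})
    return HTTPStatus.OK, json.dumps(entry)


store = {"rt-1": {"uuid": "rt-1", "name": "runtime-one"}}
print(get_entry(store, "rt-1"))
print(get_entry(store, "does-not-exist"))
```

With a distinct status code, the dashboard can tell "no such entry" apart from "entry with no fields" without inspecting the body.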

nampereira commented 2 years ago

The 404 is done with #94. About the other notes, what is essential for next week?

hi-liang commented 2 years ago

> The 404 is done with #94. About the other notes, what is essential for next week?

I was going to say also

but I see that the serializer currently filters only ALIVE runtimes and modules:
https://github.com/SilverLineFramework/orchestrator/blob/16611a858138b4a3b49e93db2f4c95d557fd7e8e/orchestrator/views.py#L23-L26

I can just render a constant "ALIVE" value for each entry for now (for mock data, I'm actually just randomizing status for variety 😅).

A next-next-week question might be if those listings should not be restricted to only ALIVE.

nampereira commented 2 years ago

Children added with #95.

About the status of a runtime: there is a reason we only consider active ones. Runtimes were considered somewhat transient. I think some discussion is needed to think through the implications.

hi-liang commented 2 years ago

I get that, and modules are perhaps even more transient. Perhaps change it to include the other transition states (Starting, Exiting, Killed?), just not "Dead"?

I'm thinking from the perspective of someone who seeks more info on a previous runtime or module state... if you are expecting a runtime and/or module, and it's not listed, did it ever exist or did it just go down? Perhaps what we need is a separate interface to query what is strictly past-event data... metrics, event logs (stdout, if we start logging that too).
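As an illustration of the first suggestion (list every state except "Dead"), a minimal sketch follows. The state strings and the `status` key are assumptions; the orchestrator's actual status values and field names may differ.

```python
# Hypothetical set of states that the listing endpoints would expose.
VISIBLE_STATES = {"ALIVE", "STARTING", "EXITING", "KILLED"}


def visible_entries(entries, states=VISIBLE_STATES):
    """Keep only runtimes/modules whose status is in `states`."""
    return [e for e in entries if e["status"] in states]


entries = [
    {"uuid": "m1", "status": "ALIVE"},
    {"uuid": "m2", "status": "EXITING"},
    {"uuid": "m3", "status": "DEAD"},
]
print(visible_entries(entries))
```

Passing `states={"ALIVE"}` reproduces the ALIVE-only behavior described earlier in the thread, so the default could be changed without touching callers that want the stricter view.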

nampereira commented 2 years ago

FYI, with #96, the api/runtimes and api/modules endpoints now return a list directly. That is, instead of { runtimes: [] }, api/runtimes simply returns the list. Same for modules.

nampereira commented 2 years ago

Okay, after discussion, the api/runtimes and api/modules endpoints now return: { results: [ {module} or {runtime} ], count: N, start: 0 }

Implemented in #97.
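For readers consuming this envelope, here is a sketch of how such a response could be assembled. It assumes that `count` is the total number of matching entries and `start` is the offset of the first returned entry; the thread does not spell out either meaning, and the `limit` parameter is hypothetical.

```python
def paginate(items, start=0, limit=None):
    """Wrap a slice of `items` in a { results, count, start } envelope."""
    end = None if limit is None else start + limit
    return {
        "results": items[start:end],
        "count": len(items),  # Assumed: total entries, not the page size.
        "start": start,
    }


runtimes = [{"uuid": f"rt-{i}"} for i in range(5)]
print(paginate(runtimes, start=2, limit=2))
```

Under these assumptions, a client can request the next page by adding the length of `results` to `start` until it reaches `count`.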

thetianshuhuang commented 2 years ago

> I get that, and modules are perhaps even more transient. Perhaps change it to include the other transition states (Starting, Exiting, Killed?), just not "Dead"?
>
> I'm thinking from the perspective of someone who seeks more info on a previous runtime or module state... if you are expecting a runtime and/or module, and it's not listed, did it ever exist or did it just go down? Perhaps what we need is a separate interface to query what is strictly past-event data... metrics, event logs (stdout, if we start logging that too).

The main reason I added the transition states is so that other services that keep their own records (mainly profiling, but also the command line client for scripts) can look up the status for modules/runtimes that they are tracking even if they are dead. It's intended to have the exact same behavior as before (where modules/runtimes were deleted from the database when they exited).

We definitely need to think about what kind of interface we want for listing historical modules. Returning all historical modules would not work, at least by default, since you could have hundreds of modules with the same name (but different UUIDs), all but one of which are dead.

hi-liang commented 2 years ago

When you terminate an instance on AWS EC2, the state transitions from "running" to "shutting-down", and the instance then remains visible as "terminated" for something like 5-10 minutes. Kubernetes can output a list of pods deleted in the last hour. Would that be useful?

Otherwise every historical event or metric can just be outsourced entirely to the time series db.

thetianshuhuang commented 2 years ago

That's a good idea -- maybe something like the last 10 terminated modules/runtimes or those terminated in the last 30 minutes, whichever is less? After that, you have to go to a different API.
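A sketch of that retention rule, applying both caps so that the smaller set wins. The `exited_at` field, the function name, and the parameter names are assumptions for illustration; nothing here reflects how the orchestrator stores exit times.

```python
from datetime import datetime, timedelta


def recently_terminated(entries, now, max_items=10,
                        max_age=timedelta(minutes=30)):
    """Return terminated entries, newest first, limited by age and count."""
    cutoff = now - max_age
    recent = [e for e in entries if e["exited_at"] >= cutoff]
    recent.sort(key=lambda e: e["exited_at"], reverse=True)
    # Applying the count cap after the age cap yields "whichever is less".
    return recent[:max_items]


now = datetime(2023, 1, 1, 12, 0)
entries = [
    {"uuid": "m1", "exited_at": now - timedelta(minutes=5)},
    {"uuid": "m2", "exited_at": now - timedelta(minutes=45)},
]
print(recently_terminated(entries, now))
```

Anything older than the window or beyond the count would then have to come from the separate historical interface discussed above.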