ITISFoundation / osparc-simcore

🐼 osparc-simcore simulation framework
https://osparc.io
MIT License
46 stars, 27 forks

Enhancement: Display statistics of computational jobs together with their parent nodes #5816

Closed: sanderegg closed this issue 3 weeks ago

sanderegg commented 5 months ago

Context

Since the public API became available, it is possible to run any number of computational jobs from a running dynamic service. However, a user's usage statistics currently list computational jobs separately from their "parent" dynamic service.

Goal

Display the statistics of computational jobs linked to their parent service.

Needed changes

  1. Important note: we should keep the API backward compatible and, for sim4life.io, keep the option to pass the parent node ID via metadata, at least for a while.
  2. Modify the oSPARC API to create/run computational jobs from a running dynamic service by passing the "parent" node ID (ideally determined automatically; if not, the API shall be modified). Use cases: sim4life, meta-modeling, JupyterLab, ...
  3. The parent node ID is passed all the way to the computational backend (already exists, needs to be modified based on 1.).
  4. Using the parent node ID, the logs are sent back to the parent project/node ID if it exists (already exists, needs to be modified based on 1.).
  5. The resource usage tracker shall keep track of the parent node ID if it exists.
  6. The frontend shall display the usage of services together with their child jobs.
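
To illustrate the last two items, here is a minimal sketch of how usage records could be grouped under their parent node once the parent node ID is tracked. The `UsageRecord` shape and the function names are hypothetical and are not the resource usage tracker's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical record shape, for illustration only.
    node_id: str
    credits: float
    parent_node_id: Optional[str] = None  # set for jobs started from a dynamic service


def group_usage(records: List[UsageRecord]) -> Tuple[Dict[str, dict], List[UsageRecord]]:
    """Attach child jobs to their parent entry; return (top_level, orphans)."""
    # Every record without a parent becomes a top-level entry.
    top_level = {
        r.node_id: {"record": r, "children": []}
        for r in records
        if r.parent_node_id is None
    }
    orphans = []
    for r in records:
        if r.parent_node_id is None:
            continue
        entry = top_level.get(r.parent_node_id)
        if entry is None:
            # The parent is not part of this listing; keep the job visible anyway.
            orphans.append(r)
        else:
            entry["children"].append(r)
    return top_level, orphans


def total_credits(entry: dict) -> float:
    """Credits of a top-level entry including all of its child jobs."""
    return entry["record"].credits + sum(c.credits for c in entry["children"])
```

Jobs whose parent is missing from the listing are returned separately instead of being dropped, so the totals still add up.
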
### Tasks
- [ ] https://github.com/ITISFoundation/osparc-simcore/issues/5950
- [ ] https://github.com/ITISFoundation/osparc-issues/issues/1618
- [ ] https://github.com/ITISFoundation/osparc-simcore/pull/5874
- [ ] https://github.com/ITISFoundation/osparc-simcore/issues/5878
- [ ] https://github.com/ITISFoundation/osparc-simcore/pull/5877
- [ ] https://github.com/ITISFoundation/osparc-simcore/issues/5879
- [ ] https://github.com/ITISFoundation/osparc-simcore/issues/5881
- [ ] https://github.com/ITISFoundation/osparc-simcore/issues/5925
- [ ] https://github.com/ITISFoundation/osparc-simcore/pull/5966
- [ ] https://github.com/ITISFoundation/osparc-issues/issues/1517
sanderegg commented 5 months ago

After discussion with @bisgaard-itis :

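proposal to modify the osparc python client:

* modify the API call to create a computational job to get an optional header containing at least the parent node ID

* based on ENV `OSPARC_NODE_ID` and possibly `OSPARC_STUDY_ID` variables set in the dynamic service,

* the client can automatically fill in the headers
  -> Users that are using the python client in their code will get that feature for free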
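
A minimal sketch of the header idea: the client reads `OSPARC_NODE_ID` (and, if present, `OSPARC_STUDY_ID`) from the environment of the dynamic service and turns them into optional request headers. The header names and the helper `parent_headers` are hypothetical placeholders, not the actual api-server contract.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical header names, chosen for illustration only.
PARENT_NODE_HEADER = "x-example-parent-node-id"
PARENT_PROJECT_HEADER = "x-example-parent-project-id"


def parent_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the optional parent headers from the environment.

    Returns an empty dict outside a dynamic service, so the request is
    unchanged for existing callers.
    """
    env = os.environ if env is None else env
    headers: Dict[str, str] = {}
    node_id = env.get("OSPARC_NODE_ID")
    if node_id:
        headers[PARENT_NODE_HEADER] = node_id
        # The study ID is only meaningful together with a parent node.
        study_id = env.get("OSPARC_STUDY_ID")
        if study_id:
            headers[PARENT_PROJECT_HEADER] = study_id
    return headers
```

Because the headers are optional, a client running outside a dynamic service sends exactly the request it sends today.
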

mguidon commented 5 months ago

So this is to avoid having it in the not-validated metadata?

sanderegg commented 5 months ago

> So this is to avoid having it in the not-validated metadata?

As discussed, no. This generalizes the usage and ensures we always get that info, so that the billing center looks nice.

As discussed as well, both ways (the sim4life.io way and the new one) should work, at least for a while.

bisgaard-itis commented 5 months ago

> After discussion with @bisgaard-itis :
>
> proposal to modify the osparc python client:
>
> * modify the API call to create a computational job to get an optional header containing at least the parent node ID
> * based on ENV `OSPARC_NODE_ID` and possibly `OSPARC_STUDY_ID` variables set in the dynamic service,
> * the client can automatically fill in the headers
>   -> Users that are using the python client in their code will get that feature for free

After thinking a bit more about this, I have the following modified proposal. Since this approach relies on the client "picking up" the node_id and sending it to the api-server, I suggest simply overriding the create_solver_job method in the osparc python client so that it first calls the api-server endpoint that creates the job and then calls the PATCH endpoint with the metadata picked up from the environment variables. That way we do not have to modify anything on the server, any existing functionality continues to work, and we simply "package" the endpoints into user-friendly functions on the client side.
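
A rough sketch of this create-then-patch wrapper, written against an in-memory stand-in rather than the real client. `InMemoryJobsApi`, `create_job_with_parent`, and the metadata keys `node_id` / `study_id` are hypothetical names used only for illustration.

```python
import os
from typing import Any, Dict, Mapping, Optional


class InMemoryJobsApi:
    """Stand-in for the two api-server endpoints; not the real client."""

    def __init__(self) -> None:
        self.metadata: Dict[str, Dict[str, Any]] = {}
        self._count = 0

    def create_job(self, solver_key: str, version: str, inputs: Dict[str, Any]) -> str:
        self._count += 1
        job_id = f"job-{self._count}"
        self.metadata[job_id] = {}
        return job_id

    def patch_job_metadata(self, job_id: str, metadata: Dict[str, Any]) -> None:
        # Assumption: the patch merges keys into the existing metadata.
        self.metadata[job_id].update(metadata)


def create_job_with_parent(
    api: InMemoryJobsApi,
    solver_key: str,
    version: str,
    inputs: Dict[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Create the job first, then patch in the parent info found in the environment."""
    env = os.environ if env is None else env
    job_id = api.create_job(solver_key, version, inputs)
    parent = {
        key: env[var]
        for key, var in (("node_id", "OSPARC_NODE_ID"), ("study_id", "OSPARC_STUDY_ID"))
        if env.get(var)
    }
    if parent:  # outside a dynamic service nothing is patched
        api.patch_job_metadata(job_id, parent)
    return job_id
```

One consequence of this design is that creation and patching are two separate calls: if the second call fails, the job exists without its parent information.
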

sanderegg commented 5 months ago

@bisgaard-itis OK, but will this also work if the user (as in sim4life.io) also calls the PATCH endpoint? Won't that overwrite whatever was in there? Also, I would prefer the parent node ID to be a defined field rather than just some JSON field.

bisgaard-itis commented 5 months ago

This basically delegates all responsibility for setting the parent node_id to the client. Essentially, the idea is to do in the Python osparc client exactly what Manuel is already doing in the C++ client he uses from sim4life.io, and to wrap it in a user-friendly function that picks up the node_id from the env. I am not sure I understand exactly what you mean by a "defined field". In the end, I guess it will be added to the metadata in the db in the same way Manuel currently does it, no?

sanderegg commented 5 months ago

@bisgaard-itis The project metadata that Manuel is using is owned by the user; we currently hack into it in order to get the parent node ID. If your solution does not imply that the user may inadvertently remove the parent node ID by explicitly calling the endpoint, then I am OK with it.
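
The concern can be sketched as follows, under the assumption (not confirmed in this thread) that a user call to the metadata endpoint replaces the user-owned metadata wholesale. The `Job` model and `replace_user_metadata` are hypothetical and only contrast the two storage options.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Job:
    # Free-form metadata owned by the user.
    user_metadata: Dict[str, Any] = field(default_factory=dict)
    # Defined field, not reachable through the metadata endpoint.
    parent_node_id: Optional[str] = None


def replace_user_metadata(job: Job, new_metadata: Dict[str, Any]) -> None:
    """Mimics an endpoint that replaces the user-owned metadata wholesale."""
    job.user_metadata = dict(new_metadata)


# The parent ID is stored both ways to compare what survives a user update.
job = Job(user_metadata={"node_id": "node-1"}, parent_node_id="node-1")
replace_user_metadata(job, {"comment": "my own notes"})

parent_in_metadata = job.user_metadata.get("node_id")  # lost by the user's update
parent_in_field = job.parent_node_id  # untouched by the user's update
```

With a defined field, the user keeps full control over the metadata while the parent link stays out of reach of ordinary metadata updates.
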