cnr-ibf-pa / hbp-bsp-issues

Ticketing system for developers/testers and power users of the Brain Simulation Platform of the Human Brain Project

Register and store simulation outputs located on the HPC system in the KG #459

Open alex4200 opened 5 years ago

alex4200 commented 5 years ago

Use Case Update

| Aspect | Detail |
| --- | --- |
| Summary | Based on UC11 (see below), the results of a simulation (SimUI) need to be registered in the KnowledgeGraph |
| Expert | Stefano |
| Deadline | End of SGA2 |
| Target User | Each HBP member |
| Dependencies | KG, HPC systems, other |
| Related | https://gitlab.humanbrainproject.org/joint_infrastructure_coordination/Coordination/issues/123, https://gitlab.humanbrainproject.org/joint_infrastructure_coordination/Coordination/issues/49 |

List of additional/changed features

Register SimData

The idea is to register each simulation result from the SimUI. The steps are as follows:

  1. The user starts a simulation. Each user has HBP credentials, but might not have an account for each HPC system.
  2. The simulation creates some outputs (out.dat, soma.bbp, etc.). For MOOC users these are at most 300 MB; for normal users they can exceed a few GB.
  3. The outputs should go to a database (e.g. a location on CSCS), while the metadata gets registered in the KnowledgeGraph. The metadata might include:
    • name of the simulation
    • date of the simulation
    • user
    • used circuit
    • HPC system used
    • number of nodes, CPUs, ...
    • status of the sim (temp, persistent)
    • location of BlueConfig
    • location of SimulationResult
     Only metadata is stored in the KnowledgeGraph, not the actual data (files, traces, images, ...).
  4. A 'DataMover' is triggered, or checks regularly, to see whether there are files to be copied from an HPC system to a temporary database (e.g. GPFS storage at CSCS). If a file is there, it is copied and the metadata of the KnowledgeGraph object is updated (location of SimulationResult).
  5. The user can run a notebook which queries the KG to find the location of the data. The user can analyze the results and add additional results to that object (data, images, etc.). If the user is satisfied with the results and wants to publish them, he/she notifies the KG.
  6. The 'DataMover' gets involved again, moving the actual data (SimResults, files, data, images; maybe just a folder) to the persistent location. The metadata of the KnowledgeGraph object is updated (location of SimulationResult).
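
The metadata record and the 'DataMover' step above could be sketched roughly as follows. This is only an illustration, not an agreed design: the class `SimulationRecord`, its field names, and the function `move_results` are hypothetical, and plain local directories stand in for the HPC and CSCS storage locations.

```python
import shutil
from dataclasses import dataclass, replace
from datetime import date
from pathlib import Path


@dataclass(frozen=True)
class SimulationRecord:
    """Hypothetical metadata record; fields follow the list in step 3."""
    name: str
    run_date: date
    user: str
    circuit: str
    hpc_system: str
    nodes: int
    cpus: int
    status: str                # "temp" or "persistent"
    blueconfig_location: str
    result_location: str


def move_results(record: SimulationRecord, destination: Path,
                 status: str) -> SimulationRecord:
    """Copy the result folder into `destination` and return updated metadata.

    Only the returned record would be sent to the KG; the files themselves
    stay on the storage system.
    """
    source = Path(record.result_location)
    target = destination / source.name
    # copytree creates missing parent directories of the target as needed.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return replace(record, result_location=str(target), status=status)
```

The same function would cover both step 4 (copy to the temporary location, status "temp") and step 6 (copy to the persistent location, status "persistent"); only the destination and status differ.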

Tasks

Acceptance Criteria

Define here the acceptance tests to evaluate the use case’s compliance with the requirements as defined above. Also possible end users for testing can be included here.

Extra Requirements

System

Performance

1-24 hours to complete the task from start to finish (including registering, file copy ...)

antonelepfl commented 5 years ago

After the meeting I thought about this, and I think we should create a back-end for the SimUI. This back-end will communicate with the UI to start the simulation, then with Unicore to actually run it. Once Unicore has finished, the back-end will be in charge of registering the results in the KG if the run was successful (for the time being). Before submitting the metadata to the KG, the back-end will take care of forming the JSON-LD that we need to pass for the registration.
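
To make the JSON-LD step concrete, here is a minimal sketch of what the back-end might do before calling the KG. The vocabulary URL, the `SimulationResult` type name, and the function `build_jsonld` are placeholders I made up; the real KG schema still has to be agreed.

```python
import json

# Placeholder vocabulary: the actual KG schema is not defined in this issue.
CONTEXT = {"@vocab": "https://example.org/simulation/"}


def build_jsonld(metadata: dict) -> str:
    """Wrap plain simulation metadata in a JSON-LD document."""
    # Reserve JSON-LD keywords such as @context and @type for the wrapper.
    reserved = [key for key in metadata if key.startswith("@")]
    if reserved:
        raise ValueError(f"metadata must not contain JSON-LD keywords: {reserved}")
    document = {"@context": CONTEXT, "@type": "SimulationResult"}
    document.update(metadata)
    return json.dumps(document, indent=2, sort_keys=True)


payload = build_jsonld({
    "name": "sim-001",
    "user": "alice",
    "hpcSystem": "hpc-a",
    "status": "temp",
})
```

The resulting string is what would be sent in the request body; how it is sent (endpoint, authentication) is a separate question.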

I would like your advice so that we do not forget any piece:

To communicate with Unicore:

To communicate with KG:

alex4200 commented 5 years ago

@olinux FYI

olinux commented 5 years ago

To communicate with KG: As discussed, I would suggest a technical user for the backend, which executes the operations on behalf of the user. You can then also ensure that the token is refreshed properly.
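
As a rough idea of how the backend could keep the technical user's token fresh, here is a small sketch. The class `TokenCache` and the `fetch` callable are hypothetical; `fetch` stands in for whatever call actually obtains a token for the technical user and is assumed to return the token together with its lifetime in seconds.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Return a cached token and fetch a new one shortly before it expires."""

    def __init__(self, fetch: Callable[[], Tuple[str, float]],
                 margin: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical: returns (token, lifetime_seconds)
        self._margin = margin    # refresh this many seconds before expiry
        self._clock = clock      # injectable so the logic can be tested
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

The backend would call `get()` before each KG request, so a long-running simulation never ends up registering its results with an expired token.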

alex4200 commented 5 years ago

@olinux Do you have any news on this item, especially on this technical user we could use?

alex4200 commented 4 years ago

See task https://gitlab.humanbrainproject.org/joint_infrastructure_coordination/Coordination/issues/123
See task https://gitlab.humanbrainproject.org/joint_infrastructure_coordination/Coordination/issues/49

antonelepfl commented 4 years ago

From my side, I don't think I'll have time to work on this in the coming weeks. We should move it to SGA3.