radical-collaboration / hpc-workflows

NSF16514 EarthCube Project - Award Number:1639694

Logging feature request #153

Open lsawade opened 2 years ago

lsawade commented 2 years ago

nnodes link

One of the great features of nnodes is its ability to log tasks directly while jobs are running. We didn't know we needed this until we had it. Simply put, it's a command-line tool that lets you keep track of all jobs that are running, done, failed, or yet to be submitted. The example output for two moment tensor inversions is as follows:

- C090497A
  0) iteration
    0) mpi-create-dir-C090497A (04:50)
    1) forward_frechet (15:47)
    2) processing-all
      - C090497A_process_data (running - 01:29)
      - process_synthetics
        - C090497A_process_synt (running - 01:29)
        - C090497A_process_dsdm00000 (running - 01:29)
        - C090497A_process_dsdm00001 (running - 01:29)
        - C090497A_process_dsdm00002 (running - 01:29)
        - C090497A_process_dsdm00003 (running - 01:29)
        - C090497A_process_dsdm00004 (running - 01:29)
        - C090497A_process_dsdm00005 (running - 01:29)
        - C090497A_process_dsdm00006 (running - 01:29)
        - C090497A_process_dsdm00007 (running - 01:29)
        - C090497A_process_dsdm00008 (running - 01:29)
        - C090497A_process_dsdm00009 (running - 01:29)
    3) mpiexec_window
    4) compute_weights
    5) compute_cgh
    6) compute_descent
    7) compute_optvals
    8) linesearch
    9) iteration_check
- B092894B
  0) iteration
    0) mpi-create-dir-B092894B (04:51)
    1) forward_frechet
      - forward (14:47)
      - frechet
        - mpiexec_xspecfem3D (running - 15:44)
        - mpiexec_xspecfem3D (14:49)
        - mpiexec_xspecfem3D (14:48)
        - mpiexec_xspecfem3D (14:49)
        - mpiexec_xspecfem3D (running - 14:52)
        - mpiexec_xspecfem3D (running - 14:51)
        - mpiexec_xspecfem3D (14:49)
        - mpiexec_xspecfem3D (14:47)
        - mpiexec_xspecfem3D (running - 15:44)
    2) processing-all
    3) mpiexec_window
    4) compute_weights
    5) compute_cgh
    6) compute_descent
    7) compute_optvals
    8) linesearch
    9) iteration_check

where

In the backend, nnodes uses a dict to keep track of submission times, execution times, etc., with start and endtime attributes for each task. The log is produced by simply reading these attributes from the dictionary and printing them.
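
To make the idea concrete, here is a minimal sketch of how such a log could be rendered from a nested dict. This is not the actual nnodes implementation: the dict layout, the key names (`name`, `children`, `concurrent`, `starttime`, `endtime`) and the helper functions are assumptions made for illustration only, and times are plain seconds.

```python
def format_elapsed(seconds):
    """Format a duration in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def render(task, now, depth=0, index=None):
    """Return the log lines for one task and, recursively, its children.

    `task` is a hypothetical dict; `now` is the current time in seconds,
    used to compute the elapsed time of tasks that are still running.
    """
    # Sequential children are numbered, concurrent ones get a dash.
    label = f"{index})" if index is not None else "-"
    start, end = task.get("starttime"), task.get("endtime")
    if start is None:
        status = ""  # not started yet (or a pure container task)
    elif end is None:
        status = f" (running - {format_elapsed(now - start)})"
    else:
        status = f" ({format_elapsed(end - start)})"
    lines = [f"{'  ' * depth}{label} {task['name']}{status}"]
    for i, child in enumerate(task.get("children", [])):
        child_index = None if task.get("concurrent") else i
        lines.extend(render(child, now, depth + 1, child_index))
    return lines


# Hypothetical workflow state, loosely modelled on the output above.
workflow = {
    "name": "C090497A",
    "children": [
        {"name": "mpi-create-dir-C090497A", "starttime": 0, "endtime": 290},
        {"name": "forward_frechet", "starttime": 290, "endtime": 1237},
        {"name": "processing-all", "concurrent": True, "children": [
            {"name": "C090497A_process_data", "starttime": 1237},
        ]},
        {"name": "mpiexec_window"},
    ],
}

print("\n".join(render(workflow, now=1326)))
```

A task without a `starttime` is printed without a timestamp, a task with only a `starttime` is shown as running, and a task with both is shown with its total duration, which mirrors the three cases visible in the example output.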

What I imagine is something quite similar that could be called like:

radical-log-workflow <session.id>

and output a log with

and similar timestamps.

andre-merzky commented 2 years ago

Yes, that is indeed neat - accepted as feature request.