Atoptool / atop

System and process monitor for Linux
GNU General Public License v2.0
Storing regex-matching environment variables in the existing cmdline variable #221

Closed jbd closed 1 year ago

jbd commented 1 year ago

The goal is to store selected environment variables per process.

The -z regex option captures the environment variables that match the regex and shows them by prepending them to the displayed command line. They are stored directly in the existing curtask->gen.cmdline variable, which is convenient because the existing filtering mechanism then applies to them as well. The drawback is that they consume space that is no longer available for the real command line (related to https://github.com/Atoptool/atop/issues/101).
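The mechanism described above can be sketched in isolation as follows. This is a minimal illustration, not the code from this patch: the function name prepend_env, its signature, and the choice to match the regex against the whole "NAME=value" entry are assumptions made for the example. It takes a buffer in the /proc/PID/environ layout (NUL-separated entries, each NUL-terminated), uses POSIX regcomp/regexec, and writes the matching entries followed by the command line into one fixed-size buffer, so the trade-off with command line space is visible.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not atop's actual code): copy every entry of an
 * environ-style buffer that matches `pattern` into dst, separated by
 * spaces, then append the real command line. The result is truncated to
 * dstlen - 1 characters, so matched variables take space away from the
 * command line. Returns the number of matching entries, or -1 if the
 * regex does not compile.
 */
int prepend_env(char *dst, size_t dstlen,
                const char *env, size_t envlen,
                const char *pattern, const char *cmdline)
{
    regex_t re;
    size_t used = 0;
    int matched = 0;

    assert(dst != NULL && dstlen > 0);
    dst[0] = '\0';

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    for (size_t off = 0; off < envlen; ) {
        const char *entry = env + off;
        const char *end = memchr(entry, '\0', envlen - off);

        if (end == NULL)            /* unterminated tail: ignore it */
            break;

        /* Assumption: the regex is applied to the whole "NAME=value" text. */
        if (end != entry && regexec(&re, entry, 0, NULL, 0) == 0) {
            int n = snprintf(dst + used, dstlen - used, "%s ", entry);
            if (n > 0)
                used += (size_t)n;
            if (used >= dstlen)     /* output was truncated */
                used = dstlen - 1;
            matched++;
        }
        off += (size_t)(end - entry) + 1;
    }

    /* Whatever space is left goes to the real command line. */
    snprintf(dst + used, dstlen - used, "%s", cmdline);
    regfree(&re);
    return matched;
}
```

With a large enough buffer, an environment containing SLURM_JOBID=51443556 and SLURM_STEP_NUM_TASKS=1 and the command line /bin/bash gives "SLURM_JOBID=51443556 SLURM_STEP_NUM_TASKS=1 /bin/bash". With a 16-byte buffer only "SLURM_JOBID=514" survives and the command line is lost entirely, which is the space concern raised above.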

For example, when using the SLURM HPC job scheduler, each job inherits SLURM_* variables such as the job id. It is very useful to have the job id directly accessible in atop, avoiding a manual and tedious search through the scheduler's job history. With this patch, after submitting an interactive SLURM job shell and running the following atop command line:

atop -z 'SLURM_JOBID|SLURM_STEP_NUM_TASKS'

The full command line display will be, after a 'SLURM' filter:

    PID     TID S   CPU COMMAND-LINE (horizontal scroll with <- and -> keys)                                                                            1/1
3686269       - R    0% SLURM_JOBID=51443556 SLURM_STEP_NUM_TASKS=1 ./atop -z SLURM_JOBID|SLURM_STEP_NUM_TASKS
3674287       - S    0% SLURM_JOBID=51443556 SLURM_STEP_NUM_TASKS=1 /bin/bash

sagb commented 1 year ago

Alternatively, with #230, you can parse /proc/PID/environ with an external script and transform it into a more readable "command line", say, "id:51443556 nt:1".
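The transformation suggested here could look roughly like the sketch below. It says nothing about how #230 invokes an external program. It only shows the mapping from environ entries to a compact label. The struct env_label table, the function name env_to_label, and the short labels "id" and "nt" are hypothetical. Reading /proc/PID/environ into the buffer is left to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mapping from a variable name to a short display label. */
struct env_label {
    const char *name;
    const char *label;
};

/*
 * Build a compact label such as "id:51443556 nt:1" from an environ-style
 * buffer (NUL-separated, NUL-terminated "NAME=value" entries). Entries
 * are emitted in the order they appear in the environment. Returns the
 * number of variables found.
 */
int env_to_label(char *dst, size_t dstlen,
                 const char *env, size_t envlen,
                 const struct env_label *map, size_t nmap)
{
    size_t used = 0;
    int found = 0;

    assert(dst != NULL && dstlen > 0);
    dst[0] = '\0';

    for (size_t off = 0; off < envlen; ) {
        const char *entry = env + off;
        const char *end = memchr(entry, '\0', envlen - off);

        if (end == NULL)            /* unterminated tail: ignore it */
            break;

        const char *eq = memchr(entry, '=', (size_t)(end - entry));

        for (size_t i = 0; eq != NULL && i < nmap; i++) {
            size_t namelen = (size_t)(eq - entry);

            /* Exact name comparison, unlike the regex used by -z. */
            if (strlen(map[i].name) == namelen &&
                memcmp(entry, map[i].name, namelen) == 0) {
                int n = snprintf(dst + used, dstlen - used, "%s%s:%s",
                                 found ? " " : "", map[i].label, eq + 1);
                if (n > 0)
                    used += (size_t)n;
                if (used >= dstlen) /* output was truncated */
                    used = dstlen - 1;
                found++;
                break;
            }
        }
        off += (size_t)(end - entry) + 1;
    }
    return found;
}
```

The shorter labels leave more room for the real command line, at the cost of a fixed name table instead of a free-form regex.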

jbd commented 1 year ago

This might do the trick indeed, but it involves spawning a process for the external script. I think my approach is better for my specific use case.

I do hope this PR will be discussed; it is quite useful for us!

kpetrov commented 1 year ago

A very cool idea. It would give us the ability to quickly navigate among the dozens of jobs running on the server. It was not needed before with just several cores, but with 128 it seems imperative.