LLNL / serac

Serac is a high order nonlinear thermomechanical simulation code
BSD 3-Clause "New" or "Revised" License

Python subprocess logging improvement + add `make` job option to build scripts #1110

Closed chapman39 closed 3 months ago

chapman39 commented 3 months ago

This PR improves the `sexe` Python function so that subprocess output can be written to stdout and to a file simultaneously, line by line. This is useful when CI jobs crash before creating any log, because whatever output was already produced still shows up.
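As a rough illustration of the line-by-line idea (not the PR's actual code), the sketch below reads a child process's output one line at a time and forwards each line to stdout and to a log file, flushing after every write. The name `run_and_tee` is hypothetical; the `output_file` and `print_output` parameter names are borrowed from the examples later in this thread, and a POSIX shell is assumed.

```python
import subprocess
import sys


def run_and_tee(cmd, output_file=None, print_output=False):
    """Run cmd through the shell and forward each output line as it arrives.

    Hypothetical helper for illustration; not the PR's implementation.
    Returns the command's return code.
    """
    log = open(output_file, "w") if output_file else None
    try:
        proc = subprocess.Popen(
            cmd,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so errors reach the log too
            universal_newlines=True,   # text mode
            bufsize=1,                 # line-buffered on our side of the pipe
        )
        for line in proc.stdout:
            if print_output:
                sys.stdout.write(line)
                sys.stdout.flush()
            if log is not None:
                log.write(line)
                log.flush()  # flush per line so a crash still leaves a partial log
        proc.stdout.close()
        return proc.wait()
    finally:
        if log is not None:
            log.close()
```

Flushing after each line costs a little throughput, but it means the log on disk is never more than one line behind the process.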

Also adds a `make` job-count option to the build scripts. This is useful because in CI, Docker containers should run `make` with a smaller job count (like 4), whereas on LLNL systems the job count can be much greater (like 16).
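One way such an option could be wired up is sketched below. The flag names `-j`/`--jobs`, the default of 4, and the helper names are all assumptions made for illustration; the actual option in the build scripts may be spelled differently.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical job-count option for a build script."""
    parser = argparse.ArgumentParser(description="Hypothetical build-script options")
    parser.add_argument(
        "-j", "--jobs",
        type=int,
        default=4,  # assumed default: small enough for a CI container
        help="number of parallel make jobs",
    )
    return parser.parse_args(argv)


def make_command(jobs):
    """Build the make invocation for the requested job count."""
    if jobs < 1:
        raise ValueError("job count must be at least 1")
    return "make -j {0}".format(jobs)


if __name__ == "__main__":
    args = parse_args()
    print(make_command(args.jobs))
```

With this shape, a CI pipeline can rely on the small default while a job on a larger machine passes `--jobs 16` explicitly.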

Fixes #1107

codecov-commenter commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 90.67%. Comparing base (1de03f4) to head (6d017f7).

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##           develop    #1110   +/-   ##
========================================
  Coverage    90.67%   90.67%
========================================
  Files          163      163
  Lines        14536    14536
========================================
  Hits         13181    13181
  Misses        1355     1355
```


chapman39 commented 3 months ago

With this new change, there is only one subprocess call, which makes the function a bit cleaner. The only noticeable difference from before is that an error now always prints an error message. Before, it would only print an error message if either `output_file` was `None` or `return_output` was `False`.
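The "always report a failure" behavior can be sketched as follows. This is a simplified stand-in rather than the PR's function: `run_and_report` is a hypothetical name, it captures output instead of streaming it, and the message format is copied from the sample output in this comment. A POSIX shell is assumed.

```python
import subprocess


def run_and_report(cmd, error_prefix="ERROR:"):
    """Run cmd with a single subprocess call; always report a nonzero return code.

    Hypothetical helper for illustration. Returns (return code, captured output).
    """
    proc = subprocess.run(
        cmd,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        # Printed unconditionally on failure, regardless of other options.
        print("[{0} [return code: {1}] from command: {2}]".format(
            error_prefix, proc.returncode, cmd))
    return proc.returncode, proc.stdout
```

Because the failure check no longer depends on which output options were passed, a caller that logs to a file and also asks for the output back still sees the error line.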

A set of examples testing the script:

shell_exec("echo hello file", output_file="/usr/workspace/meemee/serac/log0.txt")
shell_exec("echo hello screen", print_output=True)
shell_exec("echo hello to both", output_file="/usr/workspace/meemee/serac/log3.txt", print_output=True)
shell_exec("cho oops!", output_file="/usr/workspace/meemee/serac/log2.txt", print_output=True)
shell_exec("echo cant see me!")
shell_exec("ech mistake with no output!", error_prefix="WARNING:")

rcode, output = shell_exec("echo returning output", return_output=True)
print("printing the output in test.py: {0}".format(output))

[meemee@ruby968:repo]$ ./scripts/llnl/test.py
hello screen
hello to both
/bin/sh: cho: command not found
[ERROR: [return code: 127] from command: cho oops!]
[WARNING: [return code: 127] from command: ech mistake with no output!]
printing the output in test.py: returning output

[meemee@ruby968:repo]$ cat ../log?.txt
hello file
/bin/sh: cho: command not found
hello to both

chapman39 commented 3 months ago

Azure using 4 jobs: https://dev.azure.com/llnl-serac/serac/_build/results?buildId=11313&view=logs&j=d86163eb-53f2-5090-1bd9-76480b1ac38b&t=67a51bbe-48ca-58cf-58a4-5e9dff7df8a8&l=304

LC using 16 jobs (CTRL+F `make -j 16`): https://lc.llnl.gov/gitlab/smith/serac/-/jobs/1836806/raw, which I think saves ~2 minutes per job.

chapman39 commented 3 months ago

I think there are some issues with checking for "not empty string" vs. "empty string" in Python. I'll push a fix soon.

edit: should be fixed now
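The exact bug is not shown in this thread, so the following is only a general illustration of the pitfall: in Python, `None` and `""` are both falsy, so a truthiness check cannot tell "argument not provided" apart from "argument provided but empty". The function names below are hypothetical.

```python
def is_set_truthy(output_file=None):
    """Truthiness check: treats both None and "" as 'not set'."""
    return bool(output_file)


def is_set_explicit(output_file=None):
    """Identity check: only None counts as 'not set'; "" is still a value."""
    return output_file is not None


for value in (None, "", "log.txt"):
    print(repr(value), is_set_truthy(value), is_set_explicit(value))
```

Which check is right depends on whether an empty string should be treated as a real value; mixing the two styles within one function is an easy way to end up with inconsistent branches.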