mej/nhc
LBNL Node Health Check
Other
213 stars
78 forks
issues
scripts/lbnl_hw.nhc: nhc_hw_gather_data is slow at parsing /proc/cpuinfo
#149
sdiak
opened
3 months ago
3
Helper scripts are not called when the node fails the health check with Slurm
#147
szhengac
opened
8 months ago
13
Run tests in github action
#145
wpoely86
opened
8 months ago
1
lbnl_cmd.nhc: Remove line numbers from dmesg check
#144
mej
closed
9 months ago
0
check_cmd_dmesg() Reason Strings Cause Problems
#143
mej
closed
9 months ago
0
nhc: Print check numbers in verbose mode (#124)
#142
mej
closed
9 months ago
0
Question for faster execution: Seeing cpu_info add 10 secs to execution
#141
jebbaxley
opened
10 months ago
4
nhc: Add concurrency checking via PID file
#140
mej
closed
9 months ago
0
Question: Custom Check, How to exit without any changes, i.e. leave node in current state?
#139
flakrat
opened
10 months ago
3
Using something like an 'include' directive? or better with templates
#138
jbaksta
opened
1 year ago
2
Init checkin of UABRC config files
#137
flakrat
closed
1 year ago
1
Support for OpenPBS
#136
xpillons
opened
1 year ago
5
External Match doesn't work for me
#135
Aelmazaty
closed
1 year ago
3
Manually drained node got resumed after watchdog timer was unable to terminate hung NHC process
#134
indermuehle-unibe
opened
1 year ago
3
lbnl-nhc.spec.in: Allow for normal user access
#133
mej
closed
1 year ago
1
nhc: Refactor watchdog timer code for reuse
#132
mej
closed
1 year ago
1
nhc: Use subshell syntax for offline/online ops
#131
mej
closed
1 year ago
0
Pattern error message
#130
brunoagneray
opened
1 year ago
2
Improve Hostname Customization
#129
mej
opened
1 year ago
0
Version Flag
#128
mej
closed
11 months ago
1
Improve Script Safety with Checksum/Signature Verification?
#127
mej
opened
1 year ago
0
NHC Helpers vs. Unknown Slurm States
#126
mej
opened
1 year ago
2
Add option to output results in JSON
#125
DebRez
opened
1 year ago
0
Display check numbers in verbose mode
#124
mej
closed
9 months ago
0
Default output to stdout/stderr when the -e option is used
#123
mej
opened
1 year ago
0
"Fix" Permissions in RPM
#122
mej
opened
1 year ago
1
lbnl_hw: Fixes/speedups for procfs file reads
#121
mej
closed
1 year ago
1
`nhc_common_parse_size` doesn't support decimal values
#120
kcgthb
opened
1 year ago
1
bug in nhc_job_find_users() leading to misjudgment of illegal process
#119
taleintervenor
opened
1 year ago
0
nhc_hw_gather_data() too slow on large core count
#118
jpecar
closed
1 year ago
5
Fix unsafe string array conversion for passwd file parsing
#117
lebonez
closed
1 year ago
3
RELEASE_NOTES.txt does not include 1.4.3 content
#116
dmagdavector
opened
2 years ago
0
Numerous improvements to Slurm reboot handling and planned node state
#115
treydock
closed
1 year ago
0
Add csc_nvidia_smi.nhc to Makefile to get into RPM
#114
treydock
closed
1 year ago
0
Cannot find check_nvsmi_healthmon() in 1.4.3
#113
OleHolmNielsen
opened
2 years ago
4
check_hw_ib gives an unhelpful message when the adapter is missing
#111
OleHolmNielsen
opened
2 years ago
1
add option to tell nhc use long or short hostname when mark node state
#110
taleintervenor
closed
11 months ago
1
New release?
#109
mick-t
closed
2 years ago
4
Error in nhc/helpers/node-mark-offline
#108
quasigeek
closed
1 year ago
2
node-mark-offline case statement syntax error
#107
basvandervlies
closed
2 years ago
0
Change default CONFDIR etc
#106
zhum
opened
2 years ago
0
Check for cpu number and models
#105
zhum
opened
2 years ago
0
When using check_cmd_status, nhc leaks watchdog processes
#104
ghost
closed
11 months ago
4
missing double semicolon in helpers/node-mark-offline in current master
#103
smoors
closed
2 years ago
2
Missing EL8 RPM as well as instructions for building NHC
#102
OleHolmNielsen
opened
3 years ago
3
New functions: check_all_fs_used, check_all_fs_inodes, check_all_fs_i…
#101
OleHolmNielsen
opened
3 years ago
3
Support percentages in free memory checks
#100
mej
opened
3 years ago
1
check_ps_service sshd fails on Ubuntu
#99
heitorPB
opened
3 years ago
3
Add a new check_hw_numa check to verify NUMA configuration
#98
kcgthb
opened
3 years ago
0
Fix the #93 issue
#97
dgtim
closed
3 years ago
2