ROCm / ROCmValidationSuite

The ROCm Validation Suite is a tool for system administrators and cluster managers to detect and troubleshoot common problems affecting AMD GPUs running in a high-performance computing environment enabled by the ROCm software stack on a compatible platform.
MIT License

gpup run need not stop at an individual error #744

Closed manoj-freyr closed 1 month ago

manoj-freyr commented 1 month ago

On some GPUs, certain properties may be absent. When that happens, the gpup run should report the missing property and move on to the next one instead of prematurely abandoning the rest of the flow.
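
The intended behaviour could look roughly like the sketch below. This is not the actual gpup code: `PropertyMap`, `QueryReport`, `read_property`, and `query_properties` are hypothetical names made up for illustration, and the property store is modelled as a plain in-memory map rather than whatever the module really reads from. The only point is the control flow: a missing property is recorded and the loop continues.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the properties exposed by one GPU.
using PropertyMap = std::map<std::string, std::string>;

// Result of one query pass: values that were found, and names that were absent.
struct QueryReport {
    std::vector<std::pair<std::string, std::string>> found;
    std::vector<std::string> missing;
};

// Hypothetical lookup helper: returns std::nullopt when the GPU
// does not expose the property, instead of signalling a fatal error.
std::optional<std::string> read_property(const PropertyMap& gpu,
                                         const std::string& name) {
    auto it = gpu.find(name);
    if (it == gpu.end()) {
        return std::nullopt;
    }
    return it->second;
}

// Query every requested property. An absent property is recorded in
// report.missing and the loop moves on to the next name.
QueryReport query_properties(const PropertyMap& gpu,
                             const std::vector<std::string>& requested) {
    QueryReport report;
    for (const auto& name : requested) {
        const auto value = read_property(gpu, name);
        if (!value) {
            report.missing.push_back(name);  // report it ...
            continue;                        // ... and keep going
        }
        report.found.emplace_back(name, *value);
    }
    return report;
}
```

With a GPU that exposes only `simd_count` and `max_waves_per_simd` (illustrative names), requesting `simd_count`, `lds_size_in_kb`, and `max_waves_per_simd` would yield two found values and one missing entry, rather than stopping at the second property.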