intelligent-machine-learning / dlrover

DLRover: An Automatic Distributed Deep Learning System

Remove error code 128 from 'hardware-error' #1237

Closed BalaBalaYi closed 1 month ago

BalaBalaYi commented 1 month ago

What changes were proposed in this pull request?

Remove exit code 128 from 'hardware-error'.

Why are the changes needed?

Exit code 128 does not necessarily indicate a GPU error, so it should not be classified as a hardware error.
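
To illustrate the intent, here is a minimal sketch of an exit-code classifier in which 128 is no longer reported as a hardware error. The constant names, the `classify_exit_code` helper, and the code values 201 and 202 are hypothetical placeholders chosen for this example; they are not taken from the DLRover source.

```python
# Hypothetical sketch: the names and code values below are illustrative
# placeholders, not DLRover's actual constants.
HARDWARE_ERROR = "hardware-error"
UNKNOWN_ERROR = "unknown-error"

# Assumed set of exit codes still treated as hardware faults after this
# change; 128 is intentionally absent.
HARDWARE_ERROR_EXIT_CODES = frozenset({201, 202})


def classify_exit_code(exit_code: int) -> str:
    """Map a process exit code to a coarse exit reason."""
    if exit_code in HARDWARE_ERROR_EXIT_CODES:
        return HARDWARE_ERROR
    return UNKNOWN_ERROR


if __name__ == "__main__":
    for code in (201, 128):
        print(code, classify_exit_code(code))
```

Under these assumptions, a process that exits with 128 falls through to the generic reason instead of being flagged as a hardware fault.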

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit tests.

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 80.48%. Comparing base (e50fe48) to head (975b1f7). Report is 6 commits behind head on master.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##           master    #1237   +/-   ##
=======================================
  Coverage   80.47%   80.48%
=======================================
  Files         217      217
  Lines       19599    19602    +3
=======================================
+ Hits        15773    15776    +3
  Misses       3826     3826
```
