intelligent-machine-learning / dlrover

DLRover: An Automatic Distributed Deep Learning System

Increase heartbeat timeout #1204

Closed BalaBalaYi closed 4 months ago

BalaBalaYi commented 4 months ago

What changes were proposed in this pull request?

Increase the heartbeat timeout from 5 minutes to 10 minutes.

Why are the changes needed?

The training worker exits within 5 minutes, so the heartbeat timeout should be longer than 5 minutes.
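
To illustrate the reasoning, here is a minimal sketch of a heartbeat timeout check. The constant and function names are hypothetical and are not taken from the DLRover code base. It also assumes that a worker may be silent for up to 5 minutes while it exits. Under that assumption, a timeout of exactly 5 minutes can flag a worker that is still shutting down normally, while a 10-minute timeout leaves some margin.

```python
import time

# Hypothetical values for illustration only; not DLRover's actual identifiers.
WORKER_EXIT_WINDOW_SECS = 5 * 60    # assumed: a training worker exits within 5 minutes
HEARTBEAT_TIMEOUT_SECS = 10 * 60    # proposed timeout, longer than the exit window


def is_heartbeat_timed_out(last_heartbeat, now=None, timeout=HEARTBEAT_TIMEOUT_SECS):
    """Return True if more than `timeout` seconds have passed since the last heartbeat."""
    if now is None:
        now = time.time()
    return now - last_heartbeat > timeout


if __name__ == "__main__":
    # A worker that last reported 6 minutes ago.
    last_seen, now = 0, 6 * 60
    print(is_heartbeat_timed_out(last_seen, now, timeout=5 * 60))  # old 5-minute timeout
    print(is_heartbeat_timed_out(last_seen, now))                  # new 10-minute timeout
```

With the old value, the 6-minute gap counts as a timeout; with the new value, it does not.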

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit tests.

codecov[bot] commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 79.98%. Comparing base (8862fa5) to head (bbe0407). Report is 3 commits behind head on master.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##           master    #1204   +/-   ##
=======================================
  Coverage   79.97%   79.98%
=======================================
  Files         215      215
  Lines       19040    19046    +6
=======================================
+ Hits        15227    15233    +6
  Misses       3813     3813
```
