intelligent-machine-learning / dlrover

DLRover: An Automatic Distributed Deep Learning System

Refactor diagnosis manager #1318

Closed. samplise closed this pull request 5 days ago.

samplise commented 1 week ago

What changes were proposed in this pull request?

Refactor diagnosis manager.

Why are the changes needed?

Add the process framework for diagnosis on the master.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit tests.

codecov[bot] commented 1 week ago

Codecov Report

Attention: Patch coverage is 92.07547% with 21 lines in your changes missing coverage. Please review.

Project coverage is 80.84%. Comparing base (3639e6e) to head (0613e20). Report is 4 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| dlrover/python/elastic_agent/torch/training.py | 36.36% | 7 Missing :warning: |
| .../python/elastic_agent/diagnosis/diagnosis_agent.py | 87.50% | 5 Missing :warning: |
| ...rover/python/master/diagnosis/diagnosis_manager.py | 94.04% | 5 Missing :warning: |
| ...lrover/python/diagnosis/common/diagnosis_action.py | 80.00% | 1 Missing :warning: |
| dlrover/python/diagnosis/common/inference_chain.py | 75.00% | 1 Missing :warning: |
| dlrover/python/elastic_agent/context.py | 94.73% | 1 Missing :warning: |
| .../python/master/diagnosis/diagnosis_data_manager.py | 96.15% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #1318      +/-   ##
==========================================
+ Coverage   80.67%   80.84%   +0.17%
==========================================
  Files         225      229       +4
  Lines       21269    21540     +271
==========================================
+ Hits        17159    17415     +256
- Misses       4110     4125      +15
```
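
The totals in the coverage diff can be cross-checked with a few lines of arithmetic. The sketch below is not Codecov's implementation: the helper name `coverage_bp` is hypothetical, and truncating (rather than rounding) to two decimals is an assumption, chosen only because it reproduces the figures reported here.

```python
# Figures copied from the Codecov report for this pull request.
BASE = {"hits": 17159, "lines": 21269}  # master
HEAD = {"hits": 17415, "lines": 21540}  # this PR
MISSING_PER_FILE = [7, 5, 5, 1, 1, 1, 1]  # "Missing" column, one entry per file


def coverage_bp(hits, lines):
    """Coverage in hundredths of a percent, truncated (assumed, see lead-in)."""
    return hits * 10000 // lines


base_bp = coverage_bp(**BASE)
head_bp = coverage_bp(**HEAD)

print(f"base  {base_bp / 100:.2f}%")
print(f"head  {head_bp / 100:.2f}%")
print(f"delta {(head_bp - base_bp) / 100:+.2f}%")

# Misses are simply lines that were not hit.
print("head misses:", HEAD["lines"] - HEAD["hits"])

# The per-file "Missing" counts should add up to the headline figure.
print("missing patch lines:", sum(MISSING_PER_FILE))
```

Working in integer hundredths of a percent keeps the comparison free of floating-point noise when the two percentages are subtracted.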
