microsoft / superbenchmark

A validation and profiling tool for AI infrastructure
https://aka.ms/superbench
MIT License

Monitor - Add support for AMD GPU. #580

Closed guoshzhao closed 9 months ago

guoshzhao commented 9 months ago

Description

Add AMD GPU support to the monitor.

Major Revision

codecov[bot] commented 9 months ago

Codecov Report

Attention: 33 lines in your changes are missing coverage. Please review.

Comparison is base (1ad1c21) 86.77% compared to head (31c1be0) 86.41%.

| Files | Patch % | Lines |
|---|---|---|
| superbench/common/utils/device_manager.py | 54.16% | 33 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #580      +/-   ##
==========================================
- Coverage   86.77%   86.41%   -0.37%
==========================================
  Files          96       96
  Lines        6475     6544      +69
==========================================
+ Hits         5619     5655      +36
- Misses        856      889      +33
```

| [Flag](https://app.codecov.io/gh/microsoft/superbenchmark/pull/580/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | Coverage Δ | |
|---|---|---|
| [cpu-python3.6-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/580/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `71.76% <44.59%> (-0.09%)` | :arrow_down: |
| [cpu-python3.7-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/580/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `71.76% <44.59%> (-0.09%)` | :arrow_down: |
| [cpu-python3.8-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/580/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `72.20% <44.59%> (-0.10%)` | :arrow_down: |
| [cuda-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/580/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `84.34% <45.94%> (-0.44%)` | :arrow_down: |
| [directx-unit-test](https://app.codecov.io/gh/microsoft/superbenchmark/pull/580/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft) | `35.59% <47.29%> (+0.36%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=microsoft#carryforward-flags-in-the-pull-request-comment) to find out more.


yukirora commented 9 months ago

Will you also change the exec function in executor.py? There is currently no way to enable the monitor on the ROCm platform:

if self.__get_platform() == Platform.CUDA:
    monitor = Monitor(
        None, int(self._sb_monitor_config.sample_duration or 10),
        int(self._sb_monitor_config.sample_interval or 1), self.__get_monitor_path(benchmark_name)
    )
    monitor.start()
else:
    logger.warning('Monitor can not support ROCM/CPU platform.')

guoshzhao commented 9 months ago

> Will you also change the exec function in executor.py? There is currently no way to enable the monitor on the ROCm platform:
>
> if self.__get_platform() == Platform.CUDA:
>     monitor = Monitor(
>         None, int(self._sb_monitor_config.sample_duration or 10),
>         int(self._sb_monitor_config.sample_interval or 1), self.__get_monitor_path(benchmark_name)
>     )
>     monitor.start()
> else:
>     logger.warning('Monitor can not support ROCM/CPU platform.')

Good catch. Thanks.
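
For readers following the thread, here is a minimal sketch of the kind of change being asked for: letting the platform check accept ROCm as well as CUDA. It is not the code that was merged. The `Platform` enum below is a stand-in for the project's own, and `MONITORED_PLATFORMS` and `monitor_settings` are hypothetical names introduced only for illustration. The `or 10` / `or 1` defaults are taken from the quoted snippet.

```python
from enum import Enum


class Platform(Enum):
    """Stand-in for the project's platform enum (assumed members)."""
    CPU = 'CPU'
    CUDA = 'CUDA'
    ROCM = 'ROCm'


# Assumption: once AMD support exists, the monitor works on both GPU platforms.
MONITORED_PLATFORMS = frozenset({Platform.CUDA, Platform.ROCM})


def monitor_settings(platform, sample_duration=None, sample_interval=None):
    """Return (duration, interval) in seconds, or None if the platform is unsupported.

    Hypothetical helper; it mirrors the defaults used in the quoted executor code.
    """
    if platform not in MONITORED_PLATFORMS:
        return None
    return int(sample_duration or 10), int(sample_interval or 1)
```

A caller would start the monitor when the helper returns a tuple and log a warning when it returns `None`, which keeps the CPU path unchanged.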