intel / torch-xpu-ops

Apache License 2.0

Kineto profiler: UT failure after enabling PTI (TestAutograd::test_profiler) #731

Open fengyuan14 opened 3 months ago

fengyuan14 commented 3 months ago

🐛 Describe the bug

2024-08-08T06:02:04.0982790Z =================================== FAILURES ===================================
2024-08-08T06:02:04.0985298Z __________________________ TestAutograd.test_profiler __________________________
2024-08-08T06:02:04.0985755Z Traceback (most recent call last):
2024-08-08T06:02:04.0986502Z   File "/home/sdp/actions-runner-0/_work/torch-xpu-ops/pytorch/third_party/torch-xpu-ops/test/xpu/../../../../test/test_autograd.py", line 4603, in test_profiler
2024-08-08T06:02:04.0987176Z     with profile(use_kineto=kineto_available()) as p:
2024-08-08T06:02:04.0987850Z   File "/home/sdp/miniforge3/envs/xpu_op_0/lib/python3.10/site-packages/torch/autograd/profiler.py", line 315, in __enter__
2024-08-08T06:02:04.0988397Z     self._prepare_trace()
2024-08-08T06:02:04.0988976Z   File "/home/sdp/miniforge3/envs/xpu_op_0/lib/python3.10/site-packages/torch/autograd/profiler.py", line 322, in _prepare_trace
2024-08-08T06:02:04.0989570Z     _prepare_profiler(self.config(), self.kineto_activities)
2024-08-08T06:02:04.0990166Z RuntimeError: Fail to enable Kineto Profiler on XPU due to error code: 200
2024-08-08T06:02:04.0990450Z 
2024-08-08T06:02:04.0990618Z To execute this test, run the following from the base repo dir:
2024-08-08T06:02:04.0992197Z     PYTORCH_TEST_WITH_SLOW=1 python test/test_autograd.py TestAutograd.test_profiler
2024-08-08T06:02:04.0992562Z 
2024-08-08T06:02:04.0992791Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2024-08-08T06:02:04.0993420Z =========================== short test summary info ============================
2024-08-08T06:02:04.0996133Z FAILED test_autograd_xpu.py::TestAutograd::test_profiler - RuntimeError: Fail to enable Kineto Profiler on XPU due to error code: 200
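
For triage outside the test suite, the failing path can be sketched as a standalone script. This is only a rough sketch: the helper name `run_profiler_repro` and its return strings are hypothetical. It assumes the failure comes from entering `profile(use_kineto=kineto_available())`, as the traceback suggests, and it uses a CPU tensor on the assumption that the error is raised during profiler setup, before any op runs.

```python
def run_profiler_repro():
    """Hypothetical helper: returns 'torch-missing', 'ok', or 'failed: <message>'."""
    try:
        import torch
        from torch.autograd import kineto_available
        from torch.autograd.profiler import profile
    except ImportError:
        # PyTorch is not installed in this environment.
        return "torch-missing"

    x = torch.randn(10, 10)
    try:
        # Same context-manager call as the failing line in the traceback;
        # the reported RuntimeError is raised while entering the context.
        with profile(use_kineto=kineto_available()) as p:
            y = x * 2 + 4
    except RuntimeError as exc:
        return f"failed: {exc}"
    return "ok"


if __name__ == "__main__":
    print(run_profiler_repro())
```

On an affected XPU build with PTI enabled, this would presumably return the same "error code: 200" message; on other builds it should return `ok`.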

Versions

Latest PyTorch

fengyuan14 commented 3 months ago

This should be assigned to Zejun, but he is not a member of the project yet.