Open crazy-JiangDongHua opened 5 months ago
I also tried to profile each worker as described in "How ray support Nsight System Profiler", but it had no effect: there is still no CUDA HW line.
# in vllm/executor/ray_gpu_executor.py:95
worker = ray.remote(
    num_cpus=0,
    num_gpus=num_gpus,
    scheduling_strategy=scheduling_strategy,
    runtime_env={"nsight": "default"},
    **ray_remote_kwargs,
)(RayWorkerVllm).remote(self.model_config.trust_remote_code)
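For anyone trying the same approach: as far as I know, Ray's `nsight` runtime_env entry can also take a dict of nsys options instead of "default", with keys that are nsys flag names without the leading dashes. Below is a minimal sketch under that assumption. The helper names `make_nsight_runtime_env` and `launch_profiled_actor` and the option values are hypothetical, chosen only for illustration. This is not the fix from the linked Ray issue.

```python
from typing import Dict, Optional


def make_nsight_runtime_env(nsys_options: Optional[Dict[str, str]] = None) -> dict:
    """Build a Ray runtime_env dict that asks Ray to wrap a worker in nsys.

    With no options, fall back to the "default" profile used in the
    snippet above. (Hypothetical helper, not part of Ray or vLLM.)
    """
    if nsys_options is None:
        return {"nsight": "default"}
    # Copy the options so later changes by the caller do not leak in.
    return {"nsight": dict(nsys_options)}


# Illustrative option values (an assumption, adjust to your setup):
# "t" selects the APIs to trace, as with `nsys profile -t ...`.
CUDA_TRACE_OPTIONS = {
    "t": "cuda,nvtx,osrt",
    "cuda-memory-usage": "true",
}


def launch_profiled_actor(actor_cls, num_gpus: int = 1):
    """Start one GPU actor with the custom nsight config (needs Ray installed)."""
    import ray  # imported lazily so the helper above works without Ray

    return ray.remote(
        num_gpus=num_gpus,
        runtime_env=make_nsight_runtime_env(CUDA_TRACE_OPTIONS),
    )(actor_cls).remote()
```

In the vLLM snippet above, this would amount to replacing `runtime_env={"nsight": "default"}` with `runtime_env=make_nsight_runtime_env(CUDA_TRACE_OPTIONS)`.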
This is a Ray problem, and it has just been solved. The detailed solution is in https://github.com/ray-project/ray/issues/42139#issuecomment-2141724352
Your current environment
🐛 Describe the bug
I am trying to profile vLLM with nsys while running Mixtral inference with tp_size = 8. When I opened the generated report file in the nsys GUI, I found that no process other than rank 0 captured any CUDA kernel calls; that is, those processes have no CUDA HW line. Here is a screenshot of the nsys GUI.
Below are the nsys command and the Python script.