microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0

Disable nvtx decorator to avoid graph break #5697

Closed tohtana closed 3 months ago

tohtana commented 3 months ago

instrument_w_nvtx causes a graph break because range_push and range_pop return a non-tensor int. This PR disables the decorator to avoid the graph break.
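For illustration, here is a minimal sketch of how a profiling decorator of this kind could be switched off so that the wrapped function runs without the range calls. This is not the actual diff: the flag ENABLE_NVTX, the name instrument_w_nvtx_sketch, and the stand-in range_push/range_pop functions are assumptions made so the example runs without a GPU or DeepSpeed installed.

```python
import functools

# Hypothetical module-level switch; the name is an assumption, not DeepSpeed's API.
ENABLE_NVTX = False

# Records profiler calls so the effect of the switch is observable.
calls = []


def range_push(name):
    # Stand-in for a profiler call such as torch.cuda.nvtx.range_push,
    # which returns a plain int rather than a tensor.
    calls.append(("push", name))
    return len(calls)


def range_pop():
    # Stand-in for the matching range_pop call; also returns a plain int.
    calls.append(("pop", None))
    return len(calls)


def instrument_w_nvtx_sketch(func):
    """Wrap func in a profiling range only when ENABLE_NVTX is set."""
    if not ENABLE_NVTX:
        # Disabled: return the function untouched, so no int-returning
        # profiler calls sit inside the traced region.
        return func

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        range_push(func.__qualname__)
        try:
            return func(*args, **kwargs)
        finally:
            range_pop()

    return wrapped


@instrument_w_nvtx_sketch
def step(x):
    return x * 2
```

With the switch off, step is the original function and no range calls are recorded. The trade-off is that the NVTX ranges no longer appear in profiler timelines.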

The graph break has a measurable performance cost. In my environment, the training iteration time for Llama-3-8B on 4 GPUs with ZeRO-1 improves from 3.02s to 2.54s (about 16% less time per iteration).