argonne-lcf / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

PyTorch profiler on all platforms #41

Open hatanp opened 3 months ago

hatanp commented 3 months ago

Validate and improve how profiling is done. The internal IPEX profiling example works in isolation, but when the same profiling is run inside Megatron-DeepSpeed the XPU outputs are missing from the results.
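One quick way to confirm whether device events made it into a trace is to count the events per category in the exported file. The snippet below is only a rough sketch. It assumes the trace was exported in Chrome-trace JSON form (a top-level `traceEvents` list whose entries carry a `cat` field), which is what `export_chrome_trace` in the PyTorch profiler writes. The helper names `count_trace_categories` and `load_trace` are made up for illustration, and the sample trace is synthetic, not real profiler output. Which category the XPU kernels should appear under is not assumed here; the idea is just to compare the category counts from the isolated example against the Megatron-DeepSpeed run.

```python
import json
from collections import Counter


def count_trace_categories(trace):
    """Count events per "cat" value in a Chrome-trace style dict.

    Assumes the object form of the format: {"traceEvents": [ {...}, ... ]}.
    Events without a "cat" field are counted under "<none>".
    """
    events = trace.get("traceEvents", [])
    return Counter(
        ev.get("cat", "<none>") for ev in events if isinstance(ev, dict)
    )


def load_trace(path):
    """Load a trace JSON file from disk (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Tiny synthetic trace, made up for illustration only.
sample_trace = {
    "traceEvents": [
        {"ph": "X", "cat": "cpu_op", "name": "aten::mm", "ts": 0, "dur": 5},
        {"ph": "X", "cat": "cpu_op", "name": "aten::add", "ts": 6, "dur": 1},
        {"ph": "X", "cat": "kernel", "name": "gemm_kernel", "ts": 2, "dur": 3},
    ]
}

counts = count_trace_categories(sample_trace)
for category, n in sorted(counts.items()):
    print(f"{category}: {n}")
```

For a real run, `count_trace_categories(load_trace("trace.json"))` would be called on both traces (the path is a placeholder). If the isolated example shows device-side categories that are absent from the Megatron-DeepSpeed trace, that narrows the problem to how the profiler is set up in the training loop rather than to the export step.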

hatanp commented 3 months ago

The Intel guide can be found at https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/features/profiler_kineto.html. The known issues and environment variables listed there are particularly relevant.