microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

[XPU] Enable empty cache on XPU device #438

Closed ys950902 closed 2 months ago

ys950902 commented 2 months ago

Support the empty-unused-memory-level flag on XPU devices, not only on CUDA devices.

ys950902 commented 2 months ago

Hi @tjruwase, could you please take a look at this PR? We want to use the empty-unused-memory-level flag to reduce memory usage on XPU devices.

tjruwase commented 2 months ago

@ys950902 and @polisettyvarma, thanks for this PR. In general, we want to replace all torch.cuda references with get_accelerator(), so these kinds of PRs are greatly appreciated.
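
For readers following the thread, here is a minimal sketch of the kind of replacement described above: routing the cache-clearing call through DeepSpeed's accelerator abstraction instead of calling torch.cuda.empty_cache() directly. This is not the PR's actual diff. The helper name maybe_empty_cache and its level/threshold parameters are hypothetical, chosen to mirror how an empty-unused-memory-level setting might gate the call. It assumes DeepSpeed is installed when no accelerator object is passed in.

```python
def maybe_empty_cache(level, threshold, accelerator=None):
    """Release unused cached device memory when `level` reaches `threshold`.

    `level` stands in for the empty-unused-memory-level setting and
    `threshold` for the point in the training step (hypothetical values).
    Returns True if the cache was emptied, False otherwise.
    """
    if level < threshold:
        return False
    if accelerator is None:
        # Assumption: DeepSpeed is installed. get_accelerator() returns the
        # active backend (CUDA, XPU, ...), so no torch.cuda call is needed.
        from deepspeed.accelerator import get_accelerator
        accelerator = get_accelerator()
    # Device-agnostic counterpart of torch.cuda.empty_cache().
    accelerator.empty_cache()
    return True
```

In a training step, such a helper could be called once after the forward/backward pass and once after the optimizer step, with a different threshold at each point. Passing the accelerator explicitly keeps the helper easy to exercise without a device.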