ROCm / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
http://pytorch.org

[ROCm] Intra-node all reduce initial implementation #1435

Closed: jataylo closed this 3 months ago

jataylo commented 3 months ago

Required for https://ontrack-internal.amd.com/browse/SWDEV-464040

pruthvistony commented 3 months ago

What is the plan to take this change to upstream?

jataylo commented 3 months ago

> What is the plan to take this change to upstream?

@pruthvistony I have a similar PR upstream: https://github.com/pytorch/pytorch/pull/128826. I want to get some real feedback from the upstream inductor developers to make sure this is the best way to handle this.