openxla / xla

A machine learning compiler for GPUs, CPUs, and ML accelerators
Apache License 2.0

[XLA:GPU] Enable memory pool access across devices #14891

Closed shawnwang18 closed 1 month ago

shawnwang18 commented 1 month ago

This PR enables cross-device memory pool access for the cuda_mallocasync allocator. Without it, intra-node NCCL operators hit memory faults.
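For context, memory allocated from a CUDA stream-ordered pool is by default accessible only from the device that owns the pool, which is presumably why peer access during intra-node NCCL operations faults. The following is a minimal sketch of granting that access with the CUDA runtime API. It is not the code from this PR: the helper names `PeersNeedingAccess` and `EnablePoolPeerAccess`, the use of each device's default pool, and the `WITH_CUDA_RUNTIME` macro are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): given can_access[p][o] == true when
// device p can directly access memory on device o, return the peers that
// should be granted access to the pool owned by `owner`.
std::vector<int> PeersNeedingAccess(
    int owner, const std::vector<std::vector<bool>>& can_access) {
  assert(owner >= 0 && static_cast<std::size_t>(owner) < can_access.size());
  std::vector<int> peers;
  for (std::size_t p = 0; p < can_access.size(); ++p) {
    if (static_cast<int>(p) != owner && can_access[p][owner]) {
      peers.push_back(static_cast<int>(p));
    }
  }
  return peers;
}

// Build with -DWITH_CUDA_RUNTIME (an illustrative macro) and link against
// cudart to compile the part that talks to the GPU.
#ifdef WITH_CUDA_RUNTIME
#include <cuda_runtime.h>

// Hypothetical helper: grant read/write access on every device's default
// memory pool to each peer that is capable of peer access.
cudaError_t EnablePoolPeerAccess(int device_count) {
  std::vector<std::vector<bool>> can_access(
      device_count, std::vector<bool>(device_count, false));
  for (int p = 0; p < device_count; ++p) {
    for (int o = 0; o < device_count; ++o) {
      if (p == o) continue;
      int ok = 0;
      // Can device p access memory that lives on device o?
      cudaError_t err = cudaDeviceCanAccessPeer(&ok, p, o);
      if (err != cudaSuccess) return err;
      can_access[p][o] = (ok != 0);
    }
  }
  for (int owner = 0; owner < device_count; ++owner) {
    cudaMemPool_t pool;
    cudaError_t err = cudaDeviceGetDefaultMemPool(&pool, owner);
    if (err != cudaSuccess) return err;
    std::vector<cudaMemAccessDesc> descs;
    for (int peer : PeersNeedingAccess(owner, can_access)) {
      cudaMemAccessDesc desc = {};
      desc.location.type = cudaMemLocationTypeDevice;
      desc.location.id = peer;
      desc.flags = cudaMemAccessFlagsProtReadWrite;
      descs.push_back(desc);
    }
    if (descs.empty()) continue;
    err = cudaMemPoolSetAccess(pool, descs.data(), descs.size());
    if (err != cudaSuccess) return err;
  }
  return cudaSuccess;
}
#endif  // WITH_CUDA_RUNTIME
```

Calling `EnablePoolPeerAccess(n)` once after device initialization, before any NCCL collective runs, would be the natural place for such a setup step. An allocator that creates its own pools instead of using the default ones would need the same `cudaMemPoolSetAccess` call on each of those pools.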