bytedance / flux

A fast communication-overlapping library for tensor parallelism on GPUs.
Apache License 2.0
223 stars 17 forks

Support IPC && SM90 version of AG-GEMM, GEMM-RS #9

Closed zheng-ningxin closed 4 months ago

zheng-ningxin commented 4 months ago

Supports both IPC and NVSHMEM, letting users choose whether to enable NVSHMEM, and adds SM90 versions of the two ops (AG-GEMM and GEMM-RS). Also updates the README accordingly and adds some performance data.