bytedance / flux

A fast communication-overlapping library for tensor parallelism on GPUs.
Apache License 2.0

Use c10::intrusive_ptr<c10d::ProcessGroup> as the argument type when called from Python #13

Closed: houqi closed this 4 months ago

houqi commented 4 months ago

The problem

Related to https://github.com/bytedance/flux/issues/11

[image not preserved]

Why

c10d::ProcessGroup, as a c10::intrusive_ptr_target, can be copied or moved, but the resulting object has a refcount of 0. This causes c10d::ProcessGroup's release_resources to be triggered after the first call.

Fix

Use c10::intrusive_ptr<c10d::ProcessGroup> as the argument type instead.
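To illustrate the refcount behaviour described under "Why", here is a minimal, self-contained sketch. It does not use the real c10 types: `RefCounted`, `Handle`, `Group`, and the two `count_seen_by_*` functions are hypothetical stand-ins written only for this example, assuming an intrusive count that is reset to 0 on copy. It shows that a by-value parameter sees a count of 0, while a handle parameter shares the original object's count.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an intrusively ref-counted base class.
// This is NOT c10::intrusive_ptr_target, only a simplified model of it.
class RefCounted {
 public:
  RefCounted() = default;
  // A copy is a brand-new object: it does not inherit any owners,
  // so refcount_ keeps its default value of 0.
  RefCounted(const RefCounted&) noexcept {}
  RefCounted& operator=(const RefCounted&) noexcept { return *this; }
  virtual ~RefCounted() = default;

  void retain() { ++refcount_; }
  bool release() { return --refcount_ == 0; }  // true when the last owner left
  std::size_t use_count() const { return refcount_; }

 private:
  std::size_t refcount_ = 0;
};

// Hypothetical stand-in for an intrusive smart pointer.
template <class T>
class Handle {
 public:
  explicit Handle(T* p) : ptr_(p) { if (ptr_) ptr_->retain(); }
  Handle(const Handle& other) : ptr_(other.ptr_) { if (ptr_) ptr_->retain(); }
  Handle& operator=(const Handle&) = delete;
  ~Handle() { if (ptr_ && ptr_->release()) delete ptr_; }

  T* get() const { return ptr_; }
  T* operator->() const { return ptr_; }
  std::size_t use_count() const { return ptr_ ? ptr_->use_count() : 0; }

 private:
  T* ptr_;
};

// Hypothetical stand-in for a process group object.
struct Group : RefCounted {
  explicit Group(int r) : rank(r) {}
  int rank;
};

// Problematic signature: the parameter is a fresh copy with refcount 0.
std::size_t count_seen_by_value(Group pg) { return pg.use_count(); }

// Fixed signature: the handle shares ownership of the original object.
std::size_t count_seen_by_handle(Handle<Group> pg) { return pg.use_count(); }
```

With an owner created as `Handle<Group> owner(new Group(3));`, calling `count_seen_by_value(*owner.get())` observes a count of 0, whereas `count_seen_by_handle(owner)` observes 2 during the call and leaves the owner's count at 1 afterwards. Any code that treats the zero-count copy as owned has no way to know about the original owners, which matches the failure mode described above.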

houqi commented 4 months ago

close #11