ray-project / rayfed

A multiple parties joint, distributed execution engine based on Ray, to help build your own federated learning frameworks in minutes.
https://rayfed.readthedocs.io
Apache License 2.0
92 stars, 22 forks

Fed cleanup logic is not thread safe #130

Closed: fengsp closed this issue 1 year ago

fengsp commented 1 year ago

The fed cleanup module uses multiple threads and is not thread safe. For example, the following interleaving is possible:

  1. Main thread: checks that the global _check_send_thread is not None
  2. _check_send_thread: resets the global _check_send_thread variable to None
  3. Main thread: calls _check_send_thread.join(), which raises AttributeError because None has no join method

We should audit all threading usage in RayFed and fix races like this one.

jovany-wang commented 1 year ago

@zhouaihui @NKcqx CC as well.