FedML-AI / FedML

FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
Apache License 2.0

[CoreEngine] replace the queue with the managed queue to avoid the mu… #2178

Closed fedml-alex closed 2 weeks ago

fedml-alex commented 2 weeks ago

Replace the queue with the managed queue to avoid the multiprocessing lock problem.
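For readers unfamiliar with the distinction, here is a minimal sketch of what a "managed queue" means in Python's standard `multiprocessing` module. It is not the code from this PR: the diff is not shown here, and the names `worker` and `run_with_managed_queue` and the `start_method` parameter are hypothetical. A plain `multiprocessing.Queue` carries its own locks and a feeder thread in each process, and the standard library documentation warns that joining a process that still has items buffered in such a queue can deadlock. A queue created through `multiprocessing.Manager()` lives in a separate server process and is accessed through a proxy, so that warning does not apply to it.

```python
import multiprocessing as mp


def worker(queue, n_items):
    # Hypothetical producer: pushes n_items integers through the queue proxy.
    for i in range(n_items):
        queue.put(i)


def run_with_managed_queue(n_items=3, start_method=None):
    # start_method=None uses the platform default ("fork", "spawn", ...).
    ctx = mp.get_context(start_method)
    with ctx.Manager() as manager:
        # The queue object lives in the manager's server process;
        # `queue` here is only a proxy that forwards calls to it.
        queue = manager.Queue()
        proc = ctx.Process(target=worker, args=(queue, n_items))
        proc.start()
        # Joining before draining is safe with a managed queue, because the
        # child holds no locally buffered items that still need flushing.
        proc.join()
        return [queue.get() for _ in range(n_items)]
```

On platforms that use the `spawn` start method, the call to `run_with_managed_queue()` should sit under an `if __name__ == "__main__":` guard. The trade-off is that every `put` and `get` on a managed queue is a round trip to the manager process, so it is slower than a plain `multiprocessing.Queue`.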