FedML-AI / FedML

FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
https://TensorOpera.ai
Apache License 2.0
4.17k stars, 786 forks

Enhance Train Package Build #1965

Closed: alaydshah closed this 7 months ago

alaydshah commented 7 months ago

Fixes:

  1. Resolved a naming conflict that occurred when user code already contained a file named bootstrap.sh, by renaming the FedML-generated bootstrap file from bootstrap.sh to fedml_bootstrap_generated.sh.
  2. Updated the file cleanup logic to make it a bit more modular.
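
The two fixes above can be illustrated with a minimal sketch. It is not the code from this PR: only the file name fedml_bootstrap_generated.sh comes from the description, while `write_generated_bootstrap`, `cleanup_generated_files`, and `demo` are hypothetical helpers invented for illustration. The idea is that the build writes its script under a name the user is unlikely to use, and cleanup removes only the files the build itself created.

```python
import tempfile
from pathlib import Path
from typing import List

# File name taken from the PR description; every function below is hypothetical.
GENERATED_BOOTSTRAP_NAME = "fedml_bootstrap_generated.sh"


def write_generated_bootstrap(build_dir: Path, commands: List[str]) -> Path:
    """Write the generated script under a distinct name, leaving any user bootstrap.sh alone."""
    target = build_dir / GENERATED_BOOTSTRAP_NAME
    target.write_text("#!/bin/bash\n" + "\n".join(commands) + "\n")
    return target


def cleanup_generated_files(paths: List[Path]) -> List[Path]:
    """Delete only the listed build artifacts and return the ones actually removed."""
    removed = []
    for path in paths:
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed


def demo() -> dict:
    """Build and clean up in a temp directory that also holds a user-owned bootstrap.sh."""
    with tempfile.TemporaryDirectory() as tmp:
        build_dir = Path(tmp)
        user_script = build_dir / "bootstrap.sh"
        user_script.write_text("echo user setup\n")

        generated = write_generated_bootstrap(
            build_dir, ["pip install -r requirements.txt"]
        )
        removed = cleanup_generated_files([generated])

        return {
            "user_script_intact": user_script.read_text() == "echo user setup\n",
            "generated_removed": not generated.exists(),
            "removed_names": [p.name for p in removed],
        }


if __name__ == "__main__":
    print(demo())
```

Keeping cleanup in one function that takes an explicit list of paths means a new generated artifact only needs to be added to that list, which is one plausible reading of "more modular".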