GoogleCloudPlatform / ai-infra-cluster-provisioning

Apache License 2.0

Update README.md for all customers to cover all-to-all #362

Closed: samcmho closed this 7 months ago

samcmho commented 7 months ago

Add a note about setting ulimit -n 1048576 when the orchestrator relies on SSH to launch processes that run communication patterns with send-recvs between many GPU pairs.
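For launchers that start workers from Python, the same limit can be raised programmatically. The sketch below is only an illustration, not something taken from the README: the helper name `raise_nofile_limit` is hypothetical, and it assumes a Unix host where the standard `resource` module is available. It raises the soft open-file limit toward 1048576, capped by the hard limit, which an unprivileged process cannot exceed.

```python
import resource

# Value taken from the note above (equivalent to `ulimit -n 1048576`).
TARGET_NOFILE = 1048576


def raise_nofile_limit(target: int = TARGET_NOFILE) -> int:
    """Hypothetical helper: raise the soft RLIMIT_NOFILE toward `target`.

    Returns the soft limit in effect after the attempt.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # Cap the request at the hard limit unless the hard limit is unlimited.
    if hard == resource.RLIM_INFINITY:
        new_soft = target
    else:
        new_soft = min(target, hard)

    # Nothing to do if the current soft limit is already high enough.
    if soft == resource.RLIM_INFINITY or soft >= new_soft:
        return soft

    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    except (ValueError, OSError):
        # Some systems reject the request (for example, a kernel-wide cap);
        # in that case the existing limit is left unchanged.
        pass

    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft open-file limit:", raise_nofile_limit())
```

This changes the limit only for the calling process and the children it spawns. Processes started over SSH on other nodes inherit their limits from the remote login environment, so the limit likely needs to be set there as well. If the returned value is below 1048576, the hard limit on that host is probably lower and would have to be raised by an administrator.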