aws-samples / awsome-distributed-training

Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.
MIT No Attribution
203 stars 86 forks

Change aws ofi plugin version 1.13.0 #501

Open mhuguesaws opened 6 days ago

mhuguesaws commented 6 days ago

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

mhuguesaws commented 5 days ago

DO NOT MERGE. Requires EFA 1.37.0 to use the latest aws ofi nccl.
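
A minimal sketch of the version gate this comment implies, assuming the installed EFA version is already available as a dotted string such as `1.37.0`. The helper names `parse_version` and `efa_is_new_enough` are hypothetical and not part of this PR; only the 1.37.0 minimum comes from the comment above.

```python
# Hypothetical helpers: gate a build on a minimum EFA version.
# The 1.37.0 minimum is taken from the PR comment; everything else is illustrative.

def parse_version(text):
    """Turn a dotted version such as '1.37.0' into a tuple of three ints."""
    parts = [int(part) for part in text.strip().split(".")]
    # Pad short versions ('1.37' -> 1.37.0) so tuples compare element by element.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def efa_is_new_enough(installed, required="1.37.0"):
    """Return True when the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    for candidate in ("1.9.0", "1.36.0", "1.37.0", "1.38.1"):
        print(candidate, efa_is_new_enough(candidate))
```

Comparing integer tuples rather than raw strings matters here: as plain strings, `"1.9.0"` sorts after `"1.37.0"`, which would wrongly pass the check.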

mhuguesaws commented 17 hours ago

Currently broken because the tarball name of the plugin release changed; see https://github.com/aws/aws-ofi-nccl/issues/719