cisco-open / pymultiworld

A framework for PyTorch to enable fault management for collective communication libraries (CCL) such as NCCL
Apache License 2.0

doc: readme revision #57

Closed myungjin closed 3 months ago

myungjin commented 3 months ago

Description

This PR revises the top-level README.md to reflect the changes made in v0.1.0. It also updates the description of the examples to highlight that the multiworld examples are fault-tolerant.

Type of Change

Checklist