cisco-open / pymultiworld

A framework for PyTorch to enable fault management for collective communication libraries (CCL) such as NCCL
Apache License 2.0

doc: readme revision #57

Closed myungjin closed 2 months ago

myungjin commented 2 months ago

Description

This PR revises the top-level README.md to reflect the changes made in v0.1.0. It also updates the description of the examples to highlight that the multiworld examples are fault-tolerant.
