cisco-open / pymultiworld

A framework for PyTorch to enable fault management for collective communication libraries (CCL) such as NCCL
Apache License 2.0
15 stars 4 forks

doc: revise readme #8

Closed by myungjin 4 months ago

myungjin commented 4 months ago

Description

Added an explanation of the faulty situation to the README.

Type of Change

Checklist