NVIDIA / topograph

A toolkit for discovering cluster network topology.
Apache License 2.0

Require metadata on all topology #26

Closed by henryh2 5 days ago

henryh2 commented 2 weeks ago

Requires metadata to be set on every topology, both tree and block.

henryh2 commented 2 weeks ago

Two comments regarding this PR, the first of which is:

If the topology request payload specifies a topology plugin (either tree or block) that is different than the topology that can be provided, what should be the action? Return an error in the request?

henryh2 commented 2 weeks ago

And second, for MNNVL, is block topology always returned (current implementation), or should tree topology be returned in some situations?

dmitsh commented 2 weeks ago

If the topology request payload specifies a topology plugin (either tree or block) that is different than the topology that can be provided, what should be the action? Return an error in the request?

Toposim (and GTS) return a collection of nodes, each with its local topology (a chain of switches and/or the presence of NVLink). From this we can construct either a tree or a block topology. So, IIUC, it is up to the requester to choose. IMO we should treat tree topology as the default.
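
To make the point above concrete, here is a minimal Go sketch of deriving either view from the same per-node data, with tree as the default when the requester does not choose. It is not topograph's actual code: the `Node` struct, the `Format` type, and the `Build`, `BuildTree`, and `BuildBlocks` functions are all hypothetical names introduced for illustration, and returning an error for an unknown format is just one possible answer to the first question.

```go
package main

import (
	"fmt"
	"sort"
)

// Node is a hypothetical record of one node's local topology.
type Node struct {
	Name     string
	Switches []string // chain of switches, leaf first, root last; may be empty
	NVLink   string   // NVLink domain ID; empty if the node has none
}

// Format is the (assumed) topology kind named in the request payload.
type Format string

const (
	FormatTree  Format = "tree"
	FormatBlock Format = "block"
)

// BuildTree maps each switch to its sorted children (switches or node names).
func BuildTree(nodes []Node) map[string][]string {
	sets := map[string]map[string]bool{}
	for _, n := range nodes {
		child := n.Name
		for _, sw := range n.Switches {
			if sets[sw] == nil {
				sets[sw] = map[string]bool{}
			}
			sets[sw][child] = true
			child = sw // walk one level up the chain
		}
	}
	return sortedSets(sets)
}

// BuildBlocks groups node names by NVLink domain; nodes without one are skipped.
func BuildBlocks(nodes []Node) map[string][]string {
	sets := map[string]map[string]bool{}
	for _, n := range nodes {
		if n.NVLink == "" {
			continue
		}
		if sets[n.NVLink] == nil {
			sets[n.NVLink] = map[string]bool{}
		}
		sets[n.NVLink][n.Name] = true
	}
	return sortedSets(sets)
}

// sortedSets converts each set into a sorted slice for deterministic output.
func sortedSets(sets map[string]map[string]bool) map[string][]string {
	out := make(map[string][]string, len(sets))
	for key, set := range sets {
		items := make([]string, 0, len(set))
		for item := range set {
			items = append(items, item)
		}
		sort.Strings(items)
		out[key] = items
	}
	return out
}

// Build dispatches on the requested format and falls back to tree when the
// request leaves it empty. Unknown formats are rejected (an assumption).
func Build(format Format, nodes []Node) (map[string][]string, error) {
	switch format {
	case "", FormatTree:
		return BuildTree(nodes), nil
	case FormatBlock:
		return BuildBlocks(nodes), nil
	default:
		return nil, fmt.Errorf("unsupported topology format %q", format)
	}
}
```

In this sketch both views are computed from the same input, so the choice of format stays with the requester and the data source never has to know which one was asked for.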