Open prannaykhtech opened 5 months ago
@prannaykhtech All of the workers have different IPs, starting from 10.114.141.217. Are the node IPs correct here? If so, then the rank assignment seems to make sense.
Could you provide some more information about the cluster configuration? What node types are available?
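To illustrate why distinct worker IPs would lead to this result, here is a minimal sketch that assumes ranks are derived by grouping workers on their reported node IP. The function name `assign_ranks` and the placeholder IPs are hypothetical, and this is not Ray Train's actual implementation.

```python
from collections import defaultdict


def assign_ranks(worker_ips):
    """Return (world_rank, node_rank, local_rank) per worker, grouped by node IP.

    Hypothetical helper: assumes workers sharing an IP live on the same node.
    """
    node_rank_of = {}               # node IP -> node_rank, in first-seen order
    next_local = defaultdict(int)   # node IP -> next unused local_rank
    ranks = []
    for world_rank, ip in enumerate(worker_ips):
        if ip not in node_rank_of:
            node_rank_of[ip] = len(node_rank_of)
        ranks.append((world_rank, node_rank_of[ip], next_local[ip]))
        next_local[ip] += 1
    return ranks


# Every worker reports a different IP: each one is treated as its own node,
# so node_rank keeps increasing and local_rank is always 0.
print(assign_ranks(["10.0.0.1", "10.0.0.2", "10.0.0.3"]))

# Two workers per IP: local_rank counts up within each node.
print(assign_ranks(["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.2"]))
```

Under that assumption, a node_rank well above the number of physical machines would point at the reported IPs rather than at the rank logic itself.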
What happened + What you expected to happen
Logs show incorrect assignment of ranks:
I would have expected node_rank to be in the range 0-3 and local_rank to be in the range 0-7, since I have 8 Ray workers per machine.
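The expectation above can be written down as a small sketch. It assumes 4 machines (implied by node_rank 0-3) and that world ranks are laid out contiguously per machine. `expected_ranks` is a hypothetical helper for illustration, not part of Ray.

```python
WORKERS_PER_NODE = 8  # from the report: 8 Ray workers per machine
NUM_NODES = 4         # assumption: implied by node_rank being in 0-3


def expected_ranks(world_rank, workers_per_node=WORKERS_PER_NODE):
    """Return (node_rank, local_rank), assuming contiguous world ranks per node."""
    return world_rank // workers_per_node, world_rank % workers_per_node


# Check that all 32 workers land in the expected ranges.
for world_rank in range(NUM_NODES * WORKERS_PER_NODE):
    node_rank, local_rank = expected_ranks(world_rank)
    assert 0 <= node_rank < NUM_NODES
    assert 0 <= local_rank < WORKERS_PER_NODE
```

Comparing the logged ranks against this mapping should show which workers deviate from the expected layout.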
Versions / Dependencies
None
Reproduction script
Tough to provide more than this without the env.
Issue Severity
High