ai-computing / aicomp

Elevate the level of abstraction from rank to stage in user code #9

Open ememos opened 3 months ago

ememos commented 3 months ago

User training code currently specifies process ranks directly. The level of abstraction should be raised so that the code specifies stages instead. For instance, rather than hard-coding the rank world_size - 1, it should be possible to refer to it as last_stage.

ememos commented 3 months ago

Stage-handling functions have been added to the Topology class, so user code can work with stages instead of process ranks.
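
As a rough illustration of the idea, the sketch below shows what stage-level helpers on a Topology-like class might look like. It is not the actual aicomp implementation: the method names (stage_of, is_first_stage, is_last_stage, ranks_in_stage), the num_stages parameter, and the assumption that ranks are assigned to stages in equal contiguous blocks are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Topology:
    """Hypothetical rank-to-stage mapping; NOT the actual aicomp Topology class."""

    world_size: int
    num_stages: int

    def __post_init__(self) -> None:
        # Assumption: every stage owns the same number of ranks.
        if self.world_size <= 0 or self.num_stages <= 0:
            raise ValueError("world_size and num_stages must be positive")
        if self.world_size % self.num_stages != 0:
            raise ValueError("world_size must be a multiple of num_stages")

    @property
    def ranks_per_stage(self) -> int:
        return self.world_size // self.num_stages

    @property
    def first_stage(self) -> int:
        return 0

    @property
    def last_stage(self) -> int:
        return self.num_stages - 1

    def stage_of(self, rank: int) -> int:
        """Return the stage that owns `rank` (contiguous block layout assumed)."""
        if not 0 <= rank < self.world_size:
            raise ValueError(f"rank {rank} is outside [0, {self.world_size})")
        return rank // self.ranks_per_stage

    def is_first_stage(self, rank: int) -> bool:
        return self.stage_of(rank) == self.first_stage

    def is_last_stage(self, rank: int) -> bool:
        return self.stage_of(rank) == self.last_stage

    def ranks_in_stage(self, stage: int) -> List[int]:
        """Return every rank assigned to `stage`."""
        if not 0 <= stage < self.num_stages:
            raise ValueError(f"stage {stage} is outside [0, {self.num_stages})")
        start = stage * self.ranks_per_stage
        return list(range(start, start + self.ranks_per_stage))


if __name__ == "__main__":
    topo = Topology(world_size=4, num_stages=4)
    rank = 3  # in real training code this would come from the process group

    # Before: if rank == world_size - 1: ...
    # After: the same check expressed in terms of stages.
    if topo.is_last_stage(rank):
        print("this process runs the last stage")
```

With helpers along these lines, user code would ask which stage a process belongs to rather than comparing against world_size - 1, and that check would keep working if more than one rank were mapped to a stage.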