FlagOpen / FlagScale

FlagScale is a large model toolkit based on open-sourced projects.

[Profiler] Add group_info output #206

Open phoenixdong opened 2 weeks ago

Description

This PR adds profiler support for outputting parallel group information when running large models, which helps track and manage how tasks are distributed at runtime.

New Functionality

Note: This PR enables output of parallel group information in both decoder and encoder modes.

Usage Instructions

To output parallel group information during model training, add the following to your training configuration file, replacing group_info_output_path with the directory where the group information should be saved:

system:
  ...
  analyze:
    analyze_save_dir: group_info_output_path
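
For readers unfamiliar with what "parallel group information" can look like, here is a minimal Python sketch. It is not FlagScale's implementation or output format. It assumes a Megatron-style layout in which tensor-parallel ranks vary fastest, and the names build_parallel_groups, save_group_info, and group_info.json are hypothetical, chosen only for illustration. The save_dir argument plays the role of analyze_save_dir above.

```python
import json
import os
import tempfile


def build_parallel_groups(world_size, tp_size, pp_size):
    """Return rank lists for tensor-, data-, and pipeline-parallel groups.

    Assumed layout (not taken from FlagScale):
        rank = pp * (dp_size * tp_size) + dp * tp_size + tp
    """
    if world_size % (tp_size * pp_size) != 0:
        raise ValueError("world_size must be divisible by tp_size * pp_size")
    dp_size = world_size // (tp_size * pp_size)

    def rank(pp, dp, tp):
        return pp * dp_size * tp_size + dp * tp_size + tp

    # Each group holds the ranks that differ only along one parallel axis.
    tp_groups = [[rank(pp, dp, tp) for tp in range(tp_size)]
                 for pp in range(pp_size) for dp in range(dp_size)]
    dp_groups = [[rank(pp, dp, tp) for dp in range(dp_size)]
                 for pp in range(pp_size) for tp in range(tp_size)]
    pp_groups = [[rank(pp, dp, tp) for pp in range(pp_size)]
                 for dp in range(dp_size) for tp in range(tp_size)]
    return {
        "tensor_parallel": tp_groups,
        "data_parallel": dp_groups,
        "pipeline_parallel": pp_groups,
    }


def save_group_info(groups, save_dir, filename="group_info.json"):
    """Write the group mapping as JSON under save_dir (hypothetical file name)."""
    os.makedirs(save_dir, exist_ok=True)
    path = os.path.join(save_dir, filename)
    with open(path, "w") as f:
        json.dump(groups, f, indent=2)
    return path


if __name__ == "__main__":
    # Example: 8 ranks, tensor parallel size 2, pipeline parallel size 2.
    groups = build_parallel_groups(world_size=8, tp_size=2, pp_size=2)
    with tempfile.TemporaryDirectory() as tmp:
        print(save_group_info(groups, os.path.join(tmp, "group_info_output_path")))
```

Writing the groups to a file per run makes it possible to check, after the fact, which ranks communicated with each other along each parallel dimension. The actual files produced by this PR may be structured differently.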