Preparing an alternative high-level IR execution framework for the training code

ai-computing / aicomp

Preparing an alternative high-level IR execution framework for the training code #1

Open ememos opened 6 months ago

ememos commented 6 months ago

It seems necessary to switch from directly executing the low-level nodes of the IR graph produced by torch.fx to executing the high-level GraphModule produced by split_module(). The reason is that, although composite models train quickly under the low-level execution framework, GPT-like models train relatively slowly in it.
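
For illustration only, here is a minimal sketch of what the high-level approach could look like. It is not the code from this repository: `TinyModel`, the two-stage split point at `fc2`, and the `run_high_level` helper are all hypothetical, and a real pipeline would place each `submod_*` on its own device or process. The sketch traces a small model with torch.fx, partitions it with `split_module()`, and then walks the resulting top-level graph, whose nodes are whole stages rather than individual ops.

```python
import torch
import torch.nn as nn
from torch.fx import symbolic_trace
from torch.fx.node import map_arg
from torch.fx.passes.split_module import split_module


class TinyModel(nn.Module):
    """Hypothetical stand-in model; not the GPT-2 model discussed in the issue."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.fc2 = nn.Linear(16, 16)
        self.fc3 = nn.Linear(16, 4)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)


torch.manual_seed(0)
model = TinyModel()
traced = symbolic_trace(model)  # low-level IR: one node per op / module call

# Assumed partitioning policy: everything from fc2 onward goes to stage 1.
stage_of = {}
stage = 0
for node in traced.graph.nodes:
    if node.op == "call_module" and node.target == "fc2":
        stage = 1
    stage_of[node.name] = stage

# High-level IR: a GraphModule whose call_module nodes are submod_0, submod_1.
split = split_module(traced, model, lambda node: stage_of[node.name])


def run_high_level(gm, x):
    """Hypothetical executor: runs one node per stage instead of one per op."""
    env = {}
    for node in gm.graph.nodes:
        if node.op == "placeholder":
            env[node.name] = x
        elif node.op == "call_module":
            args = map_arg(node.args, lambda n: env[n.name])
            env[node.name] = gm.get_submodule(node.target)(*args)
        elif node.op == "call_function":
            # e.g. getitem nodes, emitted when a stage has several outputs
            args = map_arg(node.args, lambda n: env[n.name])
            env[node.name] = node.target(*args)
        elif node.op == "output":
            return map_arg(node.args[0], lambda n: env[n.name])


x = torch.randn(2, 8)
out = run_high_level(split, x)
```

In this sketch the executor loop visits only a handful of nodes per forward pass (one per stage), whereas walking `traced.graph` directly would visit every op. Whether that accounts for the slowdown observed with GPT-like models is not established here.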

ememos commented 5 months ago

Added the following file:

fx_dist_pp_training_type-C_gpt2_gpu.py