ai-computing / aicomp

Preparing an alternative high-level IR execution framework for the training code #1

Open ememos opened 6 months ago

ememos commented 6 months ago

It seems necessary to switch the execution framework: instead of directly executing the low-level nodes of the IR graph produced by torch.fx, it should execute the high-level GraphModule produced by split_module(). The motivation is that GPT-like models train relatively slowly under the low-level execution framework, even though composite models train fast.
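As a rough illustration of the idea (a minimal sketch, not the code used in this repository), the snippet below traces a small model with torch.fx, partitions it with split_module(), and then runs the resulting top-level graph one submodule at a time instead of one low-level node at a time. `TinyModel`, `split_into_stages`, `run_high_level`, and the even node-count partitioning are all hypothetical names and choices made for this example.

```python
import torch
import torch.nn as nn
from torch.fx import symbolic_trace
from torch.fx.node import map_arg
from torch.fx.passes.split_module import split_module


class TinyModel(nn.Module):
    """Hypothetical stand-in for the real training model."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.fc2 = nn.Linear(16, 16)
        self.fc3 = nn.Linear(16, 4)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)


def split_into_stages(model, num_stages=2):
    """Trace `model` and group its low-level nodes into `num_stages` submodules.

    Assumption: nodes are assigned to stages by simple position in the graph.
    """
    traced = symbolic_trace(model)
    compute_nodes = [
        n for n in traced.graph.nodes if n.op not in ("placeholder", "output")
    ]
    per_stage = -(-len(compute_nodes) // num_stages)  # ceiling division
    stage_of = {n: i // per_stage for i, n in enumerate(compute_nodes)}
    # split_module returns a GraphModule whose children are submod_0, submod_1, ...
    return split_module(traced, model, lambda node: stage_of[node])


def run_high_level(split_gm, *inputs):
    """Execute the split graph, calling each submodule as a single unit."""
    env = {}
    inputs_iter = iter(inputs)
    for node in split_gm.graph.nodes:
        if node.op == "placeholder":
            env[node] = next(inputs_iter)
        elif node.op == "output":
            return map_arg(node.args[0], lambda n: env[n])
        else:
            # Replace Node references with the values computed so far.
            args = map_arg(node.args, lambda n: env[n])
            kwargs = map_arg(node.kwargs, lambda n: env[n])
            if node.op == "call_module":
                # One call runs every low-level node inside this stage.
                env[node] = split_gm.get_submodule(node.target)(*args, **kwargs)
            elif node.op == "call_function":
                # e.g. operator.getitem when a stage returns several values
                env[node] = node.target(*args, **kwargs)
            else:
                raise NotImplementedError(f"unhandled op in sketch: {node.op}")


model = TinyModel()
split_gm = split_into_stages(model, num_stages=2)
x = torch.randn(3, 8)
out = run_high_level(split_gm, x)
```

In this sketch the outer loop visits only a handful of nodes (one per stage), so per-node interpreter overhead no longer scales with the number of low-level operations. Whether that accounts for the slowdown seen with GPT-like models is not established here.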

ememos commented 5 months ago

Added the following file: