InternLM / InternEvo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without extensive dependencies.
https://internevo.readthedocs.io/zh-cn/latest/?badge=latest
Apache License 2.0

fix(enable_qkv_fusion): minor fix for qkv fusion #340

Closed by zigzagcai 2 months ago

zigzagcai commented 2 months ago

Motivation

  1. (Follow-up fix for https://github.com/InternLM/InternEvo/pull/338) When tp/wp is enabled, q_dim and kv_dim should be divided in the nn.Module hooks.
  2. Refine module dispatch: move model.to(device) into inject_model_helper to support larger parameter sizes.
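
To illustrate the first point, here is a minimal sketch of why the dimensions need to be divided. It assumes that tp here means tensor parallelism, that a fused QKV weight is laid out as [q | k | v] along its output rows, and that each rank holds 1/tp_size of those rows. The helper names (local_qkv_dims, split_fused_qkv) are hypothetical and are not InternEvo's API; plain Python lists stand in for weight rows.

```python
def local_qkv_dims(num_heads, num_kv_heads, head_dim, tp_size):
    """Return the per-rank (q_dim, kv_dim) when heads are sharded across tp_size ranks.

    Hypothetical helper: assumes heads are split evenly across ranks.
    """
    if num_heads % tp_size or num_kv_heads % tp_size:
        raise ValueError("head counts must be divisible by tp_size")
    q_dim = num_heads * head_dim // tp_size
    kv_dim = num_kv_heads * head_dim // tp_size
    return q_dim, kv_dim


def split_fused_qkv(rows, q_dim, kv_dim):
    """Split the rows of a fused [q | k | v] weight into its three parts.

    `rows` stands in for the local shard of the fused weight, so q_dim and
    kv_dim must be the per-rank sizes, not the global ones.
    """
    if len(rows) != q_dim + 2 * kv_dim:
        raise ValueError(
            f"expected {q_dim + 2 * kv_dim} rows, got {len(rows)}"
        )
    q = rows[:q_dim]
    k = rows[q_dim:q_dim + kv_dim]
    v = rows[q_dim + kv_dim:]
    return q, k, v


# Example: 32 query heads, 8 key/value heads, head_dim 128, 4-way sharding.
q_dim, kv_dim = local_qkv_dims(32, 8, 128, tp_size=4)
local_rows = list(range(q_dim + 2 * kv_dim))  # stand-in for the local shard
q, k, v = split_fused_qkv(local_rows, q_dim, kv_dim)
```

Under these assumptions, passing the undivided global sizes (4096 and 1024 in this example) to the split would not match the local shard, which is the kind of mismatch the fix is meant to avoid.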

Modification

Please briefly describe what modification is made in this PR.

BC-breaking (Optional)

None

Use cases (Optional)

None

Checklist

Before PR:

After PR: