NVIDIA / Megatron-LM

Ongoing research training transformer models at scale
https://docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start

[QUESTION] Has standalone_embedding_stage been supported yet in core? #890

Open JiwenJ opened 3 days ago

JiwenJ commented 3 days ago

I ran into an issue and want to split the embedding layer out of the transformer block so that it sits alone in its own pipeline-parallel (pp) stage, but as far as I can tell this is not supported in core yet. Am I right? https://github.com/NVIDIA/Megatron-LM/blob/e33c8f78a35765d5aa37475a144da60e8a2349d1/megatron/core/transformer/transformer_block.py#L164
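
To make the request concrete, here is a minimal sketch of the layer split being asked about. It is not Megatron-LM code: `layers_per_stage` and its parameters are hypothetical names used only for illustration, and it assumes the transformer layers divide evenly across the stages that hold them.

```python
def layers_per_stage(num_layers, pp_size, standalone_embedding_stage=False):
    """Hypothetical helper (not a Megatron-LM API): transformer layers per pp stage."""
    # With a standalone embedding stage, stage 0 holds only the embedding,
    # so the transformer layers are divided among the remaining stages.
    transformer_stages = pp_size - 1 if standalone_embedding_stage else pp_size
    if transformer_stages < 1:
        raise ValueError("need at least one stage for transformer layers")
    if num_layers % transformer_stages != 0:
        raise ValueError("num_layers must divide evenly across transformer stages")
    per_stage = num_layers // transformer_stages
    counts = [per_stage] * transformer_stages
    # Prepend a stage with zero transformer layers for the embedding-only stage.
    return [0] + counts if standalone_embedding_stage else counts


print(layers_per_stage(24, 4))                                   # [6, 6, 6, 6]
print(layers_per_stage(24, 4, standalone_embedding_stage=True))  # [0, 8, 8, 8]
```

With the option enabled, the first stage would carry no transformer layers at all, which is the behavior I am looking for in core.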