bigcode-project / Megatron-LM

Ongoing research training transformer models at scale

Selection of LLMs architecture #7

Open maoquan-ms opened 2 years ago

maoquan-ms commented 2 years ago

This project seems to pre-train decoder-only-style LMs. I'm just wondering why not use an encoder-decoder style, which is more powerful for text generation (translation, summarization, conditional text generation)?
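For context on the distinction the question draws: the main structural difference is in the attention pattern, not the transformer blocks themselves. A decoder-only model (GPT-style) uses a causal mask everywhere, while an encoder-decoder model (T5/BART-style) lets the encoder attend bidirectionally and adds cross-attention from decoder to encoder. A minimal sketch of the two self-attention masks (illustrative only, not Megatron-LM code):

```python
def causal_mask(n):
    """Decoder-only (GPT-style) self-attention mask:
    position i may attend only to positions j <= i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """Encoder (T5/BART-style) self-attention mask:
    every position attends to every other position."""
    return [[1] * n for _ in range(n)]

# Example: a 3-token sequence.
print(causal_mask(3))         # lower-triangular: future tokens are hidden
print(bidirectional_mask(3))  # all-ones: full bidirectional context
```

An encoder-decoder model additionally has a cross-attention mask (decoder positions attending to all encoder positions), which is what gives it its strength on conditional tasks like translation; decoder-only models instead condition by placing the input in the prompt.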