THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

Is the ChatGLM2-6B model architecture encoder-only? [BUG/Help] #528

Open smile-II opened 1 year ago

smile-II commented 1 year ago

Is there an existing issue for this?

Current Behavior

ChatGLMForConditionalGeneration(
  (transformer): ChatGLMModel(
    (embedding): Embedding(
      (word_embeddings): Embedding(65024, 4096)  # [b s h]
    )
    (rotary_pos_emb): RotaryEmbedding()
    (encoder): GLMTransformer(
      (layers): ModuleList(
        (0-27): 28X GLMBlock(
          (input_layernorm): RMSNorm()
          (self_attention): SelfAttention(  # [b s h]
            (query_key_value): Linear(in_features=4096, out_features=4608, bias=True)
            (core_attention): CoreAttention(
              (attention_dropout): Dropout(p=0.0, inplace=False)
            )
            (dense): Linear(in_features=4096, out_features=4096, bias=False)
          )
          (post_attention_layernorm): RMSNorm()
          (mlp): MLP(
            (dense_h_to_4h): Linear(in_features=4096, out_features=27392, bias=False)
            (dense_4h_to_h): Linear(in_features=13696, out_features=4096, bias=False)
          )
        )
      )
      (final_layernorm): RMSNorm()
    )
    (output_layer): Linear(in_features=4096, out_features=65024, bias=False)
  )
)
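As a sanity check on the printout, the layer shapes can be tallied into a parameter count. The sketch below uses only the numbers shown in the module tree, plus two assumptions that the printout does not confirm: each `RMSNorm` holds a single weight vector of the hidden size, and `RotaryEmbedding` has no trainable parameters. The helper names (`linear_params`, `block_params`, `total_params`) are made up for illustration.

```python
# Shapes copied from the printed module tree.
VOCAB = 65024
HIDDEN = 4096
NUM_LAYERS = 28        # (0-27): 28X GLMBlock
QKV_OUT = 4608         # query_key_value out_features
MLP_UP_OUT = 27392     # dense_h_to_4h out_features
MLP_DOWN_IN = 13696    # dense_4h_to_h in_features

# Assumption: every RMSNorm owns one weight vector of size HIDDEN.
NORM_PARAMS = HIDDEN


def linear_params(in_features: int, out_features: int, bias: bool) -> int:
    """Parameter count of a dense layer: weight matrix plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)


def block_params() -> int:
    """Parameters in one GLMBlock, following the printed submodules."""
    attention = (
        linear_params(HIDDEN, QKV_OUT, bias=True)     # query_key_value
        + linear_params(HIDDEN, HIDDEN, bias=False)   # dense
    )
    mlp = (
        linear_params(HIDDEN, MLP_UP_OUT, bias=False)     # dense_h_to_4h
        + linear_params(MLP_DOWN_IN, HIDDEN, bias=False)  # dense_4h_to_h
    )
    norms = 2 * NORM_PARAMS  # input_layernorm + post_attention_layernorm
    return attention + mlp + norms


def total_params() -> int:
    """Parameters in the whole printed model (rotary embedding assumed parameter-free)."""
    embedding = VOCAB * HIDDEN
    output_layer = linear_params(HIDDEN, VOCAB, bias=False)
    return embedding + NUM_LAYERS * block_params() + NORM_PARAMS + output_layer


if __name__ == "__main__":
    print(f"per block: {block_params():,}")
    print(f"total:     {total_params():,}")
```

Under those assumptions the total comes to roughly 6.2 billion, in line with the "6B" in the model name. The printout also shows that `dense_h_to_4h` emits 27392 features while `dense_4h_to_h` consumes 13696, exactly half. That would be consistent with a gated activation that splits the projection in two, although the activation itself does not appear in the printout.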

Expected Behavior

This is the printed model structure. The paper says that GLM is an encoder-decoder model, so why does the printout contain only an encoder?

Steps To Reproduce

I have a question about the model architecture.

Environment

The model runs normally.

Anything else?

No response