Why does the LLaMa model have only an encoder and no decoder?

ProjectD-AI / llama_inference

llama inference for tencentpretrain

GNU General Public License v3.0


Open yyqi17 opened 1 year ago

yyqi17 commented 1 year ago

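I'd like to ask why the LLaMa model built in model/llama.py contains only a Transformer encoder, followed directly by a linear output layer. Is this consistent with the original LLaMA architecture? Will it affect the results? Thanks!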

fengyh3 commented 1 year ago

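The model structure is essentially the same; only the attention mask differs. A decoder-only model such as LLaMA has no cross-attention, so each block is just self-attention plus a feed-forward layer, the same as an encoder block. What is built here is an encoder with a causal attention mask, so each position attends only to itself and earlier positions. That is why it's fine.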
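
To make the point about the mask concrete, here is a minimal sketch in plain Python. It is not the code in model/llama.py. The function `self_attention` and the helper `max_abs_diff` are hypothetical names made up for illustration, and the attention omits learned projections, multiple heads, and rotary embeddings for brevity. It only shows that the same self-attention routine behaves like a decoder once a causal mask is applied: perturbing a later token leaves the outputs at earlier positions unchanged.

```python
import math
import random


def self_attention(x, causal):
    """Single-head self-attention over a list of token vectors.

    Illustrative only: queries, keys and values are the raw inputs
    (no learned projections).
    """
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        # Causal mask: position i sees only j <= i. Otherwise it sees all j.
        visible = range(i + 1) if causal else range(n)
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            for j in visible
        ]
        # Numerically stable softmax over the visible positions.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * x[j][k] for w, j in zip(weights, visible))
            for k in range(d)
        ])
    return out


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(p - q) for p, q in zip(a, b))


rng = random.Random(0)
tokens = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]

# Copy the sequence and perturb only the last token.
changed = [row[:] for row in tokens]
changed[-1] = [v + 1.0 for v in changed[-1]]

causal_a = self_attention(tokens, causal=True)
causal_b = self_attention(changed, causal=True)
full_a = self_attention(tokens, causal=False)
full_b = self_attention(changed, causal=False)

# How much do the first four positions change when the fifth token changes?
causal_leak = max(max_abs_diff(causal_a[i], causal_b[i]) for i in range(4))
full_leak = max(max_abs_diff(full_a[i], full_b[i]) for i in range(4))

print("change at earlier positions, causal mask:", causal_leak)
print("change at earlier positions, no mask:   ", full_leak)
```

With the causal mask the earlier positions never read the last token, so their outputs are identical in both runs. Without the mask they shift, which is the bidirectional behaviour of a BERT-style encoder.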