bisunny opened this issue 4 months ago
The base model has a layer normalization (layernorm) layer before the LM head. Since the feature sequence has already been normalized, we do not use layer normalization.
It is true that the base model has a layer normalization (layernorm) layer before the LM head, but that does not explain why EAGLE removes the input_layernorm of the llama decoder layer. I guess this is a trick to help improve EAGLE's accuracy?
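For reference, below is a minimal sketch (my own illustration, not code copied from the EAGLE repo) of the kind of modification I am asking about: a LLaMA-style decoder layer whose input_layernorm is dropped on the assumption that the incoming feature sequence has already been normalized by the base model's final norm before the LM head. Class and parameter names here are placeholders.

```python
# Sketch only: a LLaMA-style draft decoder layer with input_layernorm removed,
# assuming the incoming features were already normalized by the base model's
# final layernorm. Not the actual EAGLE implementation.
import torch
import torch.nn as nn


class DraftDecoderLayer(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int, eps: float = 1e-6):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, 4 * hidden_size),
            nn.SiLU(),
            nn.Linear(4 * hidden_size, hidden_size),
        )
        # NOTE: no self.input_layernorm here -- the features fed to the draft
        # model are assumed to be pre-normalized by the base model.
        # (LLaMA uses RMSNorm; nn.RMSNorm requires PyTorch >= 2.4.)
        self.post_attention_layernorm = nn.RMSNorm(hidden_size, eps=eps)

    def forward(self, hidden_states: torch.Tensor, attn_mask=None) -> torch.Tensor:
        residual = hidden_states
        # A vanilla LLaMA layer would first do:
        #   hidden_states = self.input_layernorm(hidden_states)
        attn_out, _ = self.self_attn(
            hidden_states, hidden_states, hidden_states,
            attn_mask=attn_mask, need_weights=False,
        )
        hidden_states = residual + attn_out

        residual = hidden_states
        hidden_states = self.post_attention_layernorm(hidden_states)
        hidden_states = residual + self.mlp(hidden_states)
        return hidden_states
```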