meta-llama / llama

Inference code for Llama models

Why is the value of hidden_dim in FeedForward calculated this way? #1019

Open wjwzju opened 8 months ago

wjwzju commented 8 months ago

Why is the value of hidden_dim calculated this way?

    hidden_dim = int(2 * hidden_dim / 3)
    # custom dim factor multiplier
    if ffn_dim_multiplier is not None:
        hidden_dim = int(ffn_dim_multiplier * hidden_dim)
    hidden_dim = multiple_of * ((hidden_dim + multiple_of - 1) // multiple_of)

y1X1ao commented 1 week ago

int(2 * hidden_dim / 3) reduces the computation overhead without losing effectiveness.

hidden_dim = multiple_of * ((hidden_dim + multiple_of - 1) // multiple_of) rounds hidden_dim up so that the new hidden_dim is a multiple of multiple_of, the number you want.
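To make the arithmetic concrete, here is a minimal sketch of the same three steps as a standalone function. The function name compute_hidden_dim and the input values are hypothetical, chosen only for illustration. The starting value 16384 assumes the caller passes 4 * 4096, which this thread does not state.

```python
from typing import Optional


def compute_hidden_dim(
    hidden_dim: int,
    multiple_of: int,
    ffn_dim_multiplier: Optional[float] = None,
) -> int:
    """Hypothetical helper that mirrors the three steps quoted in the question."""
    # Step 1: shrink the requested width to two thirds (truncating toward zero).
    hidden_dim = int(2 * hidden_dim / 3)
    # Step 2: optional custom scaling factor.
    if ffn_dim_multiplier is not None:
        hidden_dim = int(ffn_dim_multiplier * hidden_dim)
    # Step 3: round UP to the nearest multiple of `multiple_of`
    # (integer ceiling division, then multiply back).
    return multiple_of * ((hidden_dim + multiple_of - 1) // multiple_of)


# Illustrative values only (assumed, not taken from the thread).
# 16384 -> int(10922.67) = 10922 -> rounded up to a multiple of 256.
print(compute_hidden_dim(16384, multiple_of=256))  # 11008
```

A possible reason for the 2/3 factor, offered here as an assumption rather than something the thread confirms: if the feed-forward block uses three weight matrices (a gated design) instead of two, scaling the width by 2/3 keeps the parameter count roughly equal to a two-matrix block of the original width, since 3 * (2/3) = 2.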