arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0
4.57k stars 406 forks

Merge of hidden_size #320

Open win10ogod opened 4 months ago

win10ogod commented 4 months ago

Have you started implementing merging along hidden_size? For example, take two models with the following configs:

"hidden_size": 4096, "initializer_range": 0.02, "intermediate_size": 14336, "max_position_embeddings": 1048576, "model_type": "llama", "num_attention_heads": 32, "num_hidden_layers": 32,

and

"hidden_size": 4096, "initializer_range": 0.02, "intermediate_size": 14336, "max_position_embeddings": 1048576, "model_type": "llama", "num_attention_heads": 32, "num_hidden_layers": 32,

After merging, the result would have:

"hidden_size": 8192, "initializer_range": 0.02, "intermediate_size": 14336, "max_position_embeddings": 1048576, "model_type": "llama", "num_attention_heads": 32, "num_hidden_layers": 32,
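
To make the request concrete, here is a rough sketch of how per-layer weight shapes would change between the source config and the requested merged config. It is not mergekit functionality; `llama_layer_shapes` is a hypothetical helper, and the shapes assume the usual Hugging Face Transformers Llama layout (`head_dim = hidden_size // num_attention_heads`, linear weights stored as `(out_features, in_features)`). The k/v projections are left out because `num_key_value_heads` is not part of the config fragments above.

```python
def llama_layer_shapes(config):
    """Hypothetical helper: derive per-layer weight shapes from a Llama-style config.

    Assumes the Hugging Face Transformers Llama layout; this is an
    illustration only, not part of mergekit.
    """
    hidden = config["hidden_size"]
    heads = config["num_attention_heads"]
    inter = config["intermediate_size"]
    if hidden % heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    head_dim = hidden // heads
    return {
        "head_dim": head_dim,
        "q_proj.weight": (heads * head_dim, hidden),
        "o_proj.weight": (hidden, heads * head_dim),
        "gate_proj.weight": (inter, hidden),
        "up_proj.weight": (inter, hidden),
        "down_proj.weight": (hidden, inter),
        "input_layernorm.weight": (hidden,),
    }


# Values copied from the configs in this issue.
source_config = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
}
# The requested result: only hidden_size changes.
merged_config = dict(source_config, hidden_size=8192)

source_shapes = llama_layer_shapes(source_config)
merged_shapes = llama_layer_shapes(merged_config)

# Print each entry side by side to see which dimensions change.
for name in source_shapes:
    print(f"{name}: {source_shapes[name]} -> {merged_shapes[name]}")
```

Under these assumptions every tensor that touches hidden_size changes shape, and with num_attention_heads kept at 32 the per-head dimension goes from 128 to 256, so the two sets of source weights would need some mapping into the larger tensors.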