arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0
4.88k stars, 446 forks

Pad embeds to multiple #465

Closed: cg123 closed this 1 day ago

cg123 commented 1 day ago

Add the ability to pad the size of the output embeddings up to a multiple of a user-defined factor when merging tokenizers.

Config syntax example:

merge_method: linear
models:
  - model: model_a
  - model: model_b
parameters:
  weight: 0.5
tokenizer:
  source: union
  pad_to_multiple_of: 64
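
For illustration only, here is a rough sketch of the arithmetic that a setting like `pad_to_multiple_of: 64` implies: the merged vocabulary size is rounded up to the next multiple, and extra rows are appended to the embedding matrix. This is not the code from this PR. The helper names `padded_size` and `pad_embedding_rows` are hypothetical, and filling the new rows with zeros is an assumption; the actual implementation may initialize them differently.

```python
from typing import List


def padded_size(vocab_size: int, multiple: int) -> int:
    """Round vocab_size up to the nearest multiple of `multiple` (hypothetical helper)."""
    if multiple <= 0:
        raise ValueError("multiple must be a positive integer")
    return ((vocab_size + multiple - 1) // multiple) * multiple


def pad_embedding_rows(embed: List[List[float]], multiple: int) -> List[List[float]]:
    """Return a copy of `embed` with rows appended up to the padded size.

    Assumption: new rows are zero-filled; a real merge tool may choose otherwise.
    """
    hidden = len(embed[0]) if embed else 0
    extra = padded_size(len(embed), multiple) - len(embed)
    return [list(row) for row in embed] + [[0.0] * hidden for _ in range(extra)]


if __name__ == "__main__":
    # A union tokenizer with 32003 entries, padded to a multiple of 64.
    print(padded_size(32003, 64))  # 32064
```

A size that is already a multiple of the factor is left unchanged, so the option only adds rows when the merged vocabulary does not line up.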