arcee-ai/mergekit
Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0
3.97k stars, 344 forks
issues
| Title | # | Author | Status | Comments |
| --- | --- | --- | --- | --- |
| Gemma2 support | #359 | cg123 | closed 5 hours ago | 0 |
| Support Qwen2 models with tied weights | #358 | cg123 | closed 6 hours ago | 0 |
| NuSLERP | #357 | cg123 | opened 3 days ago | 0 |
| Support of BitNet | #356 | Ttimofeyka | opened 3 days ago | 0 |
| About the merging method used for Arcee-Spark | #355 | daiquocnguyen | opened 4 days ago | 0 |
| Why all inputs to a slice must contain the same number of layers? | #354 | Mihaiii | closed 1 week ago | 1 |
| Sincerely.. I have no words...without insulting someone. | #353 | 0wwafa | opened 1 week ago | 2 |
| MOE models that based on Gemma cannot work | #352 | pharaohcaptain | opened 1 week ago | 1 |
| New method: MAP:Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation | #351 | duyvuleo | opened 1 week ago | 0 |
| qwen2-0.5B cannot be merged into MoE | #350 | letterk | closed 6 hours ago | 1 |
| pass eval arguments directly in mergekit-evo | #349 | johnwee1 | opened 2 weeks ago | 0 |
| "Missing required parameter weight" for LoRA merge. | #348 | bubbabug | closed 2 weeks ago | 2 |
| Add out_dtype option | #347 | cg123 | closed 2 weeks ago | 0 |
| Fix DBZ Error | #346 | MonsterAzi | closed 1 week ago | 0 |
| New merge method / algorithm proposal: Geometric Median and TGMD merge | #345 | 6DammK9 | opened 2 weeks ago | 0 |
| How to merge specific parameters | #344 | qhz991029 | opened 2 weeks ago | 0 |
| ValueError: operands could not be broadcast together with shapes (8192,28672) (8192,24576) | #343 | ashmalvayani | opened 2 weeks ago | 1 |
| Need some help in merging same architectures, but with different tokens in their tokenizers | #342 | choprahetarth | opened 3 weeks ago | 4 |
| Evolutionary Merging out of memory | #341 | ArcherShirou | opened 3 weeks ago | 3 |
| Weights Metrics | #340 | ElliotStein | opened 3 weeks ago | 0 |
| Use logscale for operations dealing with norm layers? | #339 | jukofyork | opened 3 weeks ago | 0 |
| parameters: int8_mask: true ?? | #338 | David-AU-github | opened 4 weeks ago | 1 |
| Support for microsoft/Phi-3-vision-128k-instruct | #337 | AshD | opened 4 weeks ago | 0 |
| Questions about density gradient and weight gradient in Ties example | #336 | zhang-tuo-pdf | opened 4 weeks ago | 0 |
| Merge arbitrary pytorch models | #335 | cg123 | opened 4 weeks ago | 0 |
| Tokenizer merging overhaul | #334 | cg123 | opened 4 weeks ago | 2 |
| `extract_lora.py` improvements and fixes | #333 | jukofyork | opened 1 month ago | 6 |
| Add --load-in-4bit and --load-in-8bit for HF eval backend | #332 | cg123 | opened 1 month ago | 0 |
| Bump dependencies | #331 | cg123 | closed 1 month ago | 0 |
| Fix mergekit-moe for older python | #330 | cg123 | closed 1 month ago | 0 |
| Does this run single core? | #329 | Autumnlight02 | opened 1 month ago | 2 |
| `extract_lora.py` can't handle mismatched `lm_head` tensor due to added tokens | #328 | jukofyork | opened 1 month ago | 10 |
| How to merge only at q_proj layers with SLERP? | #327 | yiyiwwang | closed 1 month ago | 2 |
| Require later version of transformers | #326 | tleyden | closed 1 month ago | 1 |
| how to merge for different rope scaling? | #325 | sangmandu | opened 1 month ago | 3 |
| Merging models with different structures in linear | #324 | HawkClaws | opened 1 month ago | 2 |
| EvoMerge Genome Bug | #323 | Jacobsolawetz | closed 3 weeks ago | 2 |
| Relax dependency versions | #321 | Nugine | opened 1 month ago | 0 |
| Merge of hidden_size | #320 | win10ogod | opened 1 month ago | 0 |
| How to merge a VLM and LLM with different model type. | #319 | tanyakansal30 | opened 1 month ago | 0 |
| Add support for `subfolder` loading | #318 | xzuyn | opened 1 month ago | 0 |
| Mixed Precision Merging | #316 | sais-github | opened 1 month ago | 1 |
| Fix size check in TensorWriter.save_tensor | #315 | cg123 | closed 1 month ago | 0 |
| [mergekit-extract-lora] cast to float before running SVD | #314 | cg123 | closed 1 month ago | 1 |
| Fix Qwen MoE architecture check | #313 | cg123 | closed 1 month ago | 0 |
| Qwen/Qwen1.5-1.8B MoE Merging fails | #312 | dgolchin | closed 1 month ago | 3 |
| Existing Mergekit algorithms to merge VLM with LLM? | #310 | ChaseKolozsy | opened 1 month ago | 1 |
| Implementation of AdaMerging: Adaptive Model Merging for Multi-Task Learning | #309 | varunlmxd | opened 1 month ago | 0 |
| Merge only the transformer parts (including the input embedding layer) | #308 | hank0316 | opened 1 month ago | 5 |
| `mergekit-evolve` improvements | #307 | cg123 | closed 1 month ago | 0 |