arcee-ai/mergekit
Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0
4.88k stars
446 forks
issues
Pad embeds to multiple
#465
cg123
closed
1 day ago
0
Better tied weight handling
#464
cg123
closed
1 day ago
0
Handle optional weights in mergekit-moe
#463
cg123
closed
1 day ago
0
Rewrite readme more novice-friendly
#462
clover1980
opened
4 days ago
0
mergekit for vision models
#461
prince0310
opened
5 days ago
0
Why are the names of parameters hard-coded? Is it possible to read it from index.json in HF checkpoints?
#460
zhangzx-uiuc
opened
1 week ago
1
Qwen2.5 LoRA Extraction not working in vLLM & Aphrodite Engine
#459
Nero10578
opened
1 week ago
2
add whisper model
#458
sagewe
opened
2 weeks ago
0
Critical Merging Bug just started...
#457
David-AU-github
opened
2 weeks ago
1
About Model-Breadcrumbs merge implementation
#455
vishaal27
opened
3 weeks ago
0
Base Model generation time increases when passed through the MergeKit
#454
ahmedamrelhefnawy
opened
3 weeks ago
0
N-model ModelStock merging
#453
vishaal27
opened
3 weeks ago
1
Moe merging failed
#452
PsoriasiIR
opened
3 weeks ago
2
Use sst2 to eval merging
#451
VivianeGalvao
closed
4 weeks ago
0
Merge Models with Non-Standard Architectures (e.g., Multimodal Models)
#450
ElliotStein
opened
1 month ago
3
[question] multi gpu available?
#449
eunbin079
opened
1 month ago
0
Bump version number
#448
cg123
closed
1 month ago
0
mergekit-extract-lora does not extract - the destination is empty
#447
raulod
opened
1 month ago
2
KeyError model[0] did not exist in tensor?
#446
FrozzDay
opened
1 month ago
2
Report issues regarding the architecture-agnostic branch.
#445
win10ogod
opened
1 month ago
3
Bump dependencies
#444
cg123
closed
1 month ago
0
RuntimeError: Need to specify cache dir to merge adapters
#442
Zolilio
closed
1 month ago
1
Add methods from https://arxiv.org/abs/2405.07813
#441
zsgvivo
closed
1 day ago
2
add methods from https://arxiv.org/abs/2405.07813
#440
zsgvivo
closed
1 month ago
0
11
#439
meiyiyeshi
closed
1 month ago
0
[question] `task_arithmetic` simple question
#438
eunbin079
closed
1 month ago
2
After the two Qwen1.5-7B-chat models were merged, garbled inference results appeared.
#437
Zhangfanfan0101
closed
1 month ago
0
Fixed the YML/YAML documentation for Qwen MoE creation
#435
Nottlespike
opened
1 month ago
1
[request] Support for Vision Language Models
#434
NickGao96
closed
1 month ago
13
[request]Can it support architectures such as stable diffusion Xl and flux dev?
#433
win10ogod
opened
1 month ago
2
Initial implementation of PCB merge method
#432
cg123
opened
1 month ago
0
Update actors.py
#431
kwon13
opened
1 month ago
0
Handle merges stored as list instead of space-separated string
#430
cg123
closed
1 month ago
0
Update Llama architecture to handle 3b/1b
#429
cg123
closed
1 month ago
0
Broken tokenizer in Yi-34B merge
#428
Asherathe
closed
1 month ago
3
I would like to merge the deepseekForCausalLM model. Are there any related examples available?
#427
xaiocaibi
opened
2 months ago
0
Merging Lora fine-tuned models with MoE
#426
AmineBechar07
opened
2 months ago
0
Qwen2.5 14B models are ... sometimes? ... having their token vocabulary truncated down to 'actual'?
#425
ann-brown
opened
2 months ago
4
Support for new Llama 3.2 - 1B / 3B ?
#424
David-AU-github
closed
1 month ago
13
Support for Vision Model such as ViT
#423
redagavin
opened
2 months ago
0
Support for xlm-roberta
#422
umiron
opened
2 months ago
2
"mergekit-yaml" not created upon installation
#421
BovineOverlord
opened
2 months ago
2
How to use multi GPUs
#420
liudan193
opened
2 months ago
1
would you like to support Qwen2.5 Model?
#419
ArcherShirou
closed
2 months ago
1
Input should be a valid dictionary or instance of MergeConfiguration
#418
Hugo-Calero
opened
2 months ago
2
Make Cohere lm_head optional
#417
cg123
closed
2 months ago
0
Add Solar And Exaone Model
#416
shing100
closed
2 months ago
1
Add support Exaone Model
#415
shing100
closed
2 months ago
2
Re-Train every block with reduced width
#414
snapo
closed
2 months ago
0
Fix README links
#413
cg123
closed
3 months ago
0