arcee-ai / mergekit
Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0 · 4.88k stars · 446 forks

Issues
#409 · I am having problem merging GPT-Neo · 2625554780 · opened 3 months ago · 1 comment
#408 · support for GPT-Neo needed! · 2625554780 · closed 3 months ago · 2 comments
#407 · Is it possible to merge Mistral 7B and Mistral NeMo 12B? · azulika · opened 3 months ago · 1 comment
#406 · Set Gemma2 lm_head optional instead of aliasing to embed_tokens · cg123 · closed 3 months ago · 0 comments
#405 · Add Phi3SmallForCausalLM and tweak Phi3 · cg123 · closed 3 months ago · 0 comments
#404 · How does a beginner merge models? YAML file configuration · yhyub · opened 3 months ago · 1 comment
#403 · How to fix: mergekit-yaml qwen_sail.yaml ./fddfgh/ completes "Warmup loader cache: 100% 2/2" then crashes with a Segmentation fault at "Executing graph: 0% 0/1457" · yhyub · opened 3 months ago · 0 comments
#402 · Resolving a runtime error · yhyub · opened 3 months ago · 1 comment
#401 · Merging two mistral based models with different architectures. Looking for some guidance. · AshD · opened 3 months ago · 1 comment
#400 · Example of a config file for task_arithmetic 'negative' operation and a case for 'Task analogies' · eunbin079 · opened 3 months ago · 1 comment
#399 · Working Example of the Mergekit-Evo · nthangelane · opened 3 months ago · 0 comments
#398 · passthrough merge error: Tensor model.layers.86.self_attn.k_norm.weight required but not present in model mistralai/Mistral-Large-Instruct-2407 · AshD · closed 3 months ago · 2 comments
#397 · MergeKit GUI not working. · Abdulhanan535 · closed 3 months ago · 0 comments
#396 · Support for Phi-3-Small [Feature ?] · hammoudhasan · opened 3 months ago · 0 comments
#395 · Error at MoE Qwen 1.5B · ehristoforu · closed 3 months ago · 3 comments
#394 · Null vocab_file Issue with mistral v03 based models when using union tokenizer source · guillermo-gabrielli-fer · opened 3 months ago · 2 comments
#393 · Is there a way to run LORA extraction using multi GPU? 70B LORA extraction OOM on 24GB 3090Ti · Nero10578 · opened 3 months ago · 4 comments
#392 · Example case of task_arithmetic needed · Opdoop · opened 3 months ago · 1 comment
#391 · MoE exits itself after expert prompts 100% 2/2 · SameedHusayn · opened 3 months ago · 0 comments
#390 · mergekit saves tied and ignored weights unlike what transformers does when saving · nyxkrage · opened 3 months ago · 0 comments
#389 · Create Communication Channels for MergeKit · aditya-cherukuru · opened 3 months ago · 0 comments
#388 · The speed issue with the GTATask. · daidaiershidi · opened 3 months ago · 3 comments
#387 · ABM corrections · metric-space · opened 4 months ago · 0 comments
#386 · How to Create a New Merging Method · Guozhenyuan · opened 4 months ago · 1 comment
#385 · Result of merging 2 Gemma2 9B models gains 1B parameters somehow · jim-plus · closed 3 months ago · 6 comments
#383 · does not appear to have a file named config.json · bxf1001 · opened 4 months ago · 2 comments
#382 · Added support for DeepseekV2 model · aditya-29 · opened 4 months ago · 3 comments
#381 · RuntimeError: Unsupported architecture BertForSequenceClassification · lrsbrgrn · opened 4 months ago · 0 comments
#380 · [request] Llama 3.1 Support · discordianbelle · closed 4 months ago · 1 comment
#379 · Does mergekit-moe support Qwen? · hoooooli · opened 4 months ago · 3 comments
#378 · Questions about Config · Zheng-Jay · opened 4 months ago · 2 comments
#377 · mergekit-evolve doesn't account for higher_is_better: false tasks. · mekaneeky · opened 4 months ago · 1 comment
#376 · merging native pytorch model locally · sorobedio · opened 4 months ago · 1 comment
#375 · Network is unreachable · guanfaqian · closed 4 months ago · 1 comment
#374 · evolve-merge installation argument not working · sorobedio · opened 4 months ago · 0 comments
#373 · Question about the merge of the Dare method. · guanfaqian · opened 4 months ago · 0 comments
#372 · Examples needed. · 0wwafa · opened 4 months ago · 3 comments
#371 · Awesome repo, but can it convert multiple architectures to Llama? · BBC-Esq · opened 4 months ago · 0 comments
#370 · remove strict version of pydantic · sreev · closed 1 month ago · 1 comment
#369 · RuntimeError: Unsupported architecture Qwen2ForSequenceClassification · iseesaw · closed 4 months ago · 3 comments
#368 · duplicate folders labelled 'mergekit' found in mergekit install · jpeek34556 · opened 4 months ago · 0 comments
#367 · Specify chat template for output model · cg123 · closed 4 months ago · 0 comments
#366 · Add Della merge method · Tej-Deep · closed 4 months ago · 6 comments
#365 · Activation based merging - copied over from wip-zipit branch · metric-space · closed 4 months ago · 0 comments
#364 · gracefully pause evolutionary optimization? · johnwee1 · opened 4 months ago · 1 comment
#363 · Support for fine-grained experts in MoE models · misdelivery · opened 4 months ago · 0 comments
#362 · Add support for Internlm2 · Crystalcareai · closed 4 months ago · 2 comments
#361 · There are still some problems with MoE-merging Qwen with other LLMs (like Llama, DeepSeek, etc.) · aoyinke · opened 4 months ago · 3 comments
#360 · Condense a model's layers. · DewEfresh · opened 5 months ago · 1 comment
#359 · Gemma2 support · cg123 · closed 5 months ago · 0 comments