arcee-ai / mergekit
Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0 · 4.88k stars · 446 forks

Issues
#409 · I am having problem merging GPT-Neo · 2625554780 · opened 3 months ago · 1 comment
#408 · support for GPT-Neo needed! · 2625554780 · closed 3 months ago · 2 comments
#407 · Is it possible to merge Mistral 7B and Mistral NeMo 12B? · azulika · opened 3 months ago · 1 comment
#406 · Set Gemma2 lm_head optional instead of aliasing to embed_tokens · cg123 · closed 3 months ago · 0 comments
#405 · Add Phi3SmallForCausalLM and tweak Phi3 · cg123 · closed 3 months ago · 0 comments
#404 · How does a beginner merge models? YAML file configuration · yhyub · opened 3 months ago · 1 comment
#403 · How to fix: mergekit-yaml qwen_sail.yaml ./fddfgh/ completes "Warmup loader cache: 100% 2/2" then crashes with a Segmentation fault at "Executing graph: 0% 0/1457" · yhyub · opened 3 months ago · 0 comments
#402 · Resolving a runtime error · yhyub · opened 3 months ago · 1 comment
#401 · Merging two mistral based models with different architectures. Looking for some guidance. · AshD · opened 3 months ago · 1 comment
#400 · Example of a config file for task_arithmetic 'negative' operation and a case for 'Task analogies' · eunbin079 · opened 3 months ago · 1 comment
#399 · Working Example of the Mergekit-Evo · nthangelane · opened 3 months ago · 0 comments
#398 · passthrough merge error: Tensor model.layers.86.self_attn.k_norm.weight required but not present in model mistralai/Mistral-Large-Instruct-2407 · AshD · closed 3 months ago · 2 comments
#397 · MergeKit GUI not working. · Abdulhanan535 · closed 3 months ago · 0 comments
#396 · Support for Phi-3-Small [Feature ?] · hammoudhasan · opened 3 months ago · 0 comments
#395 · Error at MoE Qwen 1.5B · ehristoforu · closed 3 months ago · 3 comments
#394 · Null vocab_file Issue with mistral v03 based models when using union tokenizer source · guillermo-gabrielli-fer · opened 3 months ago · 2 comments
#393 · Is there a way to run LORA extraction using multi GPU? 70B LORA extraction OOM on 24GB 3090Ti · Nero10578 · opened 3 months ago · 4 comments
#392 · Example case of task_arithmetic needed · Opdoop · opened 3 months ago · 1 comment
#391 · MoE exits itself after expert prompts 100% 2/2 · SameedHusayn · opened 3 months ago · 0 comments
#390 · mergekit saves tied and ignored weights unlike what transformers does when saving · nyxkrage · opened 3 months ago · 0 comments
#389 · Create Communication Channels for MergeKit · aditya-cherukuru · opened 3 months ago · 0 comments
#388 · The speed issue with the GTATask. · daidaiershidi · opened 3 months ago · 3 comments
#387 · ABM corrections · metric-space · opened 4 months ago · 0 comments
#386 · How to Create a New Merging Method · Guozhenyuan · opened 4 months ago · 1 comment
#385 · Result of merging 2 Gemma2 9B models gains 1B parameters somehow · jim-plus · closed 3 months ago · 6 comments
#383 · does not appear to have a file named config.json · bxf1001 · opened 4 months ago · 2 comments
#382 · Added support for DeepseekV2 model · aditya-29 · opened 4 months ago · 3 comments
#381 · RuntimeError: Unsupported architecture BertForSequenceClassification · lrsbrgrn · opened 4 months ago · 0 comments
#380 · [request] Llama 3.1 Support · discordianbelle · closed 4 months ago · 1 comment
#379 · Does mergekit-moe support Qwen? · hoooooli · opened 4 months ago · 3 comments
#378 · Questions about Config · Zheng-Jay · opened 4 months ago · 2 comments
#377 · mergekit-evolve doesn't account for higher_is_better: false tasks. · mekaneeky · opened 4 months ago · 1 comment
#376 · merging native pytorch model locally · sorobedio · opened 4 months ago · 1 comment
#375 · Network is unreachable · guanfaqian · closed 4 months ago · 1 comment
#374 · evolve-merge installation argument not working · sorobedio · opened 4 months ago · 0 comments
#373 · Question about the merge of the Dare method. · guanfaqian · opened 4 months ago · 0 comments
#372 · Examples needed. · 0wwafa · opened 4 months ago · 3 comments
#371 · Awesome repo, but can it convert multiple architectures to Llama? · BBC-Esq · opened 4 months ago · 0 comments
#370 · remove strict version of pydantic · sreev · closed 1 month ago · 1 comment
#369 · RuntimeError: Unsupported architecture Qwen2ForSequenceClassification · iseesaw · closed 4 months ago · 3 comments
#368 · duplicate folders labelled 'mergekit' found in mergekit install · jpeek34556 · opened 4 months ago · 0 comments
#367 · Specify chat template for output model · cg123 · closed 4 months ago · 0 comments
#366 · Add Della merge method · Tej-Deep · closed 4 months ago · 6 comments
#365 · Activation based merging - copied over from wip-zipit branch · metric-space · closed 4 months ago · 0 comments
#364 · gracefully pause evolutionary optimization? · johnwee1 · opened 4 months ago · 1 comment
#363 · Support for fine-grained experts in MoE models · misdelivery · opened 4 months ago · 0 comments
#362 · Add support for Internlm2 · Crystalcareai · closed 4 months ago · 2 comments
#361 · There are still some problems with MoE-merging Qwen with other LLMs (like Llama, DeepSeek, etc.) · aoyinke · opened 4 months ago · 3 comments
#360 · Condense a model's layers. · DewEfresh · opened 5 months ago · 1 comment
#359 · Gemma2 support · cg123 · closed 5 months ago · 0 comments