axrwl opened 10 months ago
I met the same problem when trying to merge the Deepseek llama model into Mixtral: https://huggingface.co/deepseek-ai/deepseek-llm-7b-base/tree/main. It seems that some tensor key names are not supported in mergekit. We could load the tensors and rename the keys; it might work. I will give it a try later.
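A minimal sketch of that idea, assuming the mismatch really is just in the tensor key names; the mapping below is a placeholder and would need to be filled in with whatever names mergekit actually expects:

```python
# Sketch of the "load tensors and rename keys" workaround.
# RENAME_MAP is hypothetical -- fill it with the actual old -> new key
# names that mergekit complains about for the Deepseek checkpoint.
from safetensors.torch import load_file, save_file

RENAME_MAP = {
    # "model.layers.0.some_deepseek_name": "model.layers.0.expected_name",
}

def remap_keys(in_path: str, out_path: str) -> None:
    tensors = load_file(in_path)  # dict of tensor name -> torch.Tensor
    remapped = {RENAME_MAP.get(k, k): v for k, v in tensors.items()}
    save_file(remapped, out_path)

remap_keys("model-00001-of-00002.safetensors",
           "remapped-00001-of-00002.safetensors")
```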
@cg123 I have the same issue using 6 different phi-2 models. It would be great if it told me which one of the 6 is having issues.
Support for phi-based models actually hasn't been added to `mergekit-moe` yet - I believe @mlabonne used his own customized fork. The mainline `mergekit-moe` currently only supports Llama and Mistral models. Sorry for the trouble!
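If you want to see which of your models are affected, one quick way is to check each checkpoint's `model_type` before merging. A rough sketch (the model IDs below are just examples; substitute the ones from your config):

```python
# Rough check for which models mainline mergekit-moe can currently handle.
from transformers import AutoConfig

SUPPORTED = {"llama", "mistral"}  # per the comment above

models = [
    "deepseek-ai/deepseek-llm-7b-base",  # llama-architecture
    "microsoft/phi-2",                   # phi-architecture
]

for model_id in models:
    cfg = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
    status = "ok" if cfg.model_type in SUPPORTED else "unsupported"
    print(f"{model_id}: model_type={cfg.model_type} ({status})")
```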
Hello, have you succeeded with the experiment yet?
@ZhangEnmao @naseerfaheem @axrwl @Xingxiangrui I was finally able to merge two phi-2 experts. If you are still looking to use mergekit for this, check out the phi2xtral branch here: https://github.com/v-prgmr/mergekit/tree/phi2xtral
I am getting the following error when I run `mergekit-moe config.yml ./output`. My config file is the one below, except the prompt arrays are not empty. I am on the `mixtral` branch.