arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0

RuntimeError: Unsupported architecture OPTForCausalLM #234

Open varunlmxd opened 6 months ago

varunlmxd commented 6 months ago

I am trying to merge an OPT model with the Mistral 7B model and got the error below. Is there any way to merge OPT models with the Mistral or Llama architecture?

Error:

Traceback (most recent call last):
  File "/usr/local/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/content/mergekit/mergekit/options.py", line 78, in wrapper
    f(*args, **kwargs)
  File "/content/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/content/mergekit/mergekit/merge.py", line 45, in run_merge
    model_arch_info = [
  File "/content/mergekit/mergekit/merge.py", line 46, in <listcomp>
    get_architecture_info(m.config(trust_remote_code=options.trust_remote_code))
  File "/content/mergekit/mergekit/architecture.py", line 362, in get_architecture_info
    raise RuntimeError(f"Unsupported architecture {arch_name}")
RuntimeError: Unsupported architecture OPTForCausalLM

cg123 commented 6 months ago

I can definitely add support for merging OPT models with OPT models if that's useful for you.

As far as merging an OPT model with a Mistral or Llama model goes, there currently isn't a way to do that. OPT uses a different activation function, positional encoding, and MLP section structure (straight feedforward vs. Llama's GLU). It might be possible someday but there's nothing I'm aware of that will work today.
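To make the MLP difference concrete, here is a rough sketch in plain Python (not mergekit code). It assumes ReLU for the OPT-style block and SiLU for the Llama/Mistral-style block, omits biases, and uses the per-layer tensor names found in the Hugging Face checkpoints. The helper `shared_tensor_names` is made up for illustration only.

```python
import math


def relu(x):
    return max(0.0, x)


def silu(x):
    # SiLU (a.k.a. swish): x * sigmoid(x)
    return x / (1.0 + math.exp(-x))


def matvec(w, x):
    # w is a list of rows; returns w @ x
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def feedforward_mlp(x, fc1, fc2):
    """OPT-style block: fc2(relu(fc1(x))). Biases omitted for brevity."""
    hidden = [relu(h) for h in matvec(fc1, x)]
    return matvec(fc2, hidden)


def glu_mlp(x, gate_proj, up_proj, down_proj):
    """Llama/Mistral-style block: down_proj(silu(gate_proj(x)) * up_proj(x))."""
    gate = [silu(g) for g in matvec(gate_proj, x)]
    up = matvec(up_proj, x)
    return matvec(down_proj, [g * u for g, u in zip(gate, up)])


# Per-layer MLP tensor names (suffixes) in the Hugging Face checkpoints.
OPT_MLP_TENSORS = {"fc1.weight", "fc1.bias", "fc2.weight", "fc2.bias"}
LLAMA_MLP_TENSORS = {"gate_proj.weight", "up_proj.weight", "down_proj.weight"}


def shared_tensor_names(a, b):
    # Hypothetical helper: which tensor names exist in both layouts?
    return sorted(a & b)
```

The OPT block has two weight matrices per layer while the GLU block has three, and `shared_tensor_names(OPT_MLP_TENSORS, LLAMA_MLP_TENSORS)` comes back empty, so there is no one-to-one pairing of tensors to average or interpolate between the two layouts.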

varunlmxd commented 6 months ago

Thank you @cg123 for responding. If there is a way to merge OPT with OPT, then please add that support.

varunlmxd commented 6 months ago

Hey @cg123 is there any way to merge OPT?