Open varunlmxd opened 6 months ago
I can definitely add support for merging OPT models with OPT models if that's useful for you.
As far as merging an OPT model with a Mistral or Llama model goes, there currently isn't a way to do that. OPT uses a different activation function, positional encoding, and MLP section structure (straight feedforward vs. Llama's GLU). It might be possible someday but there's nothing I'm aware of that will work today.
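To make the MLP mismatch concrete, here is a minimal pure-Python sketch of the two block shapes. It is an illustration only, not code from mergekit or transformers: the helper names (`opt_style_mlp`, `llama_style_mlp`, `matvec`) are made up for this example, biases are left out for brevity, and it assumes ReLU for the OPT-style block and SiLU for the Llama-style gate.

```python
import math

def matvec(weight, vec):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

def relu(vec):
    return [max(0.0, x) for x in vec]

def silu(vec):
    # SiLU (swish): x * sigmoid(x)
    return [x / (1.0 + math.exp(-x)) for x in vec]

def opt_style_mlp(x, fc1, fc2):
    # Straight feedforward: two weight matrices, activation in between.
    return matvec(fc2, relu(matvec(fc1, x)))

def llama_style_mlp(x, gate_proj, up_proj, down_proj):
    # Gated (GLU-style) block: three weight matrices; the activated gate
    # branch is multiplied elementwise with the up branch.
    gated = [g * u for g, u in zip(silu(matvec(gate_proj, x)), matvec(up_proj, x))]
    return matvec(down_proj, gated)

identity = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, -2.0]
opt_out = opt_style_mlp(x, identity, identity)
llama_out = llama_style_mlp(x, identity, identity, identity)
```

One block has two weight matrices and the other has three, so there is no one-to-one tensor correspondence for a weight-space merge to line up, even before the activation and positional-encoding differences come into play.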
Thank you @cg123 for responding. If there is a way to merge OPT with OPT, then please add that support.
Hey @cg123 is there any way to merge OPT?
I am trying to merge an OPT model with the Mistral 7B model and got the error below. Is there any way to merge OPT models with the Mistral or Llama architecture?
Error:
Traceback (most recent call last):
File "/usr/local/bin/mergekit-yaml", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, ctx.params)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/content/mergekit/mergekit/options.py", line 78, in wrapper
f(*args, **kwargs)
File "/content/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
run_merge(
File "/content/mergekit/mergekit/merge.py", line 45, in run_merge
model_arch_info = [
File "/content/mergekit/mergekit/merge.py", line 46, in <listcomp>
get_architecture_info(m.config(trust_remote_code=options.trust_remote_code))
File "/content/mergekit/mergekit/architecture.py", line 362, in get_architecture_info
raise RuntimeError(f"Unsupported architecture {arch_name}")
RuntimeError: Unsupported architecture OPTForCausalLM
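The architecture name in that error comes from the `architectures` field of each model's `config.json`. As a rough pre-flight check before launching a merge, something like the sketch below could flag mixed architectures early. The helper names (`architecture_names`, `check_same_architecture`) are hypothetical and not part of mergekit, and the check only compares the listed class names; it says nothing about whether mergekit supports that architecture.

```python
def architecture_names(config: dict) -> tuple:
    # `architectures` is the list of model class names in a Hugging Face config.json.
    return tuple(config.get("architectures") or ())

def check_same_architecture(configs: dict) -> tuple:
    # `configs` maps a model label to its parsed config.json contents.
    seen = {label: architecture_names(cfg) for label, cfg in configs.items()}
    if len(set(seen.values())) != 1:
        details = ", ".join(f"{label}: {list(archs)}" for label, archs in seen.items())
        raise ValueError(f"Models do not share an architecture ({details})")
    return next(iter(seen.values()))

# Example with hand-written config fragments (only the relevant field is shown).
opt_cfg = {"architectures": ["OPTForCausalLM"]}
mistral_cfg = {"architectures": ["MistralForCausalLM"]}

try:
    check_same_architecture({"opt": opt_cfg, "mistral": mistral_cfg})
    mixed_error = None
except ValueError as exc:
    mixed_error = str(exc)
```

In practice the dictionaries would be loaded with `json.load` from each model's `config.json`. Two OPT checkpoints pass this check, although the traceback above shows that mergekit still rejects `OPTForCausalLM` at the time of this issue.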