arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0

RuntimeError: Need to specify cache dir to merge adapters #442

Closed by Zolilio 1 month ago

Zolilio commented 1 month ago

I tried merging multiple LoRA adapters together. Here's the YAML config I use:

models:
  - model: mistralai/Mistral-Nemo-Instruct-2407+/home/xxxx/AI/LLM/models/LORA/Gutenberg-Doppel_1024/
    parameters:
      weight: 0.3
      density: 0.9
  - model: mistralai/Mistral-Nemo-Instruct-2407+/home/xxxx/AI/LLM/models/LORA/lumamaid_1024/
    parameters:
      weight: 0.25
      density: 0.9
  - model: mistralai/Mistral-Nemo-Instruct-2407+/home/xxxx/AI/LLM/models/LORA/MN-12B-Lyra-v4_1024/
    parameters:
      weight: 0.3
      density: 0.9
  - model: mistralai/Mistral-Nemo-Instruct-2407+/home/xxxx/AI/LLM/models/LORA/Nemo-abliterated_1024/
    parameters:
      weight: 0.1
      density: 0.7
  - model: mistralai/Mistral-Nemo-Instruct-2407+/home/xxxx/AI/LLM/models/LORA/Rocinante_1024/
    parameters:
      weight: 0.2
      density: 0.9
merge_method: della_linear
base_model: /home/xxxx/.cache/huggingface/hub/models--mistralai--Mistral-Nemo-Instruct-2407/snapshots/e17a136e1dcba9c63ad771f2c85c1c312c563e6b
parameters:
  epsilon: 0.05
  lambda: 1
dtype: bfloat16

The problem is that when I run mergekit-yaml with this config, the terminal outputs:

Traceback (most recent call last):
  File "/home/xxxx/AI/LLM/mergekit/venv/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/xxxx/AI/LLM/mergekit/venv/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxxx/AI/LLM/mergekit/venv/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/xxxx/AI/LLM/mergekit/venv/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxxx/AI/LLM/mergekit/venv/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxxx/AI/LLM/mergekit/mergekit/options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "/home/xxxx/AI/LLM/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/home/xxxx/AI/LLM/mergekit/mergekit/merge.py", line 78, in run_merge
    loader_cache.get(model)
  File "/home/xxxx/AI/LLM/mergekit/mergekit/io/tasks.py", line 32, in get
    merged = model.merged(
             ^^^^^^^^^^^^^
  File "/home/xxxx/AI/LLM/mergekit/mergekit/common.py", line 93, in merged
    raise RuntimeError("Need to specify cache dir to merge adapters")
RuntimeError: Need to specify cache dir to merge adapters

Where do I have to specify my cache dir? And why do I have to specify it?

cg123 commented 1 month ago

Pass the argument --lora-merge-cache [PATH ON DISK]. This is where mergekit will store models with LoRA adapters merged.
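
As a rough illustration, here is a small Python sketch that assembles such an invocation. Only `mergekit-yaml` and `--lora-merge-cache` come from this thread. The helper names `build_merge_command` and `run_merge` and the paths `config.yml`, `./merged-output`, and `./lora-merge-cache` are placeholders invented for the example, and it assumes the usual `mergekit-yaml CONFIG OUTPUT_DIR` argument order.

```python
import shlex
import subprocess
from pathlib import Path


def build_merge_command(config_path, output_dir, lora_cache_dir):
    """Return the argv list for a mergekit-yaml run that sets a LoRA merge cache.

    Hypothetical helper: the argument order (config, then output directory)
    is an assumption about the mergekit-yaml CLI.
    """
    return [
        "mergekit-yaml",
        str(config_path),
        str(output_dir),
        "--lora-merge-cache",
        str(lora_cache_dir),
    ]


def run_merge(config_path, output_dir, lora_cache_dir):
    """Create the cache directory, then run the merge (hypothetical helper)."""
    # Make sure the cache directory exists before mergekit writes to it.
    Path(lora_cache_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_merge_command(config_path, output_dir, lora_cache_dir)
    # Print the equivalent shell command so it can be copied and rerun by hand.
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Calling `run_merge("config.yml", "./merged-output", "./lora-merge-cache")` would then be equivalent to typing `mergekit-yaml config.yml ./merged-output --lora-merge-cache ./lora-merge-cache` in a shell. Pick a cache location with enough free space, since each base model with an adapter merged in is stored there.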