arcee-ai / mergekit

Tools for merging pretrained large language models.
GNU Lesser General Public License v3.0

Why must all inputs to a slice contain the same number of layers? #354

Closed Mihaiii closed 1 week ago

Mihaiii commented 1 week ago

I was doing some passthrough magic and got hit by this error.

I must admit I'm a little confused by it, and it looks like a regression to me. I used passthrough to eliminate layers in February this year, before the PR that contained that assert line was merged, and everything was fine. I'm also trying to understand the code better.

Why must all inputs to a slice contain the same number of layers, and did it indeed work without this restriction before (i.e., is the code more rigid now)?

Update: here is the config I used:

slices:
  - sources:
    - model: Mihaiii/Venusaur
      layer_range: [0, 1]
    - model: Mihaiii/Venusaur
      layer_range: [0, 2]
    - model: Mihaiii/Venusaur
      layer_range: [1, 2]
merge_method: passthrough
dtype: float32
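The config above puts all three inputs into a single `sources` block, so they are treated as inputs to one slice and must each cover the same number of layers. Here is a minimal sketch of that invariant (illustrative only, not mergekit's actual code; the function name is made up):

```python
def same_layer_count(sources):
    """True iff every input in a slice spans the same number of layers."""
    lengths = {hi - lo for lo, hi in (s["layer_range"] for s in sources)}
    return len(lengths) <= 1

# The three inputs from the config above span 1, 2, and 1 layers.
bad_slice = [
    {"model": "Mihaiii/Venusaur", "layer_range": [0, 1]},
    {"model": "Mihaiii/Venusaur", "layer_range": [0, 2]},
    {"model": "Mihaiii/Venusaur", "layer_range": [1, 2]},
]
print(same_layer_count(bad_slice))  # → False, so the config is rejected
```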
Mihaiii commented 1 week ago

Nevermind, my config was wrong. Here is the correct one:

slices:
  - sources:
    - model: Mihaiii/Venusaur
      layer_range: [0, 1]
  - sources:
    - model: Mihaiii/Venusaur
      layer_range: [0, 2]
  - sources:
    - model: Mihaiii/Venusaur
      layer_range: [1, 2]
merge_method: passthrough
dtype: float32
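With each `layer_range` in its own slice, passthrough simply stacks the slices in order, so no same-length restriction applies across them. A quick sketch of the resulting layer stack (an assumption about the intent of this config, not mergekit's output code):

```python
# One layer_range per slice, taken from the corrected config above.
slices = [[0, 1], [0, 2], [1, 2]]

# Passthrough concatenates the slices, so layers 0 and 1 each appear twice.
stacked = [layer for lo, hi in slices for layer in range(lo, hi)]
print(stacked)  # → [0, 0, 1, 1]
```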