arcee-ai / arcee-python

The Arcee client for executing domain-adapted language model routines https://pypi.org/project/arcee-py/
https://www.arcee.ai

Merge wiring #41

Closed Jacobsolawetz closed 5 months ago

Jacobsolawetz commented 5 months ago
%%writefile -a newmerge.yaml
slices:
  - sources:
      - model: psmathur/orca_mini_v3_13b
        layer_range: [0, 40]
      - model: garage-bAInd/Platypus2-13B
        layer_range: [0, 40]
merge_method: slerp
base_model: psmathur/orca_mini_v3_13b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16

arcee.mergekit_yaml("new_merge3", "newmerge.yaml")
arcee.mergekit_evolve(
    merging_name="dummy_merging16",
    arcee_pretrained_models=["sliced-slack-2"],
    arcee_aligned_models=[],
    arcee_merged_models=[],
    hf_models=["arcee-ai/mistral-sliced"],
    arcee_eval_qa_set_names_and_weights=[{"tax_questions": 0.5}, {"tax_questions": 1}],
    general_evals_and_weights=[{"agieval_lsat_ar": 1}],
    base_model="arcee-ai/mistral-sliced",
    merge_method="ties",
    time_budget_secs=864000,
    wandb_key="<YOUR_WANDB_API_KEY>",
    target_compute="p5.48xlarge",
    capacity_id="cr-0190d9df55a87083b"
)
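
A note on the list-valued `t` entries in the YAML above: as I understand mergekit, a list such as `[0, 0.5, 0.3, 0.7, 1]` is treated as a gradient spread across layer depth rather than a single scalar. The sketch below shows one plausible way to expand such a list over the 40 layers in `layer_range`, assuming evenly spaced anchors and linear interpolation. `expand_gradient` is a hypothetical helper written for illustration, not a mergekit or arcee function, and the exact interpolation mergekit uses may differ.

```python
def expand_gradient(anchors, num_layers):
    """Linearly interpolate evenly spaced anchor values across num_layers layers.

    Hypothetical helper for illustration only; not part of mergekit or arcee.
    """
    if len(anchors) == 1 or num_layers == 1:
        return [float(anchors[0])] * num_layers
    last = len(anchors) - 1
    values = []
    for layer in range(num_layers):
        # Map the layer index onto the anchor axis [0, last].
        pos = layer / (num_layers - 1) * last
        lo = min(int(pos), last - 1)  # index of the anchor at or below pos
        frac = pos - lo               # fractional distance to the next anchor
        values.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return values


# Anchor lists copied from the YAML config above.
self_attn_t = expand_gradient([0, 0.5, 0.3, 0.7, 1], 40)
mlp_t = expand_gradient([1, 0.5, 0.7, 0.3, 0], 40)
```

Under these assumptions the two curves mirror each other: wherever the attention tensors lean toward one model, the MLP tensors lean toward the other by the same amount.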
Jacobsolawetz commented 5 months ago

would be the public api you could call with 100k pairs. It will chunk into 2k chunks and call:

@tleyden this sounds good, but I'm going to suggest it's out of scope for this PR
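
For reference, the chunking idea quoted above (100k pairs split into 2k chunks) could be sketched roughly as follows. The call made per chunk is not shown in the thread, so it is passed in here as a placeholder callable. `chunk_pairs`, `upload_in_chunks`, and `upload_chunk` are hypothetical names, not part of the arcee client.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

QAPair = Tuple[str, str]  # (question, answer); an assumed shape for a "pair"


def chunk_pairs(pairs: Sequence[QAPair], chunk_size: int = 2000) -> Iterator[List[QAPair]]:
    """Yield consecutive slices of at most chunk_size pairs."""
    for start in range(0, len(pairs), chunk_size):
        yield list(pairs[start:start + chunk_size])


def upload_in_chunks(
    pairs: Sequence[QAPair],
    upload_chunk: Callable[[List[QAPair]], None],
    chunk_size: int = 2000,
) -> int:
    """Call upload_chunk once per chunk and return the number of calls made."""
    calls = 0
    for chunk in chunk_pairs(pairs, chunk_size):
        upload_chunk(chunk)  # placeholder for whatever per-chunk call the API makes
        calls += 1
    return calls
```

With 100,000 pairs and the default chunk size, `upload_in_chunks` invokes the callback 50 times.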