Jacobsolawetz closed this 5 months ago
```yaml
%%writefile -a newmerge.yaml
slices:
  - sources:
      - model: psmathur/orca_mini_v3_13b
        layer_range: [0, 40]
      - model: garage-bAInd/Platypus2-13B
        layer_range: [0, 40]
merge_method: slerp
base_model: psmathur/orca_mini_v3_13b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```
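For context on what the `t` values in that config control: slerp interpolates between the two models' tensors spherically rather than linearly, with `t=0` keeping the base model's weights and `t=1` the other model's. Below is a minimal sketch of the standard slerp formula only, not mergekit's actual implementation (which handles tensor shapes, the per-filter `t` gradients, and more edge cases):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two vectors.

    Standard formula only; real merge code also normalizes tensors and
    applies per-layer interpolation factors like those in the YAML above.
    """
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    dot = max(-1.0, min(1.0, dot))
    if 1.0 - abs(dot) < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    omega = math.acos(dot)
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```

With `t=0.5` on orthogonal unit vectors this lands halfway along the arc between them, which is the intuition behind the graded `value` lists: different layers blend the two parent models to different degrees.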
```python
arcee.mergekit_yaml("new_merge3", "newmerge.yaml")
```
```python
arcee.mergekit_evolve(
    merging_name="dummy_merging16",
    arcee_pretrained_models=["sliced-slack-2"],
    arcee_aligned_models=[],
    arcee_merged_models=[],
    hf_models=["arcee-ai/mistral-sliced"],
    arcee_eval_qa_set_names_and_weights=[{"tax_questions": 0.5}, {"tax_questions": 1}],
    general_evals_and_weights=[{"agieval_lsat_ar": 1}],
    base_model="arcee-ai/mistral-sliced",
    merge_method="ties",
    time_budget_secs=864000,
    wandb_key="cacf018fbe82e6e1a6f3174064f1c71a810aae5c",
    target_compute="p5.48xlarge",
    capacity_id="cr-0190d9df55a87083b",
)
```
This would be the public API you could call with 100k pairs. It will split them into 2k-pair chunks and call:
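The chunking behavior described above can be sketched roughly as follows. This is an illustrative outline only; the function and parameter names (`submit_all`, `qa_pairs`, `_submit_chunk`) are hypothetical stand-ins, not the actual internal API:

```python
def chunked(seq, size):
    """Yield successive fixed-size slices of seq (last one may be shorter)."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def submit_all(qa_pairs, chunk_size=2000, _submit_chunk=None):
    """Split qa_pairs into chunks of chunk_size and submit each one.

    _submit_chunk is a hypothetical callback standing in for the internal
    per-chunk call the public API would make.
    """
    calls = 0
    for chunk in chunked(qa_pairs, chunk_size):
        if _submit_chunk is not None:
            _submit_chunk(chunk)
        calls += 1
    return calls
```

For 100k pairs and a 2k chunk size this results in 50 internal calls, which is the batching the public API would hide from the caller.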
@tleyden this sounds good, but I'm going to suggest it's out of scope for this PR.