yule-BUAA / MergeLM

Codebase for Merging Language Models (ICML 2024)

Is there a merged model available for download? #6

Closed · kexul closed this issue 9 months ago

kexul commented 9 months ago

Hi, thanks for the great work! Is there a merged model available on Hugging Face?

yule-BUAA commented 9 months ago

Hello,

Thanks for your interest in our work!

Could you please tell me which merged models you want to download? I can upload them to Hugging Face accordingly.

kexul commented 9 months ago

Maybe the WizardLM series?

Personally, I'd like to have a model with WizardLM and WizardCoder merged. Maybe we could call on the legendary TheBloke to quantize it then.

Many thanks!

yule-BUAA commented 9 months ago

Hi,

I have tried to upload the checkpoints to Hugging Face, but it failed many times due to network connection issues. (XoX)

Could you please run the following command to obtain the checkpoint that you want?

python merge_llms_instruct_math_code.py --merge_instruct --merge_code --merging_method_name mask_merging --use_weight_rescale --weight_mask_rate 0.3 --mask_apply_method task_arithmetic --scaling_coefficient 1.0 --tensor_parallel_size 1

The above command runs on CPUs only and requires about 470GB of memory. Note that if you want to save the checkpoint, please comment out this line, since our code automatically deletes the checkpoint after evaluation.
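For anyone who does keep the merged checkpoint on disk, a minimal usage sketch with the Hugging Face transformers library is shown below; the local directory name is a placeholder, not the path the script actually writes to.

```python
# Minimal sketch (not part of the repo): load a merged checkpoint that was kept
# on disk after commenting out the auto-delete line. The directory name below is
# a placeholder; point it at wherever the script saved the merged model.
from transformers import AutoModelForCausalLM, AutoTokenizer

merged_dir = "./saved_merged_model"  # hypothetical local path

tokenizer = AutoTokenizer.from_pretrained(merged_dir)
model = AutoModelForCausalLM.from_pretrained(merged_dir, torch_dtype="auto", device_map="auto")

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```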

Moreover, since existing model merging methods assume that the models to be merged are fine-tuned from the same pre-trained backbone, the code model we merge is llama-2-13b-code-alpaca instead of WizardCoder-Python-13B, because WizardCoder-Python-13B is fine-tuned from Code Llama rather than Llama 2.
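As a side note, a rough compatibility check along these lines can be scripted with transformers' AutoConfig. The sketch below uses placeholder model paths, and a matching config is only a necessary condition: the merging methods also require the models to share the same pre-trained weights, which a config comparison alone cannot verify.

```python
# Illustrative sketch: a quick compatibility check before merging two fine-tuned
# models. Matching configs are necessary but not sufficient; the methods here
# additionally assume both models were fine-tuned from the same pre-trained
# weights (e.g. Llama 2), which a config comparison alone cannot confirm.
from transformers import AutoConfig

def same_architecture(model_a: str, model_b: str) -> bool:
    cfg_a = AutoConfig.from_pretrained(model_a)
    cfg_b = AutoConfig.from_pretrained(model_b)
    keys = ("model_type", "hidden_size", "num_hidden_layers",
            "num_attention_heads", "intermediate_size", "vocab_size")
    return all(getattr(cfg_a, k, None) == getattr(cfg_b, k, None) for k in keys)

# The paths are placeholders, not the project's actual checkpoints.
print(same_architecture("path/to/instruct_model", "path/to/code_model"))
```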

Please feel free to ask if there are any further questions.

kexul commented 9 months ago

> 470GB of memory space

Sorry, that's not what I can afford as an end user! 😭

yule-BUAA commented 9 months ago

OK. I am now uploading the checkpoint to Baidu Wangpan. I will share the link once the upload is complete.

ramkumarkoppu commented 9 months ago

Is this 470GB disk space or RAM?

yule-BUAA commented 9 months ago

> 470GB of memory space Sorry, that's not what I can afford as an end user! 😭

Hi, I have uploaded the checkpoints to Baidu Wangpan. Note that we store separate merged checkpoints for the instruction-following and code-generating models because their tokenizer configurations differ, but their parameters are exactly identical.

The merged checkpoint for the instruction-following task: Link: https://pan.baidu.com/s/1thtOAGeHlCOZSFvcXgl6hQ Extraction code: zykq

The merged checkpoint for the code-generating task: Link: https://pan.baidu.com/s/1mkC3GobfqUbKXqTvY1QCzw Extraction code: ccu0

I hope this will help address your issue. ^_^
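If anyone wants to confirm locally that the two checkpoints really share identical parameters, one possible sketch is below; the directory names are placeholders, and loading both 13B models at once needs a machine with plenty of RAM.

```python
# Sketch with placeholder paths: verify that the two downloaded checkpoints
# contain identical parameters and differ only in their tokenizer files.
# Loading both 13B models at once is itself memory-hungry (tens of GB).
import torch
from transformers import AutoModelForCausalLM

instruct_dir = "path/to/merged_instruct_checkpoint"  # hypothetical local path
code_dir = "path/to/merged_code_checkpoint"          # hypothetical local path

model_a = AutoModelForCausalLM.from_pretrained(instruct_dir, torch_dtype=torch.float16)
model_b = AutoModelForCausalLM.from_pretrained(code_dir, torch_dtype=torch.float16)

params_b = dict(model_b.named_parameters())
identical = all(torch.equal(p, params_b[name]) for name, p in model_a.named_parameters())
print("Parameters identical:", identical)
```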

yule-BUAA commented 9 months ago

> Is this 470GB disk space or RAM?

It uses 470GB of RAM.

The disk space required is about the same as what the pre-trained backbone takes.
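For a rough sense of scale (my own back-of-the-envelope arithmetic, not a figure from the code): a 13B-parameter model held in float32 already takes close to 50 GB, and merging keeps the pre-trained backbone, the fine-tuned models, and intermediate copies in memory at the same time, which is how the footprint climbs into the hundreds of GB.

```python
# Back-of-the-envelope estimate (assumption, not measured from the code):
# memory for one 13B-parameter model held in float32.
params = 13e9
bytes_per_param = 4  # float32
per_model_gb = params * bytes_per_param / 1024**3
print(f"~{per_model_gb:.0f} GB per 13B model in float32")  # ~48 GB
```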

yule-BUAA commented 9 months ago

Hi, guys.

Closing this issue now.

Please feel free to reopen it when there are any further questions.