nlpxucan / WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

finetune on WizardCoder-15B-V1.0 #79

Open iawen opened 1 year ago

iawen commented 1 year ago

Hello, I'm going to fine-tune WizardCoder-15B-V1.0. Do I need this code:

if "starcoder" in model_args.model_name_or_path:
        tokenizer.add_special_tokens(
            {
                "eos_token": DEFAULT_EOS_TOKEN,
                "bos_token": DEFAULT_BOS_TOKEN,
                "unk_token": DEFAULT_UNK_TOKEN,
                "pad_token": DEFAULT_PAD_TOKEN,
            }
        )
ChiYeungLaw commented 1 year ago

If you load the checkpoint and tokenizer from WizardLM/WizardCoder-15B-V1.0 on Hugging Face, I think you can delete this code.
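For reference, a minimal sketch of loading that released checkpoint and tokenizer with transformers (the repo id comes from this thread; everything else is illustrative, not the repo's training script):

from transformers import AutoModelForCausalLM, AutoTokenizer

# The published tokenizer already ships with its special tokens set,
# so no add_special_tokens() call is needed for this checkpoint.
tokenizer = AutoTokenizer.from_pretrained("WizardLM/WizardCoder-15B-V1.0")
model = AutoModelForCausalLM.from_pretrained("WizardLM/WizardCoder-15B-V1.0")

print(tokenizer.eos_token, tokenizer.bos_token, tokenizer.pad_token)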

iawen commented 1 year ago

> If you load the checkpoint and tokenizer from WizardLM/WizardCoder-15B-V1.0 on Hugging Face, I think you can delete this code.

thx!

What hardware configuration is required to continue fine-tuning WizardLM/WizardCoder-15B-V1.0? Also, fine-tuning uses the default configuration file, deepspeed_config.json, but it throws an error: Not found scheduler!

I added the following scheduler section, based on the StarCoder arguments:

{
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": "auto",
      "warmup_max_lr": "auto",
      "warmup_num_steps": "auto"
    }
  }
}

Is this correct?

ChiYeungLaw commented 1 year ago

We ran into the same error before. Installing deepspeed==0.9.2 and transformers==4.29.2 fixed it. We train our models on 8 V100-32GB GPUs.
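For anyone hitting the same "Not found scheduler!" error, pinning those versions is just:

pip install deepspeed==0.9.2 transformers==4.29.2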

uloveqian2021 commented 1 year ago

> We ran into the same error before. Installing deepspeed==0.9.2 and transformers==4.29.2 fixed it. We train our models on 8 V100-32GB GPUs.

Can all weights be fine-tuned (i.e., full fine-tuning) with 8 V100-32GB GPUs?

iawen commented 1 year ago

How long did it take you to train? Was it also based on WizardLM/WizardCoder-15B-V1.0?

noobmldude commented 1 year ago

Can the models fit on 4 x V100 16GB GPUs?
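A back-of-envelope estimate suggests not, at least for full fine-tuning. Assuming the usual mixed-precision Adam rule of thumb (these byte counts are an assumption, not measured numbers):

# fp16 weights (2) + fp16 grads (2) + fp32 master weights, momentum,
# and variance (12) = 16 bytes per parameter, ignoring activations
params = 15e9
print(f"{params * 16 / 2**30:.0f} GiB")  # ~224 GiB, vs. 4 x 16 GB = 64 GB total

So without aggressive ZeRO-3 CPU offloading or a parameter-efficient method like LoRA, 4 x V100-16GB will not hold full fine-tuning of a 15B model.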

Nicolas-Thomazo commented 1 year ago

> We ran into the same error before. Installing deepspeed==0.9.2 and transformers==4.29.2 fixed it. We train our models on 8 V100-32GB GPUs.

How many epochs did you train for, and how long did the training take?

roke22 commented 10 months ago

Can someone post a detailed list of instructions for fine-tuning the model, or link to a guide?

Thank you

matt-sharp commented 9 months ago

> Can someone post a detailed list of instructions for fine-tuning the model, or link to a guide?
>
> Thank you

Yes, can someone please give some guidance on how to fine-tune?
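Until someone posts a proper walkthrough, here is a minimal sketch of a supervised fine-tuning setup with the Hugging Face Trainer plus the DeepSpeed config discussed above (all names and hyperparameters below are assumptions for illustration, not the repo's actual script):

from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_id = "WizardLM/WizardCoder-15B-V1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

args = TrainingArguments(
    output_dir="wizardcoder-ft",        # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    fp16=True,
    deepspeed="deepspeed_config.json",  # the config from this thread
)

# train_dataset: your tokenized instruction/response pairs
# (data preparation omitted from this sketch)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()

Launching the script with the deepspeed command (rather than plain python) spreads training across all available GPUs and picks up the DeepSpeed config.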