nlpxucan / WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Fine-tuned from WizardLM/WizardCoder-15B-V1.0, but no effect #92

Open iawen opened 1 year ago

iawen commented 1 year ago

Hi, I fine-tuned from WizardLM/WizardCoder-15B-V1.0 on an 8*V100 32G machine for 22 hours, then tested with checkpoint-1600.

But the results are very unsatisfactory; the model seems to do no reasoning at all. Could you help me figure out what the problem is?

Below is an instruction that describes a task, paired with an input that provides further context. 
Write a response that appropriately completes the request.

### Instruction:
Generate unit test code only for the public methods, with 100% code coverage required.

### Input:
public class TransactionGlobalServiceImpl implements TransactionGlobalService {
    @Autowired
    private TransactionGlobalMapper transactionGlobalMapper;

    @Override
    public TransactionGlobal queryTransactionGlobal(String bPartnerId) {
        QueryWrapper<TransactionGlobal> wrapper = new QueryWrapper<>();
        wrapper.in("bpartner_id", bPartnerId);
        wrapper.in("is_deleted", 0);
        wrapper.orderByAsc("create_date");
        List<TransactionGlobal> transactionGlobals = transactionGlobalMapper.selectList(wrapper);
        if (!CollectionUtils.isEmpty(transactionGlobals)) {
            return transactionGlobals.get(0);
        } else {
            return null;
        }
    }
}

### Response:<|endoftext|>
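For reference, here is a minimal sketch of how an Alpaca-style prompt like the one above is typically assembled. The build_prompt helper is hypothetical and the template text is taken from the example shown here, not read from train_wizardcoder.py, so check that it matches the template actually used in training; a mismatch between the training-time and inference-time templates can make a fine-tuned model look as if it learned nothing.

# Hypothetical helper: rebuild the Alpaca-style prompt shown above.
# The template text is assumed from the example, not taken from train_wizardcoder.py.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:"
)

def build_prompt(instruction: str, input_text: str = "") -> str:
    # The model's answer is expected to follow "### Response:" and end with <|endoftext|>.
    return PROMPT_TEMPLATE.format(instruction=instruction, input=input_text)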

The training script I used is as follows:

deepspeed train_wizardcoder.py \
    --model_name_or_path "/data/models/WizardLM/WizardCoder-15B-V1.0" \
    --data_path "/data/datasets/java_wizard" \
    --output_dir "/data/models/wizard_java_from_starchat" \
    --num_train_epochs 3 \
    --model_max_length 2048 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 2 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 50 \
    --save_total_limit 2 \
    --learning_rate 2e-5 \
    --warmup_steps 30 \
    --logging_steps 2 \
    --lr_scheduler_type "cosine" \
    --gradient_checkpointing True \
    --deepspeed configs/deepspeed_config.json \
    --fp16 True 2>&1 | tee /data/logs/deep.log
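One thing worth ruling out is a mismatch between how the checkpoint is saved and how it is loaded at test time. Below is a rough generation sketch with transformers; the checkpoint path is a placeholder, and it assumes checkpoint-1600 is in Hugging Face format (a DeepSpeed ZeRO checkpoint may first need to be converted with the zero_to_fp32.py script that DeepSpeed writes into the checkpoint directory).

# Rough sanity check of a saved checkpoint; the path is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "/data/models/wizard_java_from_starchat/checkpoint-1600"
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(
    ckpt, torch_dtype=torch.float16, device_map="auto"
)

# Use exactly the same template as in training (see the build_prompt sketch above).
prompt = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\nWrite a Java method that reverses a string.\n\n"
    "### Input:\n\n\n"
    "### Response:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))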

The training dataset looks like this: [screenshot of sample records]

Neither the input nor the output contains any line breaks. Could that be hurting the fine-tuning results?
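A quick way to check is to print a few raw records and see whether real newline characters are present or were stripped/escaped during preprocessing. The sketch below assumes the dataset is an Alpaca-style JSON file with instruction/input/output fields; the file name is a placeholder.

# Inspect a few records; repr() makes missing or escaped newlines visible.
import json

with open("/data/datasets/java_wizard/train.json") as f:  # placeholder file name
    records = json.load(f)

for rec in records[:3]:
    print(repr(rec.get("instruction", ""))[:200])
    print(repr(rec.get("input", ""))[:200])
    print(repr(rec.get("output", ""))[:200])
    print("-" * 40)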

ChiYeungLaw commented 1 year ago

This is a strange problem. I think you can do a small experiment: fine-tune the model with Code Alpaca and check whether the same problem exists. Maybe just fine-tune for 1 epoch with a 512 sequence length.
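If it helps, here is a rough sketch of exporting Code Alpaca into an Alpaca-style JSON file for that small experiment. The Hugging Face dataset id sahil2801/CodeAlpaca-20k and the output path are assumptions, and the field names should be checked against what train_wizardcoder.py expects.

# Rough sketch: dump Code Alpaca to an Alpaca-style JSON file.
# Dataset id, output path, and field names are assumptions.
import json
from datasets import load_dataset

ds = load_dataset("sahil2801/CodeAlpaca-20k", split="train")
records = [
    {"instruction": ex["instruction"], "input": ex.get("input", ""), "output": ex["output"]}
    for ex in ds
]
with open("/data/datasets/code_alpaca.json", "w") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)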

chen-lee-li commented 1 year ago

> This is a strange problem. I think you can do a small experiment: fine-tune the model with Code Alpaca and check whether the same problem exists. Maybe just fine-tune for 1 epoch with a 512 sequence length.

How long would it take to fine-tune on an 8*A100 40G machine? I have 78,000 rows of data.