545999961 opened this issue 7 months ago
Thank you so much for your open source contributions!

I used deepseek-ai/deepseek-coder-6.7b-instruct and fine-tuned it on BIRD-SQL and Spider. The evaluation results of the fine-tuned model on the BIRD dev set differ from your results on the BIRD leaderboard.

I would like to ask: what are the possible reasons the results in the paper cannot be reproduced?
I think it's probably due to inadequate training. How many epochs did you train for? @dshwei
I set 4 epochs.
How many epochs should be set?
My training configuration:

```python
# Training hyperparameters
lora_r = 64
lora_alpha = 64
lora_dropout = 0.1
output_dir = "./bird_sql_gen"
num_train_epochs = 6
bf16 = True
overwrite_output_dir = True
per_device_train_batch_size = 2
per_device_eval_batch_size = 2
gradient_accumulation_steps = 16
gradient_checkpointing = True
evaluation_strategy = "steps"
learning_rate = 5e-5
weight_decay = 0.01
lr_scheduler_type = "cosine"
warmup_ratio = 0.01
max_grad_norm = 0.3
group_by_length = True
auto_find_batch_size = False
save_steps = 50
logging_steps = 50
load_best_model_at_end = False
packing = False
save_total_limit = 3
neftune_noise_alpha = 5
report_to = "wandb"
max_seq_length = 2100  # set based on the maximum query length

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    target_modules=[
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    task_type=TaskType.CAUSAL_LM,
)
```
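For context, here is a minimal sketch of how hyperparameters like these could be wired into a TRL `SFTTrainer` run. This is an assumption about the training harness, not the repo's actual code; exact keyword names vary across `trl`/`transformers` versions, and `train_dataset` is assumed to be prepared elsewhere.

```python
# A sketch only: assumes TRL's SFTTrainer and a preprocessed train_dataset.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_name = "deepseek-ai/deepseek-coder-6.7b-instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

training_args = TrainingArguments(
    output_dir=output_dir,
    overwrite_output_dir=overwrite_output_dir,
    num_train_epochs=num_train_epochs,
    bf16=bf16,
    per_device_train_batch_size=per_device_train_batch_size,
    per_device_eval_batch_size=per_device_eval_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    gradient_checkpointing=gradient_checkpointing,
    evaluation_strategy=evaluation_strategy,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    lr_scheduler_type=lr_scheduler_type,
    warmup_ratio=warmup_ratio,
    max_grad_norm=max_grad_norm,
    group_by_length=group_by_length,
    save_steps=save_steps,
    logging_steps=logging_steps,
    save_total_limit=save_total_limit,
    neftune_noise_alpha=neftune_noise_alpha,  # NEFTune noise via transformers
    report_to=report_to,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # assumed: tokenized/formatted SQL dataset
    tokenizer=tokenizer,
    peft_config=peft_config,      # the LoraConfig shown above
    max_seq_length=max_seq_length,
    packing=packing,
)
trainer.train()
```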
There doesn't seem to be a configuration problem in your code. You can contact me by email.
OK, is taroballscai@hotmail.com your email address?
What are the results of DTS-SQL Mistral and DTS-SQL DeepSeek on the BIRD dataset?