MohammadrezaPourreza / DTS-SQL

This repository contains all the code for the DTS-SQL paper
Apache License 2.0
35 stars, 7 forks

Performance of DTS-SQL on BIRD #5

Open 545999961 opened 5 months ago

545999961 commented 5 months ago

How do DTS-SQL Mistral and DTS-SQL DeepSeek perform on the BIRD dataset?

dshwei commented 5 months ago

Thank you so much for your open-source contributions.

I fine-tuned deepseek-ai/deepseek-coder-6.7b-instruct on BIRD-SQL and Spider. The fine-tuned model's results on the BIRD dev set are as follows: [image]

These differ from your results on the BIRD leaderboard: [image]

What are the possible reasons why the results in the paper cannot be reproduced?
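When comparing numbers like these, it helps to confirm that both sides compute the metric the same way. The following is a minimal sketch of an execution-accuracy check, assuming predictions count as correct when they return the same set of rows as the gold query. It is not the official BIRD evaluation script; the function names, the toy `singer` table, and the example queries are all hypothetical.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same set of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Set comparison ignores row order and duplicate rows.
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    if not pairs:
        return 0.0
    matches = sum(execution_match(conn, p, g) for p, g in pairs)
    return matches / len(pairs)


# Toy in-memory database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?, ?)",
    [(1, "Ann", 30), (2, "Bob", 45), (3, "Cy", 52)],
)

pairs = [
    # Different SQL text, same rows: counts as correct.
    ("SELECT name FROM singer WHERE age > 40",
     "SELECT name FROM singer WHERE age >= 45"),
    # Both run, but the results differ: counts as wrong.
    ("SELECT count(*) FROM singer",
     "SELECT count(*) FROM singer WHERE age > 40"),
    # Misspelled column raises an error: counts as wrong.
    ("SELECT nme FROM singer", "SELECT name FROM singer"),
]

accuracy = execution_accuracy(conn, pairs)
print(f"execution accuracy: {accuracy:.3f}")
```

Differences in how failed queries, row order, or duplicates are handled can shift the reported score, so it is worth checking these details before attributing a gap to training.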

kanseaveg commented 5 months ago

I think it's probably due to inadequate training. How many epochs have you trained? @dshwei

dshwei commented 5 months ago

> I think it's probably due to inadequate training. How many epochs have you trained? @dshwei

I set it to 4 epochs.

dshwei commented 5 months ago

> I think it's probably due to inadequate training. How many epochs have you trained? @dshwei

How many epochs should I set?

dshwei commented 5 months ago

Training attributes:

lora_r = 64
lora_alpha = 64
lora_dropout = 0.1
output_dir = "./bird_sql_gen"
num_train_epochs = 6
bf16 = True
overwrite_output_dir = True
per_device_train_batch_size = 2
per_device_eval_batch_size = 2
gradient_accumulation_steps = 16
gradient_checkpointing = True
evaluation_strategy = "steps"
learning_rate = 5e-5
weight_decay = 0.01
lr_scheduler_type = "cosine"
warmup_ratio = 0.01
max_grad_norm = 0.3
group_by_length = True
auto_find_batch_size = False
save_steps = 50
logging_steps = 50
load_best_model_at_end = False
packing = False
save_total_limit = 3
neftune_noise_alpha = 5
report_to = "wandb"
max_seq_length = 2100  # set based on the maximum number of queries

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    target_modules=[
        "q_proj",
        "v_proj",
        "k_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
    task_type=TaskType.CAUSAL_LM,
)
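To judge whether training is adequate, it can help to translate these settings into an effective batch size and a rough count of optimizer steps. The sketch below does that arithmetic. The training-set size of 9,000 examples and the single GPU are placeholder assumptions, not values from this thread, and the step count is an approximation that may differ slightly from what the trainer reports.

```python
import math

# Values taken from the configuration posted above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 16
num_train_epochs = 6
warmup_ratio = 0.01
save_steps = 50

# Assumptions for illustration only (not stated in the thread).
num_devices = 1        # hypothetical: single GPU
num_examples = 9000    # hypothetical training-set size


def effective_batch_size(per_device, accumulation, devices):
    """Examples consumed per optimizer update."""
    return per_device * accumulation * devices


def approx_total_steps(examples, batch, epochs):
    """Approximate optimizer updates over the whole run."""
    return math.ceil(examples / batch) * epochs


batch = effective_batch_size(
    per_device_train_batch_size, gradient_accumulation_steps, num_devices
)
total_steps = approx_total_steps(num_examples, batch, num_train_epochs)
warmup_steps = math.ceil(total_steps * warmup_ratio)
checkpoints_written = total_steps // save_steps

print(f"effective batch size: {batch}")
print(f"approx. optimizer steps: {total_steps}")
print(f"approx. warmup steps: {warmup_steps}")
print(f"checkpoints written (before pruning): {checkpoints_written}")
```

Under these assumptions each update sees 32 examples, so changing the number of GPUs or the accumulation steps changes the effective batch size and the step count even when the epoch count stays the same.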

kanseaveg commented 4 months ago

There doesn't seem to be a configuration problem in your code. You can contact me by email.

dshwei commented 4 months ago

> There doesn't seem to be a configuration problem in your code. You can contact me by email.

OK. Is this your email address?