Reproducing defect results

Dear CodeT5 team,

Thank you for sharing your dataset and models with us. I have reproduced the defect experiment with a batch size of 8 on four GPUs, so that the batch size is 32. In exp_with_args.sh, I set CUDA_VISIBLE_DEVICES=0,1,2,3.
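To make the batch size arithmetic explicit, here is a minimal sketch. I have not confirmed how the training script interprets its batch size argument, so both readings are shown. With `torch.nn.DataParallel` the batch passed to the model is split across the visible GPUs, whereas with `DistributedDataParallel` each process usually loads its own batch. The function name and the `per_device` flag are hypothetical and not taken from the repository.

```python
def effective_batch_size(batch_size: int, num_gpus: int, per_device: bool) -> int:
    """Return the number of examples contributing to one optimizer step.

    per_device=True  : each GPU gets its own batch of `batch_size` examples
                       (the usual DistributedDataParallel setup).
    per_device=False : one batch of `batch_size` examples is split across GPUs
                       (how torch.nn.DataParallel scatters its input).
    Gradient accumulation is ignored in this sketch.
    """
    if batch_size <= 0 or num_gpus <= 0:
        raise ValueError("batch_size and num_gpus must be positive")
    return batch_size * num_gpus if per_device else batch_size


# My setup: batch size 8, four GPUs.
print(effective_batch_size(8, 4, per_device=True))   # per-GPU reading
print(effective_batch_size(8, 4, per_device=False))  # split-batch reading
```

If the second reading applies, my effective batch size would be 8 rather than 32, which could differ from the setting used in the paper.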

In your paper, you report an accuracy of 65.78%, but when I reproduced the same experiment I got 64.09%. I am not sure what is wrong with my setup, and I would appreciate any help in reproducing your results.

Training:
[0] Best acc changed into 0.6175
[1] Best acc changed into 0.6482
[2] Best acc changed into 0.6552
[3] Best acc changed into 0.6654
[6] Early stop as not_acc_inc_cnt=3

[best-acc] test-acc: 0.6409
[best-acc] test-acc: 0.6409

Testing:
accuracy_score 64.0922401171303
precision_score 67.08229426433915
recall_score 42.86852589641435
f1_score 52.309188138065146
[[1213 264]
 [ 717 538]]
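As a sanity check, the four scores can be recomputed from the confusion matrix alone. This is a minimal sketch, assuming the matrix follows the scikit-learn layout (rows are true labels, columns are predicted labels) and that label 1 is the positive class. The function name is my own and not part of the repository.

```python
def metrics_from_confusion(matrix):
    """Return accuracy, precision, recall and F1 (in percent) for a 2x2 matrix.

    Assumes rows are true labels and columns are predicted labels,
    with class 1 treated as the positive class.
    """
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    scores = {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }
    return {name: 100 * value for name, value in scores.items()}


scores = metrics_from_confusion([[1213, 264], [717, 538]])
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
# accuracy: 64.09
# precision: 67.08
# recall: 42.87
# f1: 52.31
```

These agree with the logged values, so the evaluation itself looks consistent. Under the same layout assumption, the model misses 717 of the 1255 positive test examples.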
