Cannot reproduce the IRNet+BERT results

microsoft / IRNet

An algorithm for cross-domain NL2SQL

MIT License

Cannot reproduce the IRNet+BERT results #33

Open saparina opened 4 years ago

saparina commented 4 years ago

Thank you for sharing the IRNet+BERT model. The code is very useful, but I cannot reproduce the results from the paper (61.9% dev acc).

Testing the pre-trained model downloaded from #2, I got Acc: 0.572957, Beam Acc: 0.598249 using your eval and 0.572 EM accuracy using the official evaluation script.

Testing the model trained from scratch (bert branch), I got similar results: Acc: 0.564202, Beam Acc: 0.600195 using your eval and 0.573 EM accuracy using the official script.

Do you get similar results from the eval.py script? What is the accuracy of the released model?
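For anyone comparing the two figures above, here is a minimal sketch of how "Acc" and "Beam Acc" could differ. It assumes that "Acc" scores only the top-ranked beam candidate and that "Beam Acc" counts an example as correct when any candidate in the beam matches the gold query. That reading is an assumption, not something confirmed by the repository's eval script, and the function name and plain string comparison are hypothetical stand-ins for the real matching logic.

```python
from typing import List, Tuple


def top1_and_beam_accuracy(beams: List[List[str]], golds: List[str]) -> Tuple[float, float]:
    """Return (top-1 accuracy, any-in-beam accuracy).

    `beams[i]` is the ranked candidate list for example i (hypothetical format);
    `golds[i]` is the reference query. Matching is plain string equality here,
    which is only a stand-in for a real exact-match comparison.
    """
    if len(beams) != len(golds):
        raise ValueError("beams and golds must have the same length")
    if not golds:
        return 0.0, 0.0

    # Correct only if the best-ranked candidate matches.
    top1_hits = sum(1 for cands, gold in zip(beams, golds) if cands and cands[0] == gold)
    # Correct if any candidate in the beam matches.
    beam_hits = sum(1 for cands, gold in zip(beams, golds) if gold in cands)

    n = len(golds)
    return top1_hits / n, beam_hits / n
```

Under this reading, beam accuracy is always at least as high as top-1 accuracy, which is consistent with the numbers reported above.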

brunnurs commented 4 years ago

I'm not part of the IRNet team, so I might be the wrong person to answer your questions :-) But I extended the IRNet model with a BERT encoder myself (I needed it before the IRNet team released their code) and I never got above 59% dev accuracy. I'm not sure how to gain the extra 2%.

slamandar commented 4 years ago

Hi, did you reproduce the results using the code in the bert branch, or did you write the code yourself? I tried the code in the bert branch provided by the authors, but a field named 'col_pred' is missing from the processed data files, and there are no hints about how it is used.

brunnurs commented 4 years ago

I implemented a BERT encoder myself (https://github.com/brunnurs/valuenet)

saparina commented 4 years ago

At first I reimplemented it myself (before the bert branch was released), but the result was worse than in the paper (about 57%). Then I downloaded the model and code released by the authors and got the results I mentioned.

Recently I found that the eval.sh script lacks the --column_pointer option. With this argument, the result was much better (~60-61%).
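A missing option like this is easy to overlook. If it is defined as a boolean switch (an assumption here, since the repository's actual argument definition is not shown in this thread), leaving it out of eval.sh raises no error and the script silently falls back to the default. A minimal argparse sketch, where the parser and its help text are hypothetical:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the flag name comes from this thread.
    parser = argparse.ArgumentParser(description="Illustrative eval entry point")
    parser.add_argument(
        "--column_pointer",
        action="store_true",  # assumption: boolean switch that defaults to False
        help="enable the column pointer during decoding (illustrative description)",
    )
    return parser


if __name__ == "__main__":
    # Without the flag the option is quietly False; with it, True.
    print(build_parser().parse_args([]).column_pointer)
    print(build_parser().parse_args(["--column_pointer"]).column_pointer)
```

If training used the flag and evaluation did not, the model would be decoded in a different configuration than it was trained in, which could explain a gap of a few points.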

@slamandar I used the already preprocessed files from the data folder. However, I am also confused about the 'col_pred' field because I don't understand where it comes from. I guess this field somehow restricts the column set, so it would be great to see the preprocessing scripts.
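To check whether a given processed file carries the field, something like the following sketch may help. It assumes the processed data is a JSON array of example objects; the helper names and the file path in the usage note are hypothetical.

```python
import json
from typing import Any, Dict, List


def missing_field_indices(examples: List[Dict[str, Any]], field: str = "col_pred") -> List[int]:
    """Return the indices of examples that do not contain `field`."""
    return [i for i, example in enumerate(examples) if field not in example]


def check_file(path: str, field: str = "col_pred") -> List[int]:
    """Load a JSON array of examples (assumed format) and report which ones lack `field`."""
    with open(path, encoding="utf-8") as f:
        examples = json.load(f)
    return missing_field_indices(examples, field)
```

For example, `check_file("data/dev.json")` (path is illustrative) returns an empty list when every example has 'col_pred', and the offending indices otherwise.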

saparina commented 4 years ago

@brunnurs did you reproduce the IRNet+BERT results in your version? The abstract of your paper sounds very interesting, I will read it :)