niklaswretblad / the-effects-of-noise-in-text-to-SQL

Code for the paper "Understanding the Effects of Noise in Text-to-SQL: An Examination of the BIRD-Bench Benchmark".
MIT License

Request to release the corrected data for domains besides Financial #1

Closed nutor closed 4 months ago

nutor commented 4 months ago

Hi authors,

I found that only the corrected data for the Financial domain has been released. Could we also get access to the data you annotated for the other domains? I know that only a partial, representative random sample (say, 20) has been annotated, but I believe having it would still be meaningful, so that others can benefit from your valuable efforts and insights.

I really like this work, which helps improve the quality and reliability of BIRD, a good Text2SQL benchmark. With this contribution, we can advance the Text2SQL field with clearer guidance, unaffected by noise. Although the Financial domain data was enough to support the conclusion, I hope we can have more information and make better use of your findings.

Thanks a lot.

niklaswretblad commented 4 months ago

Hi nutor! Thank you for your interest and your positive feedback!

I very much agree with you, and sorry for not doing it sooner! I simply haven't had the time to prepare the Excel sheets containing the other annotations for public upload yet (they currently contain our annotations and notes in Swedish, which need to be translated). But if you give me a week or two, I will try to make it happen!

Thanks again for the kind feedback! I will let you know here on GitHub once I've uploaded the other annotations!

nutor commented 4 months ago

Many thanks for your prompt reply.

No hurry. Please take your time to process the data. I will wait for your notification when those data are ready.

Congrats on your acceptance to ACL, and thanks again for your efforts!

nutor commented 4 months ago

Hi @niklaswretblad, I also found an issue with the code. run_model.py seems incompatible with src/datasets.py. For example, run_model.py calls execute_queries_and_match_data with three arguments, while the definition in src/datasets.py takes only two. Could the two files be from different versions? Thanks a lot!
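
For reference, here is a minimal sketch of how this kind of call/definition mismatch can be checked without running the whole pipeline. The function below is only a hypothetical two-argument stand-in (its parameter names and body are assumptions, not the code in src/datasets.py), and accepts_positional_args is a helper name made up for this example.

```python
import inspect


# Hypothetical stand-in for a two-argument definition; the real
# function in src/datasets.py may have different parameters and logic.
def execute_queries_and_match_data(sql, gold_sql):
    return sql == gold_sql


def accepts_positional_args(func, n):
    """Return True if func can be called with n positional arguments."""
    try:
        # Signature.bind raises TypeError when the arguments do not fit.
        inspect.signature(func).bind(*([None] * n))
        return True
    except TypeError:
        return False


# A two-argument call fits the stand-in definition; a three-argument call does not.
print(accepts_positional_args(execute_queries_and_match_data, 2))
print(accepts_positional_args(execute_queries_and_match_data, 3))
```

Running the same check against the actual function imported from the repository would show which call shape the current definition accepts.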