microsoft / IRNet

An algorithm for cross-domain NL2SQL
MIT License
269 stars 80 forks source link

Values in the filter field is discarded in the final result. #22

Open tinyHui opened 4 years ago

tinyHui commented 4 years ago

Hi, I tried to test the pre-train model with the original data from Spider. However, I notice that the final result generated by the sem2SQL.py replaces all the filter value by 1. For example when the question is: What is the average, minimum, and maximum age of all singers from France? The result is SELECT avg(T1.Age), min(T1.Age), max(T1.Age) FROM singer AS T1 WHERE T1.Country = 1 Instead of the correct SQL SELECT avg(age) , min(age) , max(age) FROM singer WHERE country = 'France'.

I notice in the transfer function, we put value to None when calling to_str. Just wondering, is there a way to keep the values in the filtering query?

Many thanks.

shubhamk16 commented 4 years ago

hello @tinyHui, exactly where in sem2SQL.py the values are replaced by 1? so that we could change that in code.

harshilpatel548 commented 4 years ago

Is there any solution for this ? I am also facing the same issue.

JasperGuo commented 4 years ago

@tinyHui Do you mean predicting the filter value during decoding? Or just fill in the correct value when transferring SemQL to SQL?

shubhamk16 commented 4 years ago

@JasperGuo how can we add T1.Country = France instead of T1.Country = 1.

SELECT avg(T1.Age), min(T1.Age), max(T1.Age) FROM singer AS T1 WHERE T1.Country = 1 to SELECT avg(age) , min(age) , max(age) FROM singer WHERE country = 'France'


and how can we add the "ON" clause after join clause so that it can be executable querries? I already add DISTINCT in querries.

JasperGuo commented 4 years ago
  1. Add values Our model currently does not predict value for filter. For example, in this case we only predict a fiter rule to "=" and the filter column "Country". The value "France" is not predicted. If you want to fill in the values, you probably need to implement another modules (e.g., Pointer Network) to predict the value.

  2. Add "ON" clause We infer the "FROM" clause based on the shortest path that connects all the tables we predict. To add "ON" clause, we can traverse the shortest path and add an "ON" clause between any adjacent tables.

brunnurs commented 4 years ago

Hi everyone, as I aim to use the IRNet code in a real world project, I'm right now implementing exactly this. I will put it online as soon as I have a working version (which should be by the end of the month). I will keep you updated.

DevanshChoubey commented 4 years ago

Hi @brunnurs,

I am also trying to do the same hope there is a way we can communicate and discuss this.

shubhamk16 commented 4 years ago

@brunnurs is your model handle values also? so to work it as a product, the model should give values also in the predicted query, right?

brunnurs commented 4 years ago

Yes, the goal is to predict queries including values, so you can execute them. I will let you know when I made it work.

shxliang commented 4 years ago

Is there any other model to predict the values?

tinyHui commented 4 years ago

@JasperGuo Thanks for confirming this with us 👍

@brunnurs @shubhamk16 I'm also doing the model to predict the value right now, hope we can exchange some ideas.

shubhamk16 commented 4 years ago

yes sure @tinyHui.we can discuss this. My Mail Id- kothawadeshubham15@gmail.com

brunnurs commented 4 years ago

@tinyHui Good idea! I already have a working model predicting values and was just about to finishing up, unfortunately I got a bit distracted by the COVID19 pandemic and had to work on another project for the last 2 weeks :-/ But lets have a chat, ursin.brunner@gmail.com

brunnurs commented 4 years ago

I finally managed to publish a model based on IRNet incorporating values. Have a look at the code (https://github.com/brunnurs/valuenet) if you need a system predicting values and synthesizing fully fledged queries.

To synthesize queries which can get executed I also had to solve some further issues like e.g. proper JOINs and the DISTINCT-Keyword.