microsoft / IRNet

An algorithm for cross-domain NL2SQL
MIT License

Why skip some samples in IRNet_dj/preprocess/sql2SemQL.py? #13

Closed (aifight closed this issue 4 years ago)

aifight commented 4 years ago

Nice work! But I'm wondering why we need to skip samples that meet the condition "len(datas[i]['sql']['select'][1]) > 5": https://github.com/microsoft/IRNet/blob/72df5c876f368ae4a1b594e7a740ff966dbbd3ba/preprocess/sql2SemQL.py#L382

JasperGuo commented 4 years ago

There are too few examples that select more than 5 columns in the dataset, so we simply ignore these examples.
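
For readers following along, the filter being discussed can be sketched roughly as below. This is a minimal illustration, not the code in sql2SemQL.py: it assumes each sample stores its parsed query so that `sample['sql']['select'][1]` is the list of selected items (as the quoted condition implies), and the function name `split_by_select_width` and the `max_columns` parameter are hypothetical names introduced here.

```python
def split_by_select_width(datas, max_columns=5):
    """Split samples into (kept, skipped) by the number of SELECT items.

    Hypothetical helper: mirrors the condition
    len(datas[i]['sql']['select'][1]) > 5 quoted in the question.
    """
    kept, skipped = [], []
    for sample in datas:
        # Assumed layout: sample['sql']['select'] == [is_distinct, [item, ...]]
        select_items = sample['sql']['select'][1]
        if len(select_items) > max_columns:
            skipped.append(sample)  # more than max_columns selected items
        else:
            kept.append(sample)
    return kept, skipped


# Toy samples (made up for illustration): one selects 2 items, one selects 6.
narrow = {'sql': {'select': [False, ['col%d' % i for i in range(2)]]}}
wide = {'sql': {'select': [False, ['col%d' % i for i in range(6)]]}}

kept, skipped = split_by_select_width([narrow, wide])
print(len(kept), len(skipped))
```

Returning the skipped samples as well, rather than silently dropping them, makes it easy to count how many examples a given threshold excludes from each split.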

madcpt commented 4 years ago

> There are too few examples that select more than 5 columns in the dataset, so we simply ignore these examples.

I am wondering whether there are any such 'bad cases' in the test set? By the way, impressive work!

jaydeepb-inexture commented 4 years ago

@aifight, have you understood the whole implementation of IRNet? I have some doubts about it and would appreciate it if you could help resolve them.