taoyds / spider

scripts and baselines for Spider: Yale complex and cross-domain semantic parsing and text-to-SQL challenge
https://yale-lily.github.io/spider
Apache License 2.0

Make tokenization respect escaped quotes. #77

Open hXtreme opened 2 years ago

hXtreme commented 2 years ago

Note: SQL usually uses single quotes for strings, but the old implementation accepted both single- and double-quoted strings, and also allowed a single-quoted string to end with a double quote.

This implementation keeps that behavior but also allows quotes to be escaped.

For example, strings like "O\'Reilly" no longer cause an assertion failure and are tokenized correctly.
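To illustrate the idea, here is a minimal sketch of escape-aware quote matching. It is not the code in this PR: the function name `find_quoted_spans` is hypothetical, and it assumes that a backslash escapes the character after it and that either quote character may open or close a string (matching the lenient behavior described above).

```python
def find_quoted_spans(sql):
    """Return (start, end) index pairs for quoted strings in `sql`.

    Assumptions for this sketch (not taken from the PR itself):
    - a backslash escapes the character that follows it;
    - either ' or " may open or close a string.
    """
    quote_idxs = []
    i = 0
    while i < len(sql):
        ch = sql[i]
        if ch == "\\":
            # Skip the backslash and the character it escapes.
            i += 2
            continue
        if ch in ("'", '"'):
            quote_idxs.append(i)
        i += 1
    if len(quote_idxs) % 2 != 0:
        raise ValueError("Unbalanced quotes in: " + sql)
    # Pair up opening and closing quote positions.
    return list(zip(quote_idxs[0::2], quote_idxs[1::2]))


sql = r'''SELECT name FROM author WHERE name = "O\'Reilly"'''
for start, end in find_quoted_spans(sql):
    print(sql[start:end + 1])  # prints "O\'Reilly"
```

Because the escaped quote is skipped, the quote count stays even and the whole value is kept as one string span.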

hXtreme commented 2 years ago

Please let me know if you have any questions or concerns about the PR.

hXtreme commented 2 years ago

Hi, just pinging this.