ms705 / nom-sql

Rust SQL parser written using nom
MIT License

support MySQL special character escape sequences #23

Open lovasoa opened 6 years ago

lovasoa commented 6 years ago

Hi! Thank you for this great library. I am trying to use it to parse Wikipedia dumps in my project wikipedia-externallinks-fast-extraction. Unfortunately, the dumps contain MySQL escape sequences that this library does not currently support.

Unsupported characters

The escape characters are:

\0
\'
\"
\b
\n
\r
\t
\Z
\\
\%
\_
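
For illustration, here is a minimal sketch of how these sequences could be decoded once the body of a string literal has been extracted. It is not the code used in nom-sql; the function name `unescape_mysql` is hypothetical. It follows the behaviour described in the MySQL manual: `\%` and `\_` keep their backslash outside of pattern-matching contexts, and a backslash before any other character is simply dropped. Keeping a trailing lone backslash is an assumption of this sketch.

```rust
/// Decode MySQL backslash escapes in the body of a string literal
/// (the text between the quotes). Hypothetical helper, not part of nom-sql.
fn unescape_mysql(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // `c` is a backslash: look at the character it escapes.
        match chars.next() {
            Some('0') => out.push('\0'),      // ASCII NUL
            Some('b') => out.push('\u{8}'),   // backspace
            Some('n') => out.push('\n'),      // newline
            Some('r') => out.push('\r'),      // carriage return
            Some('t') => out.push('\t'),      // tab
            Some('Z') => out.push('\u{1a}'),  // ASCII 26 (Control+Z)
            // \% and \_ keep the backslash outside of LIKE patterns.
            Some(p) if p == '%' || p == '_' => {
                out.push('\\');
                out.push(p);
            }
            // \', \", \\ and any unknown escape: drop the backslash.
            Some(other) => out.push(other),
            // Assumption: a trailing lone backslash is kept as-is.
            None => out.push('\\'),
        }
    }
    out
}
```

With this sketch, `unescape_mysql("apliut.htm\\'")` yields `apliut.htm'`, while `unescape_mysql("100\\%")` leaves the input unchanged.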

Example

INSERT INTO externallinks VALUES (23481,120102,'http://home.arcor.de/jean-polmartin/aufsaetze/apliut.htm\'','http://de.arcor.home./jean-polmartin/aufsaetze/apliut.htm\'','http://de.arcor.home./jean-polmartin/aufsaetze/apliut.htm\'');
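
The difficulty in the statement above is that each literal ends in `\'`, so a parser that stops at the first `'` cuts the value short. The following is a rough sketch of finding where a single-quoted literal really ends. It is a hand-rolled illustration, not the rule used in nom-sql, and `split_quoted` is a hypothetical name. It skips whatever character follows a backslash, and also treats a doubled quote `''` as part of the literal. Decoding the escapes in the returned raw body is left as a separate step.

```rust
/// Split a single-quoted SQL string literal off the front of `input`.
/// Returns the raw (still escaped) body and the unparsed remainder, or
/// `None` if `input` does not start with a quote or the literal never ends.
/// Hypothetical helper, not part of nom-sql.
fn split_quoted(input: &str) -> Option<(&str, &str)> {
    let body = input.strip_prefix('\'')?;
    let mut chars = body.char_indices().peekable();
    while let Some((i, c)) = chars.next() {
        match c {
            '\\' => {
                // Skip the escaped character, so `\'` does not end the literal.
                // A backslash at the very end means the literal is unterminated.
                chars.next()?;
            }
            '\'' => {
                if matches!(chars.peek(), Some(&(_, '\''))) {
                    // Doubled quote `''`: still inside the literal.
                    chars.next();
                } else {
                    // Closing quote: it is one byte wide, so the rest starts at i + 1.
                    return Some((&body[..i], &body[i + 1..]));
                }
            }
            _ => {}
        }
    }
    None
}
```

For example, `split_quoted("'apliut.htm\\'','next'")` returns the raw body `apliut.htm\'` and the remainder `,'next'`.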

lovasoa commented 6 years ago

SQLite escape sequences don't seem to be supported either. According to the README:

We try to support both the SQLite and MySQL syntax; where they disagree, we choose MySQL. (It would be nice to support both via feature flags in the future.)

So I think :

lovasoa commented 6 years ago

It should not be difficult to implement using nom::escaped_transform.

ms705 commented 6 years ago

Good catch -- I actually independently ran into this issue last week (also parsing MySQL dumps) and made a mental note to fix it!

I originally looked at nom::escaped_transform for this, but hadn't yet figured out exactly how to use it for this purpose. Looks like you ended up hand-rolling the parse rule instead, probably for good reasons.

I'll check out the PR :+1:

lovasoa commented 6 years ago

I also tried to use nom::escaped_transform, but with no success. I think this is because of this part of the documentation:

WARNING: if you do not use the verbose-errors feature, this combinator will currently fail to build because of a type inference error

ms705 commented 6 years ago

I think we have all of them supported now, thanks to @lovasoa's work :+1: