ms705 / nom-sql

Rust SQL parser written using nom
MIT License
232 stars 41 forks

Queries that contain arbitrary bytes cannot be parsed #33

Closed (lovasoa closed 6 years ago)

lovasoa commented 6 years ago

SQL files can contain arbitrary byte sequences. However, this library only exposes a way to parse a Rust string (that is, a sequence of Unicode code points encoded as valid UTF-8). This makes it impossible to parse some SQL files (such as the Wikipedia dumps I am currently working with), as they contain byte sequences that are not valid UTF-8.

The API should expose a function that takes an &[u8] instead of an &str.
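To illustrate the difference, here is a minimal standard-library sketch. It does not use nom-sql; `first_string_literal` is a hypothetical helper invented for this example, and the Latin-1 encoded query is an assumed stand-in for the kind of data found in such dumps. It shows that the input cannot be turned into an &str at all, while a function taking &[u8] can still work with it.

```rust
use std::str;

/// Hypothetical helper (not part of nom-sql): returns the raw bytes of the
/// first single-quoted literal in `input`, without requiring valid UTF-8.
/// Escape sequences such as '' or \' are deliberately not handled.
fn first_string_literal(input: &[u8]) -> Option<&[u8]> {
    // Index just past the opening quote.
    let start = input.iter().position(|&b| b == b'\'')? + 1;
    // Number of bytes up to (not including) the closing quote.
    let len = input[start..].iter().position(|&b| b == b'\'')?;
    Some(&input[start..start + len])
}

fn main() {
    // "café" encoded as Latin-1: the lone byte 0xE9 is not valid UTF-8.
    let query: &[u8] = b"INSERT INTO page VALUES ('caf\xE9');";

    // An &str-only API cannot even receive this input.
    assert!(str::from_utf8(query).is_err());

    // A &[u8]-based function can still process it.
    assert_eq!(first_string_literal(query), Some(&b"caf\xE9"[..]));
}
```

Converting with `String::from_utf8_lossy` before parsing would avoid the error, but it replaces the invalid bytes with U+FFFD and so changes the data, which is why a byte-oriented entry point is the safer option here.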

For handling byte sequences that are not valid UTF-8 in string literals, I see two possibilities:

ms705 commented 6 years ago

Fixed in #34, I believe?