mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License
1.5k stars 65 forks source link

Regular Expressions support? #44

Closed blurayne closed 3 years ago

blurayne commented 3 years ago

I could not find anything. While I really like the tool this is really a pain point for me.

Is there support planned?

mithrandie commented 3 years ago

Csvq does not provide, nor does it plan to provide, any features for regular expressions, because I didn’t need it🙄

If there are any requests, I will consider them.

blurayne commented 3 years ago

BTW Awesome project – considering the commit/contribution history ;)

I digged into textql, q, octosql, even clickhouse-local and other solutions but so far csvq says my favorite. For what I regularly use it is querying data via CLI, putting into into CSV and then querying it. I also didn't need REGEXP before until yesterday.

Also CALL() is a nyce function I recently discovered.

ondohotola commented 3 years ago

Markus,

you can find ways around the lack of regex, by piping the first line (headers) and the regex'ed lines into

a|b|cavq 'select some fields from STDIN'

el

On 2021-03-05 10:03 , Markus Geiger wrote:

BTW Awesome project – considering the commit/contribution history ;)

I digged into textql, q, octosql, even clickhouse-local and other solutions but so far csvq says my favorit. But for what I regularly use it is querying data via CLI, putting into into CSV and then querying it. I also didn't need REGEXP before until yesterday.

Also CALL() is a nyce function I recently discovered.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/mithrandie/csvq/issues/44#issuecomment-791237895, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACVZVGSPWT5XNPWYBKJ3BPLTCCF57ANCNFSM4YT3TTMQ.

-- Dr. Eberhard W. Lisse \ / Obstetrician & Gynaecologist el@lisse.NA / * | Telephone: +264 81 124 6733 (cell) PO Box 8421 Bachbrecht \ / If this email is signed with GPG/PGP 10007, Namibia ;____/ Sect 20 of Act No. 4 of 2019 may apply

blurayne commented 3 years ago

@ondohotola Not always. But a viable solution for 90%. Think about a dynamically build regex like WHERE field_a REGEXP field_b||"-[a-f0-9]+$"

I thought about using the CALL() function but it's a hacky and imperformant solution. Just wished my go lang skills were a bit better ;)

ondohotola commented 3 years ago

You are aware that this is a tool to manipulate CSV files?

If you need to do serious SQL, just dump the CSV into SQlite3.

el

On 2021-03-05 12:16 , Markus Geiger wrote:

@ondohotola https://github.com/ondohotola Not always. But a viable solution for 90%. Think about a dynamically build regex like |WHERE field_a REGEXP field_b||"-[a-f0-9]+$" |

I thought about using the CALL() function but it's a hacky and imperformant solution. Just wished my go lang skills were a bit better ;) [...] -- Dr. Eberhard W. Lisse \ / Obstetrician & Gynaecologist el@lisse.NA / * | Telephone: +264 81 124 6733 (cell) PO Box 8421 Bachbrecht \ / If this email is signed with GPG/PGP 10007, Namibia ;____/ Sect 20 of Act No. 4 of 2019 may apply

blurayne commented 3 years ago

@ondohotola No need to do serious SQL or write CSV ;)

derekmahar commented 3 years ago

If you need to do serious SQL, just dump the CSV into SQlite3.

@blurayne, this answer on Stack Overflow describes in detail how to use REGEXP in SQLite.

You can use the sqlite3 command .import to import a CSV file into an SQLite table.

blurayne commented 3 years ago

Thanks, but if I wanted to use another tool with sqlite3 i'd had a look here: https://github.com/dbohdan/structured-text-tools

Dolt and ClickHouse-local are also good options.

blurayne commented 3 years ago

Indeed I used q as temp workaround. But I prefer compact sized static bins ;)

derekmahar commented 3 years ago

Thanks, but if I wanted to use another tool with sqlite3 i'd had a look here: https://github.com/dbohdan/structured-text-tools

I agree that this list contains many useful CSV processing tools including many that use SQLite under the hood. In fact, I think this list is where I first found csvq and Miller, another tool for processing data in CSV files.

Note, however, that no matter which solution you use with SQLite, that tool would still need to provide an extension to use the SQL REGEXP operator. SQLite does not include this functionality by default. The solution on Stack Overflow which I suggested simply installs an Ubuntu package that implements Perl regular expressions as a tiny C loadable module that SQLite can load to provide user function regexp() for operator REGEXP. As another example which instead uses Python, the Python sqlite3 module, and SQLite, see this solution.

derekmahar commented 3 years ago

Indeed I used q as temp workaround. But I prefer compact sized static bins ;)

I think sqlite3 and package sqlite3-pcre together would be the most compact and SQL-compliant solution. It's not a single static binary, but no such SQLite binary exists and most other tools don't provide an SQL engine that implements as many modern SQL features as do SQLite and csvq.

derekmahar commented 3 years ago

Having said this, I agree with you that it would be nice if csvq supported the REGEXP operator or a similar regular expression function. It would give csvq a slight advantage over SQLite for doing basic CSV processing.

derekmahar commented 3 years ago

@blurayne By the way, Miller has a regular expression matching operator.

mithrandie commented 3 years ago

For now, the following regular expression functions have been added in version 1.15.0.