mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License
1.5k stars 65 forks source link

Delimited field names seems broken #47

Closed runeimp closed 3 years ago

runeimp commented 3 years ago

CSVs I get typically have field names with spaces in them. Delimiting the fields references with double quotes kills errors but I always get an Empty RecordSet. I also tried using backticks (ala MySQL), square brackets (ala SQL Server), and single quotes. Double quotes was the only one that didn't generate an error. But the SELECT query I tried always fails. If I replaces spaces in the header of the CSV and in the query with underscores everything works as expected. If underscores is a hard requirement then why don't double quotes generate an error? Or is this just a bug? BTW, love this tool! THANK YOU for creating it! 😃

ondohotola commented 3 years ago

If you use backticks from the command line you need to either escape them

    ... | csvq “select  `field one` from stdin”

or use single quotes

    ... | csvq ‘select field one from stdin’

to prevent the shell from expanding them.

fi you use csvq interactively, the whole command needs no quotes and the backticks do not need to be escaped (but then of course STDIN doesn’t work).

No drama.

el

On 13 Mar 2021, 02:07 +0200, RuneImp @.***>, wrote:

CSVs I get typically have field names with spaces in them. Delimiting the fields references with double quotes kills errors but I always get an Empty RecordSet. I also tried using backticks (ala MySQL), square brackets (ala SQL Server), and single quotes. Double quotes was the only one that didn't generate an error. But the SELECT query I tried always fails. If I replaces spaces in the header of the CSV and in the query with underscores everything works as expected. If underscores is a hard requirement then why don't double quotes generate an error? Or is this just a bug? BTW, love this tool! THANK YOU for creating it! 😃 — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

mithrandie commented 3 years ago

By default, values enclosed in backticks are interpreted as identifiers, such as field names or file names, and values enclosed in double or single quotes are interpreted as strings.

If the --ansi-quotes option is specified, values enclosed in double quotes are interpreted as identifiers, as are values enclosed in backticks.

https://mithrandie.github.io/csvq/reference/statement.html#parsing

$ csvq 'SELECT * FROM `sample.csv`'
+---------+---------+
| column1 | column2 |
+---------+---------+
| 1       | a       |
| 2       | b       |
+---------+---------+
$ csvq 'SELECT `column1` FROM `sample.csv`'
+---------+
| column1 |
+---------+
| 1       |
| 2       |
+---------+
$ csvq 'SELECT "column1" FROM `sample.csv`'
+---------+
| column1 |
+---------+
| column1 |
| column1 |
+---------+
$ csvq --ansi-quotes 'SELECT "column1" FROM `sample.csv`'
+---------+
| column1 |
+---------+
| 1       |
| 2       |
+---------+

I don't know what kind of SQL you executed, but for example, if you enclosed field names in double quotes when narrowing down by conditions, the result set will be empty.

$ csvq 'SELECT * FROM `sample.csv` WHERE `column1` = 1'
+---------+---------+
| column1 | column2 |
+---------+---------+
| 1       | a       |
+---------+---------+
$ csvq 'SELECT * FROM `sample.csv` WHERE "column1" = 1'
Empty RecordSet

Square brackets available on SQL Server are not supported. The reason for the single quotes error is a mystery to me, but as @ondohotola commented above, it might be that they are not escaped.

Please check the behavior again based on these, and if the problem persists, can you tell me the SQL to reproduce it?

runeimp commented 3 years ago

When I tried the backticks I'm pretty sure I'd wrapped it all in single quotes. But I just tried using backticks again and it worked this time. So maybe not. 👼