mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License

Dropped support for backslash escaping of delimiter on Windows #5

Closed whunger closed 5 years ago

whunger commented 5 years ago

Csvq is a very interesting piece of software!

I'd like to report that from release 1.4.3 to 1.5.0, with the introduction of the new parsing routine for the delimiter character, it became impossible to use a delimiter definition like "\t" on Windows. This might be because the Windows shell does not use backslash escaping - it does not even have a means of giving a tab character as a process argument other than entering the literal character itself (which can't be done on the command line, and which does not work with 1.5.0 either). Would it be possible to let csvq interpret backslash escaping again, like in 1.4.3, without breaking anything else?

Both ways of calling csvq work fine with 1.4.3 on a tab-separated file:

csvq -d "\t" "select * from t limit 10"
csvq -d "<literal tab character>" "select * from t limit 10"

With 1.5.0, both fail.

Best regards - Werner.
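
The delimiter handling requested in the comment above can be illustrated with a small sketch. This is not csvq's actual parsing routine; `parse_delimiter` and `ESCAPES` are hypothetical names, and the set of supported escape sequences is an assumption. The idea is that the program itself, rather than the shell, turns the two characters `\t` into a tab, while still accepting a single literal character.

```python
# Hypothetical helper for illustration only; csvq's real parser may differ.
# Maps escape sequences, as typed on the command line, to the characters they stand for.
ESCAPES = {"\\t": "\t", "\\\\": "\\"}


def parse_delimiter(arg: str) -> str:
    """Turn a -d argument into a single delimiter character."""
    # Accept the two-character sequence backslash + "t", which is what the
    # program receives when the shell does no backslash processing of its own.
    if arg in ESCAPES:
        return ESCAPES[arg]
    # Accept any single literal character, including a literal tab.
    if len(arg) == 1:
        return arg
    # Reject everything else, including an empty string, instead of
    # silently continuing with an unusable delimiter.
    raise ValueError(f"invalid delimiter: {arg!r}")
```

With this approach both invocations from the report would resolve to the same tab delimiter, and an empty delimiter would be rejected up front.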

mithrandie commented 5 years ago

@whunger Thank you for reporting.

Hmm... I can run the following command with csvq 1.5.0 or later on my Windows 10 (Version 1803), in both Command Prompt and PowerShell.

csvq -d "\t" "select * from t limit 10"

Could you tell me your Windows version and which prompt application you are running?

whunger commented 5 years ago

Thanks for looking into this.

I did a little more testing today, on Windows Server 2008 R2 and Windows 10 1803. Csvq behaves the same in both environments.

To illustrate the problem, I attached some test files along with a batch script and its output on the two Windows systems. From what I see, I'm no longer sure whether it's only the delimiter parsing or some deeper problem. It might be that the delimiter is set to an empty string, which causes subsequent problems.

tab_testing.zip

mithrandie commented 5 years ago

@whunger

Thanks, the attached files revealed the cause.

This problem is caused by the file extension. Csvq determines the file format from its extension if possible. For example, a file named "test.csv" is processed as CSV, and the specified delimiter is used. "test.tsv" is processed as TSV, and "test.json" is processed as JSON. And "test.txt" is processed as Fixed-Length format.

However, since ".txt" is used as a file extension for many kinds of files, this behavior is obviously a mistake. I'll change ".txt" so that it is not linked to any file format. Please wait for the fix.
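
The extension-based detection described in this comment can be sketched as follows. This is an illustration, not csvq's source: `detect_format`, `FORMAT_BY_EXTENSION`, the format labels, and the `txt_is_fixed_length` switch are hypothetical, and what csvq does when no format is detected is not stated in the thread, so the sketch simply returns `None` in that case.

```python
import os

# Extension-to-format mapping as described in the comment above.
# The labels are placeholders chosen for this sketch.
FORMAT_BY_EXTENSION = {
    ".csv": "CSV",
    ".tsv": "TSV",
    ".json": "JSON",
    ".txt": "FIXED",  # the entry the maintainer plans to drop
}


def detect_format(path, txt_is_fixed_length=True):
    """Return a format label for path, or None if the extension is not linked to one."""
    # Comparing the extension exactly as written is an assumption of this sketch.
    ext = os.path.splitext(path)[1]
    # Model the planned fix: ".txt" no longer implies Fixed-Length.
    if ext == ".txt" and not txt_is_fixed_length:
        return None
    return FORMAT_BY_EXTENSION.get(ext)
```

Under the old behavior, `detect_format("test.txt")` yields the Fixed-Length label, so a delimiter given on the command line would not apply to that file. With `txt_is_fixed_length=False` the same call yields `None`, leaving the choice of format to whatever the caller specifies.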

mithrandie commented 5 years ago

The above modification is done and included in version 1.8.6.

@whunger Sorry, I did not think deeply about the behavior. Thank you for the detailed information.

whunger commented 5 years ago

> The above modification is done and included in the version 1.8.6.

Wow, so quick, thank you very much! I wasn't even aware of the file extension feature - very useful.