jqnatividad / qsv

Blazing-fast Data-Wrangling toolkit
https://qsv.dathere.com
The Unlicense

`stats`: does not handle directories with embedded spaces correctly when creating stats cache files #2294

Open jqnatividad opened 3 hours ago

jqnatividad commented 3 hours ago

Hi @jqnatividad, I have made an interesting observation. I use the MS Windows command line (cmd.exe). The command works if there are no spaces in the directory name, e.g. C:\test\qsvtest

C:\test\qsvtest>type file.csv
col1,col2,col3
a,b,c
d,e,f
C:\test\qsvtest>qsv tojsonl file.csv -o file.jsonl
Enum list generated for field 'col1' (2 value/s)
Enum list generated for field 'col2' (2 value/s)
Enum list generated for field 'col3' (2 value/s)

C:\test\qsvtest>type file.jsonl
{"col1":"a","col2":"b","col3":"c"}
{"col1":"d","col2":"e","col3":"f"}

However, it does not work if there are spaces in the directory name, e.g. C:\test\qsv test

C:\test\qsv test>type file.csv
col1,col2,col3
a,b,c
d,e,f
C:\test\qsv test>qsv tojsonl file.csv -o file.jsonl
Failed to infer field types: qsv stats exited with code: 2

So this is an argument parsing issue, at least on MS Windows. Does it work on Linux?

Originally posted by @datatraveller1 in https://github.com/jqnatividad/qsv/discussions/2290#discussioncomment-11279780
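
For readers wondering how a space in a directory name can break an internal call to another command, here is a minimal Python sketch of the general failure mode. It is not qsv's code (qsv is written in Rust), and the thread does not confirm that this is the actual cause. The helper names `naive_argv` and `safe_argv` are hypothetical and exist only for illustration.

```python
# Illustration only: how a path containing a space can be split into two
# arguments when a command line is built as one string, and how passing an
# argument list avoids the problem. Not taken from qsv.

def naive_argv(program: str, path: str) -> list[str]:
    """Hypothetical buggy approach: build one string, split on whitespace."""
    command_line = f"{program} stats {path}"
    return command_line.split()


def safe_argv(program: str, path: str) -> list[str]:
    """Keep every argument as its own list element, so spaces survive."""
    return [program, "stats", path]


# Path modeled on the directory from the report above.
path = r"C:\test\qsv test\file.csv"

naive = naive_argv("qsv", path)
safe = safe_argv("qsv", path)

print(len(naive), naive)
print(len(safe), safe)
```

The naive version yields four arguments because the path is cut at the space, so the child process would see `C:\test\qsv` and `test\file.csv` as two separate inputs. The list version yields three arguments with the path intact, which is the form that process-spawning APIs such as Python's `subprocess.run` accept directly.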

jqnatividad commented 3 hours ago

Good catch @datatraveller1! It's a bug in `stats`, and I'm currently fixing it...