mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License
1.49k stars 65 forks source link

Joining of files with different delimiters #64

Closed ondohotola closed 2 years ago

ondohotola commented 2 years ago

Hi,

I have two files, one with variable number of fields, no header and delimited by '|' (psv), and another with two fields, delimited by ',' and with a header (csv).

I would like to join them and have I have experimented a little with

SELECT psv.c17 AS Code, csv.c2 as Reason FROM CSV('|',unl.psv, AUTO, no_header) psv,
CSV(',',rejections.csv) csv WHERE psv.c17 = csv.c1

but is squeals

invalid argument for CSV: cannot be converted as a no-header value: no_header

I can of course use --no-header in the command line and it works in my example because csv.c2 can not be equal to psv.c17, but I wonder if this was possible.

greetings, el

mithrandie commented 2 years ago

The fourth no_header argument of the CSV function needs to be passed as a boolean. The following query may work as expected.

SELECT psv.c17 AS Code,
       csv.c2 AS Reason
  FROM CSV('|',unl.psv, AUTO, TRUE) psv,
       CSV(',',rejections.csv) csv
 WHERE psv.c17 = csv.c1

When the --no-header option is specified, the header line of rejections.csv is read as the first record, and it is assumed that the header line was not included in the result-set because there was no line in the unl.psv where c17 matched the value of c1 in the header line. However, this is just a coincidence and such execution methods do not always return expected results.