questdb / questdb

QuestDB is an open source time-series database for fast ingest and SQL queries
https://questdb.io
Apache License 2.0
14.36k stars 1.15k forks source link

REST API /exp CSV escaping incompatible with MS Excel #2004

Open amunra opened 2 years ago

amunra commented 2 years ago

Describe the bug

If string/symbol columns tabs or new-line characters these are escaped in a manner that is similar to JSON escaping.

Today:

curl -G --data-urlencode "query=select * from table5 limit 5" http://localhost:9000/exp
"a","b","c"
10.0,"a\"\nb\rc\r\nd\"' 3",
20.5," a\t",
,,x

Our current escaping isn't compatible with MS Excel.

What we want:

For point of comparison, we should do whatever the csv module in Python's standard library does as it mirrors Excel closely from a compatibility point of view.

>>> import csv
>>> data = [['double-quote', 'new line', 'tab', 'windows new line'],['"', '\n', '\t', '\r\n']]
>>> import io
>>> output = io.StringIO()
>>> csv.writer(output, dialect='excel').writerows(data)
>>> print(output.getvalue())
double-quote,new line,tab,windows new line
"""","
",      ,"
"

We could also consider avoiding double-quoting strings unless required to avoid bloat.

To reproduce

  1. Insert some data with new lines, windows new lines, tabs, etc.
  2. Query it from /exp (or just via the web console as CSV).
  3. Observe incompatible escaping.

Expected Behavior

Once fixed, the output - saved as a .csv file - should open cleanly in excel, including multi-value strings.

Environment

- **QuestDB version**: 6.2.1
- **OS**: all
- **Browser**: any

Additional context

No response

argshook commented 2 years ago

I'm here just to leave these links:

if anything, QuestDB should probably follow some spec, instead of accomodating specific implementation of MS Excel or similar.