apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Feature] Export query result supports standard CSV format #6954

Open CUITCHE opened 2 years ago

CUITCHE commented 2 years ago

Description

At present, when exporting CSV data, Doris simply joins the fields with the specified separator instead of following the standard CSV format. For example, if a field contains a comma, Doris does not wrap that field in double quotation marks on export, so the columns in the resulting file are misaligned.

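Suppose a row in table t1 stores ['x1,', 'x2', f2, f3]. The exported text is x1,,x2,f2,f3, which a CSV reader parses as five columns instead of four. With standard CSV quoting, the expected output would be "x1,",x2,f2,f3.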
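
To make the misalignment concrete, here is a minimal Python sketch that reproduces it outside Doris. It assumes f2 and f3 are plain string values with no special characters, and it uses Python's built-in csv module only as a stand-in for a standard-compliant writer and reader, not as a description of how Doris exports data.

```python
import csv
import io

# The row from the example above; "f2" and "f3" are assumed to be plain strings.
row = ["x1,", "x2", "f2", "f3"]

# Naive export: join the fields with the separator and apply no quoting
# (the behavior described in this issue).
naive_line = ",".join(row)

# Standard CSV export: csv.writer quotes any field that contains the delimiter.
buf = io.StringIO()
csv.writer(buf, delimiter=",", lineterminator="\n").writerow(row)
standard_line = buf.getvalue().rstrip("\n")

# Parse both lines back the way a downstream CSV consumer would.
naive_parsed = next(csv.reader([naive_line]))
standard_parsed = next(csv.reader([standard_line]))

print(naive_line)       # x1,,x2,f2,f3
print(naive_parsed)     # ['x1', '', 'x2', 'f2', 'f3']
print(standard_line)    # "x1,",x2,f2,f3
print(standard_parsed)  # ['x1,', 'x2', 'f2', 'f3']
```

The naive line comes back with five fields, one of them empty, while the quoted line round-trips to the original four.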

Use case

For CSV exports, we could add a key-value pair to PROPERTIES to indicate that the export should follow the standard CSV format.

The key-value pair could be "csv.format.standard" = "true"

Example

SELECT * FROM tbl
INTO OUTFILE "hdfs:/path/to/result_"
FORMAT AS CSV
PROPERTIES
(
    "broker.name" = "my_broker",
    "broker.hadoop.security.authentication" = "kerberos",
    "broker.kerberos_principal" = "doris@YOUR.COM",
    "broker.kerberos_keytab" = "/home/doris/my.keytab",
    "column_separator" = ",",
    "line_delimiter" = "\n",
    "max_file_size" = "100MB",
    "csv.format.standard" = "true"
);
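
As a rough illustration of what the proposed "csv.format.standard" = "true" property might be expected to produce, the Python sketch below applies RFC 4180-style quoting to each row. The function name format_standard_csv_row is hypothetical, and the parameter names simply mirror the column_separator and line_delimiter properties from the statement above. Nothing here reflects Doris's actual implementation, and the property itself is only a proposal in this issue.

```python
import csv
import io


def format_standard_csv_row(fields, column_separator=",", line_delimiter="\n"):
    """Hypothetical helper: format one row with RFC 4180-style quoting."""
    out = []
    for field in fields:
        # A field needs quoting if it contains the separator, the line
        # delimiter, a double quote, or a raw CR/LF character.
        needs_quotes = (
            column_separator in field
            or line_delimiter in field
            or '"' in field
            or "\r" in field
            or "\n" in field
        )
        if needs_quotes:
            # Embedded double quotes are escaped by doubling them.
            field = '"' + field.replace('"', '""') + '"'
        out.append(field)
    return column_separator.join(out) + line_delimiter


rows = [
    ["x1,", "x2", "f2", "f3"],
    ['say "hi"', "line1\nline2", "", "plain"],
]

# Export all rows, then read them back with a standard CSV reader.
text = "".join(format_standard_csv_row(r) for r in rows)
parsed = list(csv.reader(io.StringIO(text, newline="")))

print(parsed == rows)  # True
```

Quoting only the fields that need it keeps the output identical to the current format for rows without special characters, so existing consumers of such files would see no change.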

Related issues

#7552

geoffreytran commented 2 months ago

I'm also running into this issue.