Closed cpkio closed 3 years ago
The command option "–enclose-all(-Q)" has an effect on newly created files specified by the "--out" option and in CREATE TABLE statements, and for existing files, the information determined at the time of loading is used.
$ csvq -d ';' fields test_trgt.csv
Fields in test_trgt.csv
-----------------------------------------------------
Type: Table
Path: /tmp/test_trgt.csv
Format: CSV Delimiter: ';' Enclose All: false
Encoding: UTF8 LineBreak: LF Header: true
Status: Fixed
Fields:
1. Код формы
2. Наименование показателя
3. Единицы измерения
4. Комментарий
As a workaround, you can use ALTER TABLE statement to set ENCLOSE_ALL attribute for the updated file.
$ csvq -d ';' 'update t set t.4 = "+++" from test_trgt t; alter table test_trgt set enclose_all to true;'
8 records updated on "/tmp/test_trgt.csv".
Attributes Updated in test_trgt
-----------------------------------------------------
Path: /tmp/test_trgt.csv
Format: CSV Delimiter: ';' Enclose All: true
Encoding: UTF8 LineBreak: LF Header: true
Commit: file "/tmp/test_trgt.csv" is updated.
$ cat test_trgt.csv
"Код формы";"Наименование показателя";"Единицы измерения";"Комментарий"
"1-МТС";"""░░░░░░ - ░░░░"", ░ ░░░ ░░░░░ ░░░░░░░░░░░░░ ░░░ ░░ ░░░░░░░░░░ ░░░░░░(░░░. ░░░.)";"░░░. ░░░.";"+++"
"1-МТС";"""░░░░░░ - ░░░░"", ░ ░░░ ░░░░░ ░░░░░░░░░░░░░ ░░░ ░░ ░░░░░░░░░░ ░░░░░░(░░.)";"░░.";"+++"
"1-МТС";"""░░░░░░ - ░░░░""; ░░░░░░░░░░ (░░░░░░░░░░) ░░░░░░░░░ ░░░░░░░░░░░░░░░░░░░░ ░░░░░░░, (░░░. ░░░.)";"░░░. ░░░.";"+++"
"1-МТС";"""░░░░░░ - ░░░░""; ░░░░░░░ ░░░░░ ░ ░░░ (░░░░░░░░░) ░░ ░░░░░ ░░░░, (░░.)";"░░.";"+++"
"1-МТС";"""░░░░"" ░ ░░░░░░ ░░░░░░░░░░░░░, ░ ░░░ ░░░░░ ░░░░░░░░░░░░░ ░░░ ░░ ░░░░░░░░░░ ░░░░░░(░░░. ░░░.)";"░░░. ░░░.";"+++"
"1-МТС";"""░░░░"" ░ ░░░░░░ ░░░░░░░░░░░░░, ░ ░░░ ░░░░░ ░░░░░░░░░░░░░ ░░░ ░░ ░░░░░░░░░░ ░░░░░░(░░.)";"░░.";"+++"
"1-МТС";"""░░░░"" ░ ░░░░░░ ░░░░░░░░░░░░░; ░░░░░░░░░░ (░░░░░░░░░░) ░░░░░░░░░ ░░░░░░░░░░░░░░░░░░░░ ░░░░░░░, (░░░. ░░░.)";"░░░. ░░░.";"+++"
"1-МТС";"""░░░░"" ░ ░░░░░░ ░░░░░░░░░░░░░; ░░░░░░░ ░░░░░ ░ ░░░ (░░░░░░░░░) ░░ ░░░░░ ░░░░, (░░.)";"░░.";"+++"
When saving a file with ENCLOSE_ALL set to false, the values in the file are automatically enclosed only if its value contains a character specified as a delimiter. Normally this is not a problem. However, if a value starts with a double quote, the value is assumed to be enclosed in double quotes, which causes a problem with the interpretation of the csv.
I will fix to automatically enclose values containing double-quotes.
I have found out that quote symbol as a first symbol of a field can break the reader, if this quote is not a field quoting, but an ordinary quote. Tested with v1.13.1.
The source data is:
This is a semicolon-separated csv. I've replaced the text with blocks for convinience.
Let's update this file
test_trgt.csv
:The file will become this:
As you can see, on the second line the field quotes and double quotes (as quoted quotes) have been removed.
Let's try to read this resulting file:
This will result an error:
[L:1 C:15] data parse error in file … test_trgt.csv: line 2, column 21: unexpected " in field
.I have tried any variants of
-k
and-Q
options, but cannot make it read again correctly. The file is correct, field does not need to be quoted, but quote mark as a first field symbol seems to be breaking it.