Open bkkgbkjb opened 4 years ago
After a further comparison with the Python 3.x csv library, I found the following table:
Python:
new_line: \r\n
\r -> quote
\n -> quote
\r\n -> quote
new_line: \n
\n -> quote
\r -> no_quote
\r\n -> quote
Go:
new_line: \r\n
\n -> changed to \r\n, then quote (1)
\r -> removed \r, then quote remaining (2)
\r\n -> quote
new_line: \n
\n -> quote
\r -> quote
\r\n -> quote
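The Go half of the table can be checked with a small program. This is only a sketch: `encode` is a hypothetical helper introduced here, and the sample field `a?b` is made up for illustration. It writes a one-field record with `csv.Writer` and prints the raw bytes for each newline variant, with `UseCRLF` on and off.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// encode (hypothetical helper) writes a single one-field record and
// returns the raw bytes the Writer produced.
func encode(field string, useCRLF bool) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	w.UseCRLF = useCRLF
	if err := w.Write([]string{field}); err != nil {
		return "", err
	}
	w.Flush()
	return buf.String(), w.Error()
}

func main() {
	for _, useCRLF := range []bool{true, false} {
		for _, field := range []string{"a\nb", "a\rb", "a\r\nb"} {
			out, err := encode(field, useCRLF)
			if err != nil {
				fmt.Println("error:", err)
				continue
			}
			// %q makes \r and \n visible in both input and output.
			fmt.Printf("UseCRLF=%-5v %-8q -> %q\n", useCRLF, field, out)
		}
	}
}
```

With `UseCRLF` enabled, the `\n` case gains a `\r` and the `\r` case loses its `\r`, which are cases (1) and (2) above.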
Though there seems to be no good standard for the csv format, I still think touching the actual data is a bad idea.
My suggestion is simply to change (1) and (2) to quote only.
Then every \r?\n? occurrence would be quoted, which never does harm.
/cc @dsnet @bradfitz
The issue reported seems like surprising behavior to me. I wouldn't expect data to be changed either.
The godoc currently documents the behavior:
The Reader converts all \r\n sequences in its input to plain \n
Given that this is specified behavior, we can't change it. At best, we can add a Reader option to preserve newlines without mangling.
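For reference, the documented Reader conversion can be observed with a short sketch. `readFirstField` is a hypothetical helper and the input string is made up for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// readFirstField (hypothetical helper) parses input as CSV and returns
// the first field of the first record.
func readFirstField(input string) (string, error) {
	r := csv.NewReader(strings.NewReader(input))
	rec, err := r.Read()
	if err != nil {
		return "", err
	}
	return rec[0], nil
}

func main() {
	// The quoted field contains a \r\n sequence on the wire.
	field, err := readFirstField("\"a\r\nb\",c\r\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The Reader hands the field back with \r\n collapsed to \n.
	fmt.Printf("%q\n", field)
}
```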
Well, but I think we're talking about csv.Writer.UseCRLF here.
Its only documentation is:
If UseCRLF is true, the Writer ends each output line with \r\n instead of \n.
I suggest we add a StrictMode bool field to
struct Writer {
...
}
so that, when it is enabled, Writer does not change anything in our data.
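To make the idea concrete, here is a rough sketch of what "not changing anything" could mean for a single record. None of this exists in encoding/csv: `quoteVerbatim` and `encodeRecordStrict` are hypothetical names. The sketch also simplifies the quoting rules by only checking for quotes, commas, \r and \n, and it always ends the record with \r\n.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteVerbatim (hypothetical) wraps a field in quotes and doubles any
// embedded quote. \r and \n bytes are copied through untouched.
func quoteVerbatim(field string) string {
	return `"` + strings.ReplaceAll(field, `"`, `""`) + `"`
}

// encodeRecordStrict (hypothetical) joins fields with commas, quotes the
// ones that need it, and terminates the record with \r\n without
// rewriting any newline bytes inside the data.
func encodeRecordStrict(record []string) string {
	var sb strings.Builder
	for i, f := range record {
		if i > 0 {
			sb.WriteByte(',')
		}
		if strings.ContainsAny(f, "\",\r\n") {
			sb.WriteString(quoteVerbatim(f))
		} else {
			sb.WriteString(f)
		}
	}
	sb.WriteString("\r\n")
	return sb.String()
}

func main() {
	fmt.Printf("%q\n", encodeRecordStrict([]string{"asd\njk", "2g9"}))
}
```

Only the record terminator uses \r\n here, while the \n inside the field stays a bare \n.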
So the problem here is that, with csv.Writer.UseCRLF enabled, csv.Writer also changes our data inside quoted fields:
remove all \r
change \n to \r\n
which is shown in:
	// Encode the special character.
	if len(field) > 0 {
		var err error
		switch field[0] {
		case '"':
			_, err = w.w.WriteString(`""`)
		case '\r':
			if !w.UseCRLF {
				err = w.w.WriteByte('\r')
			}
		case '\n':
			if w.UseCRLF {
				_, err = w.w.WriteString("\r\n")
			} else {
				err = w.w.WriteByte('\n')
			}
		}
		field = field[1:]
		if err != nil {
			return err
		}
	}
MS Excel reinterprets a \r inside a field, and we have to set UseCRLF=true for MS Excel. What a pity.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
yes
What operating system and processor architecture are you using (go env)?
What did you do?
Trying to write into a csv file, but the newline in asd\njk has been changed to asd\r\njk.
playground
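A minimal reproduction might look like the following. The record values are inferred from the output reported in this issue. `render` is a hypothetical helper, and this is not necessarily the exact program from the playground link.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// render (hypothetical helper) writes the records with UseCRLF enabled
// and returns the raw output.
func render(records [][]string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	w.UseCRLF = true
	// WriteAll writes every record and flushes the Writer.
	if err := w.WriteAll(records); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render([][]string{
		{"col1", "col2"},
		{"asd\njk", "2g9"}, // field with an embedded bare \n
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// %q makes the line endings visible.
	fmt.Printf("%q\n", out)
}
```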
What did you expect to see?
The \n in a data field would not be changed by writer.UseCRLF:
"col1,col2\r\n\"asd\njk\",2g9\r\n"
What did you see instead?
"col1,col2\r\n\"asd\r\njk\",2g9\r\n"