I've submitted a fix: https://go-review.googlesource.com/#/c/1576/ The fix is only for the reader. I was wondering if the writer should also have an option to specify the quote character.
Why?
Is there a standard that says the quote character can differ?
If you have a file format that uses a different quote character, is it really CSV?
I don't think so.
I'm compelled to close this without action. The encoding/csv package is small and easily forkable elsewhere for PQCSV ("pipe-quoted CSV") or whatever.
I believe the standard does not specify it. You are right in pointing out that the package is small and easily forkable. Just saw this issue open and thought of fixing it :)
The standard specifies it as double quote. It's not a tweakable parameter:
https://tools.ietf.org/html/rfc4180 says
escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
...
DQUOTE = %x22 ;as per section 6.1 of RFC 2234 [2]
We'll follow the standard.
People needing wacky formats can use wacky packages.
CSV is well known for being a format that every program implements somewhat differently, like HTML in the early years of the internet. Yes, the standard says that the quote character is ", but there are many programs out there that expect differently formatted data. Having a CSV package that is flexible in the way it generates output is a very useful thing. I am not sure that suggesting every program which needs to generate CSV files that do not exactly match the standard should just fork the encoding/csv package is a good idea, in terms of either code reuse or reliability.
Can you provide a few examples of such programs, ideally popular ones?
Absent a compelling reason, I see no reason to introduce complexity for theoretical uses.
Even with examples, I'm tempted to say no just as a minor encouragement to those programs' authors and users to do something more normal.
I know this is old, but I would like to add an example of data that desperately needs this change in order to be read. I am currently working on some sentiment analysis that uses the SentiWordNet list. A row from the list:
a
00001740
0.125
0
able#1
(usually followed by `to') having the necessary means or skill or know-how or authority to do something; "able to swim"; "she was able to program her computer"; "we were at last able to buy a car"; "able to get a grant for the project"
I put each column on its own line so the columns are easy to distinguish. The columns are tab (\t) delimited and the long text has no enclosing quotes. Because the long text already contains double quotes, the csv package gives me an error when it tries to parse them. Hope what I wrote makes sense lol.
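For a row shaped like the one above, a possible workaround within the existing encoding/csv API might be to set the reader's Comma field to a tab and enable LazyQuotes, which lets a bare double quote appear inside an unquoted field. The sketch below is only an illustration: parseTabRow is a hypothetical helper name, and the sample line is a shortened version of the row quoted above. It assumes no field begins with a double quote, since such a field would still be treated as a quoted field.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseTabRow is a hypothetical helper (not part of encoding/csv) that reads
// one tab-delimited record and tolerates bare double quotes inside fields.
func parseTabRow(line string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(line))
	r.Comma = '\t'      // fields are separated by tabs instead of commas
	r.LazyQuotes = true // allow a bare " inside an unquoted field
	return r.Read()
}

func main() {
	// Shortened version of the SentiWordNet row quoted above (assumed layout).
	line := "a\t00001740\t0.125\t0\table#1\t(usually followed by `to') having the necessary means; \"able to swim\""
	fields, err := parseTabRow(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(len(fields), "fields")
	fmt.Println(fields[5])
}
```

This does not change the quote character; it only relaxes how stray quotes are handled, so it may not help with formats that genuinely use a different enclosure.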
CL https://golang.org/cl/23401 mentions this issue.
RFC 4180 also says the delimiter has to be a comma (","), yet encoding/csv supports changing this.
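To make the asymmetry concrete, here is a small sketch of the writer side: the Comma field on csv.Writer can be changed, while quoting still uses the double quote, doubled for escaping as in the RFC 4180 grammar quoted earlier. encodeWith is a hypothetical helper name used only for this illustration.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// encodeWith is a hypothetical helper that writes a single record using the
// given delimiter and returns the encoded line.
func encodeWith(comma rune, record []string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	w.Comma = comma // the delimiter is configurable
	if err := w.Write(record); err != nil {
		return "", err
	}
	w.Flush()
	// The quote character is not configurable: fields that need quoting are
	// wrapped in double quotes, and embedded double quotes are doubled.
	return buf.String(), w.Error()
}

func main() {
	out, err := encodeWith(';', []string{"a", "say \"hi\"", "x;y"})
	if err != nil {
		fmt.Println("write error:", err)
		return
	}
	fmt.Print(out) // a;"say ""hi""";"x;y"
}
```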
@hasnickl This issue is closed. If you want to discuss this, please use the golang-dev mailing list. Thanks.
by fuzxxl: