Open janisdd opened 1 month ago
While looking into https://github.com/mholt/PapaParse/issues/1035 I noticed that the escapeChar is wrongly set when the quoteChar is changed. This is only the case for Papap.unparse.
escapeChar
quoteChar
Papap.unparse
Example:
Papa.unparse([['a', 'x+y']], {quoteChar: '+'})
which gives currently
a,+x""y+
As you can see, the escapeChar is still set to "" and not ++.
""
++
The issue was introduced in this commit: https://github.com/mholt/PapaParse/commit/a627547d87b6c155ccb46d61ee100c7a20581cf8#diff-fbb8d0b19dc4586fadd80f52a7b49415d2d41a5fdccaeaa728f7fca51cfbbb29R277
This can be easily be fixed when changing line https://github.com/mholt/PapaParse/blob/4af6882e0ea91b0baad30345f190b38d50e7535e/papaparse.js#L364C4-L364C29
if (typeof _config.quoteChar === 'string') _quoteChar = _config.quoteChar;
into
if (typeof _config.quoteChar === 'string') { _quoteChar = _config.quoteChar; _escapedQuote = _quoteChar + _quoteChar; }
And here is a test-case for the UNPARSE_TESTS
UNPARSE_TESTS
{ description: "Other quote char but without setting escape char (should be 2x the quote char)", input: [['a', 'x+y']], config: { quoteChar: "+"}, expected: 'a,+x++y+' }
However, fixing this will result in
a,x++y
The correct result would be
a,+x++y+
The other part of the issue is that in line https://github.com/mholt/PapaParse/blob/4af6882e0ea91b0baad30345f190b38d50e7535e/papaparse.js#L472
needsQuotes = needsQuotes || _quotes === true || (typeof _quotes === 'function' && _quotes(str, col)) || (Array.isArray(_quotes) && _quotes[col]) || hasAny(escapedQuoteStr, Papa.BAD_DELIMITERS) || escapedQuoteStr.indexOf(_delimiter) > -1 || escapedQuoteStr.charAt(0) === ' ' || escapedQuoteStr.charAt(escapedQuoteStr.length - 1) === ' ';
Will always set needsQuotes=true if the field contains a " because Papa.BAD_DELIMITERS is defined as ['\r', '\n', '"', Papa.BYTE_ORDER_MARK].
needsQuotes=true
"
Papa.BAD_DELIMITERS
['\r', '\n', '"', Papa.BYTE_ORDER_MARK]
There are different ways to fix this, here is one
Papa.NEED_QUOTES_CHARS=['\r', '\n']
str.indexOf(_quoteChar) > -1
These changed will also fix issue https://github.com/mholt/PapaParse/issues/1035
At this point, a PR would probably be easier... However, this can be implemented in different ways, so I'm hesitant.
While looking into https://github.com/mholt/PapaParse/issues/1035 I noticed that the
escapeChar
is wrongly set when thequoteChar
is changed. This is only the case forPapap.unparse
.Example:
which gives currently
As you can see, the
escapeChar
is still set to""
and not++
.The issue was introduced in this commit: https://github.com/mholt/PapaParse/commit/a627547d87b6c155ccb46d61ee100c7a20581cf8#diff-fbb8d0b19dc4586fadd80f52a7b49415d2d41a5fdccaeaa728f7fca51cfbbb29R277
This can be easily be fixed when changing line https://github.com/mholt/PapaParse/blob/4af6882e0ea91b0baad30345f190b38d50e7535e/papaparse.js#L364C4-L364C29
into
And here is a test-case for the
UNPARSE_TESTS
However, fixing this will result in
The correct result would be
The other part of the issue is that in line https://github.com/mholt/PapaParse/blob/4af6882e0ea91b0baad30345f190b38d50e7535e/papaparse.js#L472
Will always set
needsQuotes=true
if the field contains a"
becausePapa.BAD_DELIMITERS
is defined as['\r', '\n', '"', Papa.BYTE_ORDER_MARK]
.There are different ways to fix this, here is one
Papa.NEED_QUOTES_CHARS=['\r', '\n']
and use it here instead ofPapa.BAD_DELIMITERS
str.indexOf(_quoteChar) > -1
These changed will also fix issue https://github.com/mholt/PapaParse/issues/1035
At this point, a PR would probably be easier... However, this can be implemented in different ways, so I'm hesitant.