Open mvanbrab opened 2 years ago
Hi @mvanbrab !
Do you get the same behavior if the data is in JSON?
Well, no. But in the equivalent JSON input I have to write two backslashes wherever I mean one, so the input looks like the snippet below. The output is then OK (it contains the same number of backslashes as the input), but that seems like a no-brainer to me...
```
[
  {
    "id": "1",
    "description": "One backslash: \\."
  },
  {
    "id": "2",
    "description": "Two backslashes: \\\\."
  },
  {
    "id": "3",
    "description": "Three backslashes: \\\\\\"
  },
  {
    "id": "4",
    "description": "A backslash before the '{': a ∈ ℝ⁺₀\\{1}."
  }
]
```
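For anyone following along, here is a minimal sketch of why the JSON input needs doubled backslashes. It assumes nothing about the mapper itself and only uses Python's standard `json` module; the names `RAW_JSON` and `decoded_descriptions` are made up for this illustration. In JSON source text, `\\` is the escape sequence for a single backslash, so a decoder hands back half as many backslashes as appear in the file.

```python
import json

# Raw JSON text as it would appear in the input file (a raw string, so
# Python itself does not interpret the backslashes).
RAW_JSON = r'''[
  {"id": "1", "description": "One backslash: \\."},
  {"id": "2", "description": "Two backslashes: \\\\."}
]'''


def decoded_descriptions(raw):
    """Return the decoded description strings from the raw JSON text."""
    return [record["description"] for record in json.loads(raw)]


if __name__ == "__main__":
    for text in decoded_descriptions(RAW_JSON):
        # Count the backslashes that survive JSON decoding.
        print(text, "->", text.count("\\"), "backslash(es)")
```

So a JSON source that behaves correctly here does not say much about the CSV path, where no such escaping is required by the format.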
Aha! That confirms my hypothesis: a few versions ago, we switched from Apache Commons CSV to OpenCSV to parse CSV files.
That library is probably eating the `\` characters for lunch.
It seems that this is a common problem: https://dzone.com/articles/properly-handling-backslashes-using-opencsv
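To illustrate the general mechanism (not OpenCSV itself), here is a small sketch using Python's standard `csv` module as a stand-in. The assumption is that the parser treats `\` as an escape character, as the article above suggests; the exact rules in OpenCSV differ, so the counts below will not match what the mapper outputs. `SAMPLE` and `parse_descriptions` are hypothetical names for this example only.

```python
import csv
import io

# Two CSV records with literal backslashes (raw strings keep them as typed).
SAMPLE = "\n".join([
    "id,description",
    r"1,One backslash: \.",
    r"2,Two backslashes: \\.",
]) + "\n"


def parse_descriptions(text, escapechar=None):
    """Parse the CSV text and return the description column, header skipped.

    escapechar=None treats backslashes as ordinary characters;
    escapechar="\\" makes the parser consume each backslash as an escape.
    """
    rows = list(csv.reader(io.StringIO(text), escapechar=escapechar))
    return [row[1] for row in rows[1:]]


if __name__ == "__main__":
    print(parse_descriptions(SAMPLE))                   # backslashes kept
    print(parse_descriptions(SAMPLE, escapechar="\\"))  # backslashes consumed
```

With no escape character configured the backslashes pass through untouched, which is the behavior one would expect for plain CSV input. With the escape character set, a single backslash disappears and a double one is halved.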
Tested the solution from dzone.com. It works for this case but breaks other edge cases.
Found on versions 5.0.0 and 4.15.0:
Single backslashes are eaten, double backslashes are output as double backslashes, and triple backslashes are output as double backslashes.
A test case is provided in the attached file.
Remark: version 4.12.0 always output twice the number of backslashes in the input.
issue.zip