Currently all output appears to be escaped with org.apache.commons.lang.StringEscapeUtils::escapeJava, which seems to be designed to escape strings for use in Java code (i.e. a string escaped this way could be copy-pasted directly into a .java file). Apparently this includes encoding non-ASCII characters in a \u[codepoint] format, which the CSV reader of our choice did not expect. I propose adding an option to not escape the output in this way. As long as the original string contains no double quotes or line breaks, unescaped output is perfectly fine for CSV files.
Additionally, all instances of PrintStream are created with the single-argument constructor. A PrintStream constructed this way uses the platform default charset, which apparently reduces all non-ASCII characters to question marks (?) when that charset cannot represent them. To allow UTF-8 output, these calls could simply be replaced by the three-parameter constructor, using the following substitution:
new PrintStream(param) -> new PrintStream(param, false, StandardCharsets.UTF_8.name());
where false is the autoflush setting, which is also what the single-parameter constructor uses.
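To illustrate the substitution, here is a minimal sketch. The class name Utf8PrintStreamSketch and the helpers utf8PrintStream and encode are hypothetical and not part of the project; they only show that a PrintStream created with an explicit UTF-8 charset writes non-ASCII characters as their UTF-8 bytes regardless of the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not code from the project.
class Utf8PrintStreamSketch {

    // Wraps an OutputStream in a PrintStream that always encodes as UTF-8.
    // autoflush is false, matching the single-argument constructor.
    static PrintStream utf8PrintStream(OutputStream out) {
        try {
            return new PrintStream(out, false, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Prints the text through a UTF-8 PrintStream and returns the raw bytes.
    static byte[] encode(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream ps = utf8PrintStream(buffer);
        ps.print(text);
        ps.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
        byte[] bytes = encode("\u00e9");
        System.out.println(bytes.length); // prints 2
    }
}
```

Note that the three-parameter constructor declares the checked UnsupportedEncodingException, so each call site either has to handle it or go through a small helper like the one above.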
It would be even better to allow type-specific escapes (in the case of CSV: escape double quotes by doubling them), but this could be a separate effort.
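As a rough idea of what such a CSV-specific escape could look like, here is a minimal sketch. The class CsvEscapeSketch and the method escapeCsvField are hypothetical, and the sketch assumes a comma as the delimiter and the common convention of wrapping a field in double quotes when it contains a quote, a delimiter or a line break.

```java
// Hypothetical sketch, not code from the project.
class CsvEscapeSketch {

    // Escapes a single CSV field: embedded double quotes are doubled, and the
    // field is wrapped in double quotes if it contains a quote, a comma
    // (assumed delimiter) or a line break. Other fields are returned unchanged.
    static String escapeCsvField(String field) {
        boolean needsQuoting = field.indexOf('"') >= 0
                || field.indexOf(',') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escapeCsvField("plain"));        // prints plain
        System.out.println(escapeCsvField("say \"hi\""));   // prints "say ""hi"""
    }
}
```

Non-ASCII characters pass through untouched here, so this would only produce correct files in combination with UTF-8 output.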
I would be happy to create a merge request.