uwekoenig closed this issue 4 years ago
This is the Unicode "REPLACEMENT CHARACTER" (https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character). It might be caused by FastODS or not; I will check.
Here's my test code:
// Needs: import java.nio.charset.StandardCharsets;
final OdsFactory odsFactory = OdsFactory.create(Logger.getLogger("issue-179"), Locale.US);
final AnonymousOdsFileWriter writer = odsFactory.createWriter();
final OdsDocument document = writer.document();
final Table table = document.addTable("issue-179");
final TableCellWalker walker = table.getWalker();
// Row 1: text with MICRO SIGN (U+00B5), then its UTF-8 bytes in the next cell
String s = "This is a µ";
walker.setStringValue(s);
walker.next();
walker.setStringValue(Arrays.toString(s.getBytes(StandardCharsets.UTF_8)));
// Row 2: text with GREEK SMALL LETTER MU (U+03BC), then its UTF-8 bytes
String t = "And this is a μ";
walker.nextRow();
walker.setStringValue(t);
walker.next();
walker.setStringValue(Arrays.toString(t.getBytes(StandardCharsets.UTF_8)));
writer.saveAs(new File("generated_files", "issue-179.ods"));
And here's the output:
As you can see, the first one is \xc2\xb5 (MICRO SIGN) and the second one is \xce\xbc (GREEK SMALL LETTER MU). They look the same, but they are different characters with different semantics.
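To make the difference concrete, here is a small Python sketch that shows the code point, Unicode name and UTF-8 bytes of both characters. The `describe` helper is a name made up for this example. The last line checks that compatibility normalization (NFKC) folds MICRO SIGN into GREEK SMALL LETTER MU, which is worth knowing if some step in your pipeline normalizes text.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8"))

micro = "\u00b5"  # MICRO SIGN
mu = "\u03bc"     # GREEK SMALL LETTER MU

print(describe(micro))
print(describe(mu))
# NFKC replaces the micro sign with its compatibility equivalent, the Greek mu
print(unicodedata.normalize("NFKC", micro) == mu)
```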
My guess is that your source file is not encoded in UTF-8: the µ is stored as bytes in your encoding, and those bytes are written to the file without being converted to UTF-8. When LibreOffice opens the file, it uses the REPLACEMENT CHARACTER in place of the invalid UTF-8 sequence. Illustration in Python:
>>> 'µ'.encode("cp1252")
b'\xb5'
>>> _.decode("utf-8", "replace") # would fail without replace
'�'
>>> ascii(_)
"'\\ufffd'"
Please check your source file encoding.
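One quick way to check, sketched in Python: try to decode the raw bytes of the source file strictly as UTF-8. `find_invalid_utf8` is a hypothetical helper written for this example, not part of FastODS.

```python
def find_invalid_utf8(path):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")  # strict decoding raises on any invalid sequence
    except UnicodeDecodeError as e:
        return e.start
    return None
```

Call it with the path of the Java file that contains the µ literal. If it returns an offset, either re-save the file as UTF-8, pass the file's real encoding to the compiler with `javac -encoding`, or write the character as the escape `\u00b5` so the source stays pure ASCII.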
You are right. The problem came from another library. Sorry for that, and thank you very much for the detailed explanation and your time.
When I try to write the µ sign in a cell, I get \uFFFD instead (when the file is opened with LibreOffice).