Closed dlemaignent closed 3 years ago
According to the SSE specification, event streams are always decoded as UTF-8, and the encoding can't be changed.
9.2.1 Server-sent events: Introduction "Event streams are always decoded as UTF-8. There is no way to specify another character encoding."
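To make the quoted rule concrete, here is a minimal sketch of what a server has to put on the wire: the event text encoded as UTF-8 bytes. The class and method names (`SseWireFormat`, `encodeEvent`) are hypothetical and not part of any library discussed in this thread; only the `event:`/`data:` field syntax and the blank-line terminator come from the specification.

```java
import java.nio.charset.StandardCharsets;

public class SseWireFormat {

    // Hypothetical helper: formats one event and encodes it as UTF-8,
    // the only encoding an EventSource client will decode.
    public static byte[] encodeEvent(String event, String data) {
        StringBuilder sb = new StringBuilder();
        sb.append("event: ").append(event).append('\n');
        // A payload containing line breaks needs one "data:" field per line.
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        sb.append('\n'); // the blank line dispatches the event
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e9 is "é"; the escape keeps the source file ASCII-only.
        byte[] wire = encodeEvent("chat", "\u00e9");
        StringBuilder hex = new StringBuilder();
        for (byte b : wire) {
            hex.append(String.format("%02x ", b));
        }
        // The accented character shows up as the two bytes c3 a9.
        System.out.println(hex.toString().trim());
    }
}
```

If the bytes on the wire look like this, the client should render the accents without any extra conversion on the server side.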
Thank you for your answer. (I made a mistake in my question: utf8EncodedString should be isoEncodedString.) I understand that EventSource (client side) always decodes as UTF-8, but I don't understand why I need to convert the data I send in the event to ISO when I build the event. Here is an example:
applicationEventPublisher.publishEvent(SseEvent.builder().event(channel).data(jsonMessageString).build());
If I write my message object as a JSON string (UTF-8) and send it in the event, the client side doesn't decode French accented characters correctly.
byte[] bytes = jsonMessageString.getBytes(StandardCharsets.UTF_8);
String isoEncodedString = new String(bytes, StandardCharsets.ISO_8859_1);
applicationEventPublisher.publishEvent(SseEvent.builder().event(channel).data(isoEncodedString).build());
Now, if I convert my JSON string to ISO_8859_1 as in this example, the accents work.
Thanks
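No idea what's going on. In Java, a String holds Unicode text (UTF-16 internally) and has no byte encoding of its own. In your example, jsonMessageString and isoEncodedString are both ordinary Strings; the second one simply contains different characters.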
If we look at the bytes, we see that isoEncodedString contains completely wrong content. I assume that decoding the bytes as ISO_8859_1 turns each byte into its own character, and each of those characters is then encoded as UTF-8 with two bytes, so you get double the length.
import java.nio.charset.StandardCharsets;

// UTF-8 bytes of the original text: two bytes per accented character
byte[] bytes = "èàé".getBytes(StandardCharsets.UTF_8);
for (byte b : bytes) {
    System.out.printf("%x ", b);
}
//Output: c3 a8 c3 a0 c3 a9

// Decoding as ISO_8859_1 turns every byte into a separate character
String isoEncodedString = new String(bytes, StandardCharsets.ISO_8859_1);
// Explicit charset, so the result does not depend on the platform default
byte[] isoEncodedBytes = isoEncodedString.getBytes(StandardCharsets.UTF_8);
for (byte b : isoEncodedBytes) {
    System.out.printf("%x ", b);
}
//Output: c3 83 c2 a8 c3 83 c2 a0 c3 83 c2 a9

// On a UTF-8 console; the second Ã is followed by a non-breaking space (U+00A0)
System.out.println(isoEncodedString);
//Output: Ã¨Ã Ã©
System.out.println(new String(isoEncodedBytes, StandardCharsets.UTF_8));
//Output: Ã¨Ã Ã©
Can you check what's going over the wire? I assume there is a bug in my library.
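One way to reason about what might be on the wire: the workaround would produce correct output if some layer between the String and the socket encoded text with ISO-8859-1 instead of UTF-8. This is only a hypothesis, not a confirmed diagnosis of the library. The sketch below simulates both cases with plain JDK calls; the class name `WireEncodingCheck` and the method `decodedByClient` are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WireEncodingCheck {

    // Simulates a server that writes `payload` using `serverCharset` and an
    // EventSource client that decodes the received bytes as UTF-8.
    public static String decodedByClient(String payload, Charset serverCharset) {
        byte[] wire = payload.getBytes(serverCharset);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e8\u00e0\u00e9"; // "èàé"
        // The workaround from this thread: UTF-8 bytes reinterpreted as ISO-8859-1.
        String workaround = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        // Server writes UTF-8: the plain string arrives intact.
        System.out.println(original.equals(
                decodedByClient(original, StandardCharsets.UTF_8)));        // true
        // Assumed faulty server writing ISO-8859-1: the workaround string arrives intact.
        System.out.println(original.equals(
                decodedByClient(workaround, StandardCharsets.ISO_8859_1))); // true
        // Same assumed faulty server with the plain string: accents are lost.
        System.out.println(original.equals(
                decodedByClient(original, StandardCharsets.ISO_8859_1)));   // false
    }
}
```

If a capture of the response shows single bytes such as e8 e0 e9 for the accented characters when the plain string is sent, that would point to an ISO-8859-1 writer somewhere in the chain.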
Is it possible to make the encoding configurable with "text/event-stream;charset=UTF-8" on createSseEmitter? I'm doing something like this to get accents on the JavaScript client side:
byte[] bytes = jsonMessageString.getBytes(StandardCharsets.UTF_8);
String utf8EncodedString = new String(bytes, StandardCharsets.ISO_8859_1);
Thanks