wyona / yanel

http://www.yanel.org
Apache License 2.0
10 stars 5 forks

Is there any UTF-16 support (Emojis, Emoticons) in Yanel? #94

Open baszero opened 2 years ago

baszero commented 2 years ago

The current Yanel / Yarep only supports UTF-8 encoded XMLs (<?xml version="1.0" encoding="UTF-8"?>). So while you technically can save emojis in such an XML, the Xerces parser (2.11) fails on it; as far as I can tell, emoticons are not accepted in UTF-8 XMLs.

So is there a way to use UTF-16 XMLs with Yanel?

A workaround would be to violate the rules, store emojis in the current UTF-8 XMLs, and avoid Xerces (which in the Yanel apps means never using the BasicXMLResource.java:getTransformedInputStream() method). Instead, the resource could read the XML via JAXB into a Java bean and then pass the content from that to XSLT, like:

MyJAXB bean = XMLBindingHelper.read(...);
xslDocument.getRootDocument().append(....params from bean....);

But I don't like that this approach forces you to hardcode every property you want to pass on to the XSL. It should pass ALL the elements in the XML, i.e. a 1:1 approach like the Xerces one...

Any ideas?
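
To narrow this down, it may help to check whether a JAXP parser really rejects a raw emoji in UTF-8 bytes, and whether the whole tree can be handed to XSLT without naming any property. The following is only a minimal sketch: it uses the parser and transformer bundled with the JDK (not the standalone Xerces 2.11 that Yanel uses, so the result may differ there), an identity transform instead of a real stylesheet, and a made-up class name EmojiXmlProbe and sample document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;

// Hypothetical helper, not part of Yanel or Yarep.
public class EmojiXmlProbe {

    /**
     * Parses the XML as UTF-8 bytes, sends the complete DOM through an
     * identity transform (all elements, nothing hardcoded) and returns
     * the text content of the resulting root element.
     */
    static String rootTextAfterTransform(String xml) {
        try {
            byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);
            Document parsed = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(utf8));

            // No stylesheet given: newTransformer() yields an identity transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            DOMResult result = new DOMResult();
            identity.transform(new DOMSource(parsed), result);

            Document copy = (Document) result.getNode();
            return copy.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Parsing or transforming failed", e);
        }
    }

    public static void main(String[] args) {
        // \uD83D\uDE00 is the surrogate pair for U+1F600 (an emoji outside the BMP).
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><msg>Hi \uD83D\uDE00</msg>";
        String text = rootTextAfterTransform(xml);
        System.out.println(text.codePointCount(0, text.length()) + " code points in root text");
    }
}
```

If the same document fails with Xerces 2.11 on the classpath, the stack trace should show whether the parser or a later serialization step is the one complaining.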

baszero commented 2 years ago

Another workaround: simply remove all emojis in your application by filtering every input text that users can submit. For this I can recommend this library: https://github.com/vdurmont/emoji-java (you just call EmojiParser.removeAllEmojis(string)). In my case this works great.
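
If adding a dependency is not an option, a rough stand-in for the filtering idea is to drop every code point outside the Basic Multilingual Plane. This is only a sketch and not equivalent to emoji-java: it keeps emojis that live in the BMP (such as U+263A) and also removes non-emoji supplementary characters (for example rare CJK ideographs). The class and method names are made up for illustration.

```java
// Hypothetical helper, not part of emoji-java or Yanel.
public class SupplementaryFilter {

    /** Returns the input without any code point above U+FFFF. */
    static String stripSupplementary(String input) {
        StringBuilder kept = new StringBuilder(input.length());
        input.codePoints()
             .filter(cp -> !Character.isSupplementaryCodePoint(cp))
             .forEach(kept::appendCodePoint);
        return kept.toString();
    }

    public static void main(String[] args) {
        // \uD83D\uDE00 is U+1F600, which is removed; the rest is kept.
        System.out.println(stripSupplementary("ok \uD83D\uDE00!"));
    }
}
```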