Open ogomezdi opened 2 months ago
/cc @mkouba (qute)
Escaping is implemented as a io.quarkus.qute.ResultMapper
that is applied to all output expressions in a template before the evaluated string value is written to the output. See also io.quarkus.qute.HtmlEscaper
. You can register a custom ResultMapper
with the EngineBuilder#addResultMapper(ResultMapper)
method.
However, if you need more context (string in an attribute value) you can't apply the escaping globally...
Despite there is a workaround by creating a @TemplateExtension or @templatedata to access encode static functions available in Apache Commons Texts and others, but I think this must be a core functionality.
I'm not quite sure this should be a core functionality.
Use the same approach for escaping XML content, may be including in quarkus.qute.escape-content-types property "application/json" joined to the characters escape policy.
If you want to apply the default escaping to json files then just add application/json
to the list of the media types in the quarkus.qute.escape-content-types
config property :shrug: . Keep in mind that Qute has to be able to detect the content-type of a template file (which is not always the case; depends on use case).
Let me try to explain better.
Nowadays Character Escapes only apply to content included in a HTML or XML template, but if we use a template to generate a Json output, and we want to assign string values to a key in the Json and the values include quotes, newlines, ... the resulting output will not create a valid json.
For example you have a template like this:
{ "key": "{variable}" }
And you bind variable which has a doble quote inside, imagine hello "world"
. You'll have this output:
{ "key": "hello "world"" }
Which is not a valid json, that must be:
{ "key": "hello \"world\"" }
I hope the explanation helps understanding the issue.
Yes, I do understand your example ;-)
My point is that you probably can't apply a global escaping to a JSON template because you need some "context" in order to apply the correct escaping. For example, in a template like { "key": "{variable}", "key2": {anotherVar} }
where variable=hello "world"
and anotherVar="hello world"
you can't simply escape all double quotes in all output expressions, right?
So a template extension that would escape quotes is IMO the way to go: { "key": "{variable.asJsonValue}", "key2": {anotherVar} }
:
@TemplateExtension
class MyExtensions {
static String asJsonValue(String val) {
// escape quotes etc.
}
}
This is the workaround I've mentioned in the issue description. But, What I'm specting is to have something similar to the approach applied for XML, I mean https://quarkus.io/guides/qute-reference#character-escapes where the variable content is automatically escaped in case we're in a XML/HTML template, so if you use it in this, portion of template <tag>{variable}</tag>
, the qute engine automatically apply de escape XML rules to variable without needing to execute any method or extension explicitly. So for me translating this behavior when users try to obtain a JSON output from qute, also including the raw
option if possible :-), is the aim of this request.
I think that I understand your use case but the question is: should we apply escaping for the whole JSON template? If so, what would be the rules? For XML/HTML it's simple. For JSON I'm not so sure...
I'm not sure to understand what do you mean when talking about the "whole JSON template", if you mean in any instance of variables to render in a template, my answer will be yes, as users will be able to apply .raw
in places where they need to avoid this behavior.
Regarding the rules to apply may be we can rely on other implementations like https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/StringEscapeUtils.html#escapeJson(java.lang.String)
I'm not sure to understand what do you mean when talking about the "whole JSON template", if you mean in any instance of variables to render in a template, my answer will be yes, as users will be able to apply
.raw
in places where they need to avoid this behavior. Regarding the rules to apply may be we can rely on other implementations like https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/StringEscapeUtils.html#escapeJson(java.lang.String)
In that case, somethig like this should work:
public class JsonEscaper implements ResultMapper {
@Override
public boolean appliesTo(Origin origin, Object result) {
if (result instanceof RawString) {
return false;
}
Optional<Variant> variant = origin.getVariant();
if (variant.isPresent()) {
return variant.getContentType().startsWith("application/json");
}
return false;
}
@Override
public String map(Object result, Expression expression) {
return StringEscapeUtils.html.escapeJson(result.toString());
}
and then register the mapper:
class MyBean {
void configureEngine(@Observes EngineBuilder builder) {
builder.addResultMapper(new JsonEscaper());
}
}
I don't think we want to add commons-text
as a dependency. However, we could implement a simple implementation similar to io.quarkus.qute.HtmlEscaper
.
Feel free to send a pull request ;-).
I've found a easier implementation for encoding in other library that can fit in the way you've mentioned, may be something like this:
public static String escapeJson(String value) {
if (value == null)
return "";
StringBuilder b = new StringBuilder();
for (char c : value.toCharArray()) {
if (c == '\r')
b.append("\\r");
else if (c == '\n')
b.append("\\n");
else if (c == '\t')
b.append("\\t");
else if (c == '"')
b.append("\\\"");
else if (c == '\\')
b.append("\\\\");
else if (c == ' ')
b.append(" ");
else if (isWhitespace(c)) {
b.append("\\u"+JsonEscaper.padLeft(Integer.toHexString(c), '0', 4));
} else if (((int) c) < 32)
b.append("\\u" + JsonEscaper.padLeft(Integer.toHexString(c), '0', 4));
else
b.append(c);
}
return b.toString();
}
public static boolean isWhitespace(int ch) {
return JsonEscaper.existsInList(ch, '\u0009', '\n', '\u000B','\u000C','\r','\u0020','\u0085','\u00A0',
'\u1680','\u2000','\u2001','\u2002','\u2003','\u2004','\u2005','\u2006','\u2007','\u2008','\u2009','\u200A',
'\u2028', '\u2029', '\u202F', '\u205F', '\u3000');
}
public static boolean existsInList(int value, int... array) {
for (int i : array)
if (value == i)
return true;
return false;
}
public static String padLeft(String src, char c, int len) {
StringBuilder s = new StringBuilder();
for (int i = 0; i < len - src.length(); i++)
s.append(c);
s.append(src);
return s.toString();
}
One more question, will this mapper be included in the default engine builder like io.quarkus.qute.HtmlEscaper
in the io.quarkus.qute.runtime.EngineProducer
constructor?
And finally, I think there is an error in the code you've proposed return variant.getContentType().startsWith("application/json")
must be return variant.get().getContentType().startsWith("application/json")
Looks good.
isWhitespace(int ch)
could be probably replaced with java.lang.Character.isWhitespace(int)
?
A perf tip - if possible use the initial capacity for the StringBuilder
(e.g. src.lenght() + len
in padLeft()
).
One more question, will this mapper be included in the default engine builder like
io.quarkus.qute.HtmlEscaper
in theio.quarkus.qute.runtime.EngineProducer
constructor?
It depends. I mean if you register it with builder.addResultMapper(new JsonEscaper())
then it will live alongside the HtmlEscaper
.
And finally, I think there is an error in the code you've proposed
return variant.getContentType().startsWith("application/json")
must be returnvariant.get().getContentType().startsWith("application/json")
:+1:
A perf tip - if possible use the initial capacity for the StringBuilder (e.g. src.lenght() + len in padLeft()).
May be easy for padLeft() use -> String.format("%4s", Integer.toHexString(c)).replace(' ','0')
So final code should be:
public static String escapeJson(String value) {
if (value == null)
return "";
StringBuilder b = new StringBuilder();
for (char c : value.toCharArray()) {
if (c == '\r')
b.append("\\r");
else if (c == '\n')
b.append("\\n");
else if (c == '\t')
b.append("\\t");
else if (c == '"')
b.append("\\\"");
else if (c == '\\')
b.append("\\\\");
else if (c == ' ')
b.append(" ");
else if (java.lang.Character.isWhitespace(c)) {
b.append("\\u"+String.format("%4s", Integer.toHexString(c)).replace(' ','0'));
} else if (((int) c) < 32)
b.append("\\u"+String.format("%4s", Integer.toHexString(c)).replace(' ','0'));
else
b.append(c);
}
return b.toString();
}
Avoid java.lang.Character.isWhitespace
if explicitly import
Description
It'll nice to have the same behavior for Json template for character escapes like the one we have for XML content explained here https://quarkus.io/guides/qute-reference#character-escapes allowing users to escape content we want to include, for example, as string in an attribute value, for those values that contain quotes, new line, ... characters. Despite there is a workaround by creating a @TemplateExtension or @TemplateData to access encode static functions available in Apache Commons Texts and others, but I think this must be a core functionality.
Implementation ideas
Use the same approach for escaping XML content, may be including in quarkus.qute.escape-content-types property "application/json" joined to the characters escape policy. Other easy approach will be include in "Built-in Template Extensions" one for Strings, allowing encoding and unencoding options.