casid / jte

Secure and speedy templates for Java and Kotlin.
https://jte.gg
Apache License 2.0
819 stars 59 forks source link

Potential escaping optimization #307

Open agentgt opened 10 months ago

agentgt commented 10 months ago

@casid as promised I wanted to share with you some potential optimizations I found while modernizing JMustache (a runtime based Mustache engine) escaping.

https://github.com/jstachio/escape-benchmark

The benchmark just tests escaping algorithms and not full template engines.

The benchmark and algorithm of StreamEscaperBenchmark.streamSubstring2 could possibly be put in JTE. I think currently JTE is analogous to StreamEscaperBenchmark.streamCharAt but without the inner loop. JStachio I believe is closest to StreamEscaperBenchmark.streamSubstring at the moment.

Anyway feel free to close the issue or move to discussion but wanted to share the benchmark repo in case you might find it interesting.

casid commented 10 months ago

Thanks for sharing.

I'm quite happy with the current jte implementation though. It is very easy to understand, quite fast and does not do any memory allocations. https://github.com/casid/jte/blob/main/jte-runtime/src/main/java/gg/jte/html/escape/Escape.java

agentgt commented 10 months ago

FWIW it’s unlikely I will change jstachio escaper (which appears to be similar to JTE) based on the benchmark.

I just thought it was bizarre an array lookup beat a switch.

agentgt commented 10 months ago

@casid

I improved the documentation of the benchmark so if you can check it one last time.

JStachio and JTE do: StreamEscaperBenchmark.streamSubstringInlineSwitch. jstachio version (EDIT to link to original inline switch version) which I assume you are familiar with as you found the missing escape characters security bug (thank you again btw!). I'm not sure if you were influenced by JStachio escaping given the timing of found bug and when you rewrote escaping so I feel obligated to say I never benchmarked it but now have.

It was one of the worst performers and is why I kind of reached out to you.

I am not sure why switch expressions are so slow.

casid commented 10 months ago

Thanks for the update @agentgt.

I followed IntelliJ's recommendation and migrated to the new switch expression, when jte 3 required Java 17 as minimum version. If I read your benchmark results correctly, changing this to an old fashioned switch doubles the throughput? That's a change worth considering!

agentgt commented 10 months ago

I'm still testing across various machines and operating systems. It appears true Intel x64 machines are smart enough to make the inline switch expression fast and in some cases faster than the array lookup.

But AMD, and Apple Silicon (arm64) do not seem to optimize for it.

The reason I did the benchmark was because JMustache current escaping is very slow but I knew if I changed it I needed proof for the PR that whatever I switched to would be better (while JMustache does not have a lot of github stars it has an enormous amount of usage).

If inline switch expression was on par than that would have been more reason to base on JDK 17 instead of JDK 9 that JMustache is currently on.

Anyway I switched to the array lookup for JStachio to be consistent with JMustache as JMustache is kind of JStachio's reflective brother.

agentgt commented 10 months ago

Here are some of the results:

https://github.com/jstachio/escape-benchmark/blob/main/results.md

If you have spare cycles (pun intended) maybe test on your Windows Beast 😄

I have a surprisingly lack of access to Genuine Intel machines.

casid commented 9 months ago

Thanks for the update @agentgt.

I followed IntelliJ's recommendation and migrated to the new switch expression, when jte 3 required Java 17 as minimum version. If I read your benchmark results correctly, changing this to an old fashioned switch doubles the throughput? That's a change worth considering!