mwunsch / handlebars.scala

A Scala implementation of the Handlebars templating language (a superset of Mustache).
Apache License 2.0
112 stars 40 forks source link

let's tune performance #44

Closed gaydenko closed 10 years ago

gaydenko commented 10 years ago

Is there some comparison/benchmarking results? I see, benchmarking is very subjective, but being accompanied with test conditions they have some sense :) So, some simple comparisons with, say, freemarker and scalate would be informative.

gaydenko commented 10 years ago

OK, I have got a small test comparing with FreeMarker. The model data contains a list of maps of nested case classes. FM is ~4.3 times faster. So I have looked at handlebars.scala with profiler and found the main bottle neck. To cure just replace a string in DynamicBinding:

    //debug("Invoking method: '%s' with arguments: [%s].".format(method.getName, args.mkString(",")))
    debug(s"Invoking method: '${method.getName}' with arguments: [${args.mkString(",")}].")

I got boosting 3x (three) times.

Just on the digging in the code I have found scala.xml and a map from Google are pointless dependencies. The map can be replace with java.util.concurrent.ConcurrentHashMap, and scala.xml call to escape a string can be replaced with, say

  protected def escapeMustache(value: String, unescaped: Boolean = true): String = {
    if (unescaped) {
      value
    } else {
      val len = value.length
      val s = new StringBuilder
      var pos = 0
      while (pos < len) {
        val ch = value.charAt(pos)
        if (ch == '<') s.append("&lt;")
        else if (ch == '>') s.append("&gt;")
        else if (ch == '&') s.append("&amp;")
        else if (ch == '"') s.append("&quot;")
        else s.append(ch)
        pos += 1
      }
      s.toString
    }
  }

Nevertheless performance still ~1.5 times below FM (and the last one isn't known to be fast). Probably, the team has ideas to further optimization to make the project suitable for wider production use cases. Hasn't it?

chrhicks commented 10 years ago

Thanks for looking into this!

Performance is important to us and we do plan on making improvements soon.

I'm curious if you out your benchmark code anywhere? I'd like to check out how you did it :)

On Friday, August 22, 2014, Andrew Gaydenko notifications@github.com wrote:

OK, I have got a small test comparing with FreeMarker. The model data contains a list of maps of nested case classes. FM is ~4.3 times faster. So I have looked at handlebars.scala with profiler and found the main bottle neck. To cure just replace a string in DynamicBinding:

//debug("Invoking method: '%s' with arguments: [%s].".format(method.getName, args.mkString(",")))
debug(s"Invoking method: '${method.getName}' with arguments: [${args.mkString(",")}].")

I got boosting 3x (three) times.

Just on the digging in the code I have found scala.xml and a map from Google are pointless dependencies. The map can be replace with java.util.concurrent.ConcurrentHashMap, and scala.xml call to escape a string can be replaced with, say

protected def escapeMustache(value: String, unescaped: Boolean = true): String = { if (unescaped) { value } else { val len = value.length val s = new StringBuilder var pos = 0 while (pos < len) { val ch = value.charAt(pos) if (ch == '<') s.append("<") else if (ch == '>') s.append(">") else if (ch == '&') s.append("&") else if (ch == '"') s.append(""") else s.append(ch) pos += 1 } s.toString } }

Nevertheless performance still ~1.3 times below FM (and the last one isn't known to be fast). Probably, the team has ideas to further optimization to make the project suitable for wider production use cases. Hasn't it?

— Reply to this email directly or view it on GitHub https://github.com/mwunsch/handlebars.scala/issues/44#issuecomment-53118498 .

gaydenko commented 10 years ago

Sorry, unfortunately the test was done on real internal (that is closed) project's models. Hope, you'll forgive me. I guess it is easy to identify speed up on any not trivial model.

gaydenko commented 10 years ago

A small further experiments result. Complete logging removing and replacing List with Seq (that is List[X] with Seq[X], List(x) with Seq(x), List.empty with Seq(), toList with toSeq) gives us another 10-15%.

BTW, I guess a logging can be avoided in not too large library. OTOH I have found very useful strictness FM is provided: not found key path target is treated as error (rather just replacement with empty string). Just for consideration..

gaydenko commented 10 years ago

This is a scetch of my test:

  val items = Await.result(itemsFu, 10.seconds)

  val data = Map("header" -> someCaseClass, "list" -> items)

  //-------------------------------------------------------------------------------

  def ftlStep = flt("ftltemplate", data)

  //-------------------------------------------------------------------------------

  val hb = someString
  import com.gilt.handlebars.scala.binding.dynamic._
  val t = com.gilt.handlebars.scala.Handlebars(hb)
  def hbStep = t(data)

  //-------------------------------------------------------------------------------

  def repeat = (1 to 1000).foreach { _ =>
    Perf.run("freemarker", 1, 1) { ftlStep }
    Perf.run("handlebars", 1, 1) { hbStep }
  }

  //-------------------------------------------------------------------------------

  repeat

where Perf.run is just

object Perf {

  def run(title: String, warmSec: Int, runSec: Int)(code: => Unit): Unit = {
    Log.console.info(s"$title, warming up $warmSec seconds..")
    val stopWarmAt = System.nanoTime + warmSec * 1000000000L;
    while (System.nanoTime < stopWarmAt) { code }
    Log.console.info(s"  ...is warmed up, running $runSec seconds..")
    var count = 0L
    val stopRunAt = System.nanoTime + runSec * 1000000000L;
    while (System.nanoTime < stopRunAt) {
      code
      count = count + 1
    }
    Log.console.info(s"Done with count $count")
  }
}

Then I just start it and look for result stabilization (that is waiting for JVM warming up).

gaydenko commented 10 years ago

Just on the way have noticed object CachingHandlebars starts with

  def apply[T](
    template: String,
    partials: Map[String, Handlebars[T]],
    helpers: Map[String, Helper[T]],
    sourceFile: Option[String])(implicit f: BindingFactory[T]): Handlebars[T] = {

    val parseResult = HandlebarsGrammar(template)

    parseResult.map {
      program =>
        CachingHandlebarsImpl(HandlebarsGrammar(template).get, partials, helpers, sourceFile)
    }.getOrElse(sys.error("Could not parse template:\n\n%s".format(parseResult.toString)))
  }

I guess it must be

  def apply[T](
    template: String,
    partials: Map[String, Handlebars[T]],
    helpers: Map[String, Helper[T]],
    sourceFile: Option[String])(implicit f: BindingFactory[T]): Handlebars[T] = {

    val parseResult = HandlebarsGrammar(template)

    parseResult.map {
      program => CachingHandlebarsImpl(program, partials, helpers, sourceFile)
    }.getOrElse(sys.error("Could not parse template:\n\n%s".format(parseResult.toString)))
  }

Or have I missed something?

gaydenko commented 10 years ago

I have spent these days playing with the project sources tree and have found modifications resulting in my test case to be ~1.5 times (besides already mentioned 3x) faster and - very important - with really much more transparent heap graph (I use Java VisualVM for profiling). Almost all files were modified :) If somebody from the team is interested in, I'm ready to send attached zipped source tree directly (sorry, I don't like forking... and all these ritual dances around github :)).

/* Now handlebars.scala is just ~15% slower than FreeMarker (I have also optimized a Scala wrapper for the last one), but has got much better memory profile. As a result I'm going to try to replace FreeMaker with handlebars.scala (from, at the moment, own source tree) in real middle-side project. Thanks the team! */

chrhicks commented 10 years ago

If love to see it ! Can you put it on drop box or something like that. Or send it to chrhicks at gmail

On Sunday, August 24, 2014, Andrew Gaydenko notifications@github.com wrote:

I have spent these days playing with the project sources tree and have found modifications resulting in my test case to be ~1.5 times faster and - very important - with really much more transparent heap graph (I use Java VisualVM for profiling). Almost all files were modified :) If somebody from the team is interested in, I'm ready to send attached zipped tree directly (sorry, I don't like forking... and all these ritual dances around github :)).

— Reply to this email directly or view it on GitHub https://github.com/mwunsch/handlebars.scala/issues/44#issuecomment-53192862 .

gaydenko commented 10 years ago

@chrhicks Have you got it? Or check your spam filter :)

chrhicks commented 10 years ago

Got it, thanks!

I am on vacation right now. Give me a few days to check it out :)

On Sunday, August 24, 2014, Andrew Gaydenko notifications@github.com wrote:

@chrhicks https://github.com/chrhicks Have you got it? Or check your spam filter :)

— Reply to this email directly or view it on GitHub https://github.com/mwunsch/handlebars.scala/issues/44#issuecomment-53212066 .

chrhicks commented 10 years ago

@gaydenko Applied as much of your changes as I could. Thanks for the interested in making scandlebars better!

It turned out the biggest improvement was unused Loggable trait. It went from ~4.9 ms to render in the test to 1.3ms. The rest of the enhancements have it down to under 1ms. You can see the benchmark I used (was used previously for the 0.0.x version of handlebars.scala): HandlebarsPerfSpec