yogthos / Selmer

A fast, Django inspired template system in Clojure.
Eclipse Public License 1.0
976 stars 113 forks source link

Variable properties/nested accessors limited to keywords? #54

Closed wiseman closed 7 years ago

wiseman commented 10 years ago

Given an environment like {:query-params {"limit" "5", "q" "nirvana"}}, is there a way to access the members of query-params? {{ query-params.q }} doesn't seem to work.

yogthos commented 10 years ago

That's correct, at the moment the accessors have to either be keywords or numbers.

rwilson commented 8 years ago

A quick look at the django template spec doesn't reveal a corollary; it seems to not be an issue with python dictionary lookups. I think it can be reasonably fixed at the filter-parser level, in how the accessor is compiled.

Here are a couple ways it could be done:

1. For each step in an accessor sequence, try as a keyword and if that returns nil then try as a string.

The plus is that it doesn't introduce any new syntax and makes Selmer a little more consistent with some clojure mustache parsers. But, it seems undesirable, given:

(defn get-accessor [context-map accessor]
  (let [step (first accessor)]
    (when-let [v (or (get context-map step)
                     (when (keyword? step)
                       (get context-map (name step))))]
      (if (next accessor)
        (recur v (next accessor))
        v))))

Which is especially bad since it's on rendering, not compilation.

2. Support a new syntax for explicit accessor steps

Seems like the simplest solution is a sentinel char to indicate a string literal step in an accessor path. For example:

(render "Hello, {{ foo.'bar.0 }}"
        {:foo {"bar" ["Strings"]}})

=> "Hello, Strings"

In that case, using a single quote would indicate that step of the accessor should be treated as a string. It may be a downside that it coopts the single quote, which already has a meaning in clojure. And it may confuse syntax highlighters expecting quotes to be balanced.

A variation on this might be to actually quote the string as a string, e.g. "{{ foo.\"bar\".0 }} which could be ok in template files since then it won't require escaping the double-quotes constantly.

I'm going to implement a version of the latter approach in my fork and will issue a pull request.

Feedback on the approach is welcome.

rwilson commented 8 years ago

For now, I've added the version supporting double-quoting accessor steps. It's pushed here, if anybody wants to see, and deployed to clojars as [rwilson/selmer "1.1.0"].

yogthos commented 8 years ago

I think I'm leaning towards the first approach myself. The only reservation I have there is performance, might be worth benchmarking that to see if there's any real impact. It might make sense to just grab get-in from core and add the modification there. Then the performance should be more or less the same in the ideal case. Looks like reduce1 is the default implementation.

rwilson commented 8 years ago

Good idea, I hadn't thought to snag get-in, which turns out to be a hell of a lot simpler. Looks like it uses a private reduce1 fn in clojure core, but it works with the public reduce too.

Since we don't need to handle the not-found case, it's basically, (reduce get context-map accessor), which for this could be:

(defn accessor-get [m k]
  (or (get m k)
      (when (keyword? k)
        (get m (name k))))

(reduce accessor-get context-map accessor)

The perf difference should be limited. I'll going to update my fork and try this out for a bit.

rwilson commented 8 years ago

Ok, got a version I'm testing, commit 9a0e732

Also on clojars as [rwilson/selmer 1.0.3]

yogthos commented 8 years ago

Cool, if you don't find much of a performance difference (and there shouldn't be), then I'll be happy to merge and push it out. :)

rwilson commented 8 years ago

Took me a little while to get back to this and make sense of some initially confusing results.

In the simple case (maps with only keyword keys), the change significantly outperforms the original. The improvement holds for heavily nested lookups, but diminishes at each level. When a key isn't found, it falls back to baseline performance, as an additional failing get is performed looking for the key as a string.

The worst case (maps with only string keys) depends on nesting. In the flat case, it performs at baseline. In nested cases, it degrades quickly.

Overall: for an occasional string key in a nested map, it's acceptable. If the context-map isn't heavily nested, it may still be an improvement. If it's a little nested, a string key eats up the gain and gets us back to baseline. But, if the map is heavily nested and has many layers of string keys, then the lookup is significantly slower.


First, why is the change faster at supporting the current all-keyword keys behavior? That's weird, right?

So, (get-in m ks) is implemented as (reduce1 get m ks). I don't know much about reduce1, but it's used internally by a number of core fns before reduce is defined, including get-in. Also, reduce1 is implemented in clojure while reduce seems to be a thin clojure delegate to java code (I stopped at that point in the rabbit hole).

It turns out that reduce is significantly faster than reduce1, unless I'm overlooking some behavior of reduce or of criterium, which I used for benchmarking.

Here's the data, for reference it was done with (criterium/quick-bench (reduce-fn get m ks)) where reduce-fn was either reduce or reduce1, map m was nested 1, 3, or 10 levels with keyword keys, and ks was a sequence of the full depth of the map.

Function Depth Time (ns) Std Deviation (ns)
reduce1 1 73.3 3.3
reduce1 3 102.2 4.1
reduce1 10 178.6 4.8
reduce 1 29.6 1.2
reduce 3 50.6 2.7
reduce 10 130.1 2.0

Run with (get-in m ks) instead of (reduce1 get m ks) is the same, which makes sense. This is the control group for the existing filter-parser behavior.


So, with that explaining the increase in best-case performance, here's the benchmark data for the introduction of filter-parser/get-accessor. For the percentages, positive is faster, negative slower.

Depth Control (ns) Best (ns) Worst (ns) Best (%) Worst (%)
1 77.7 29.6 70.3 61.9% 9.5%
2 91.2 44.1 120.5 51.7% -32.1%
4 108.2 70.8 225.6 34.6% -108.5%
8 159.6 125.3 421.1 21.5% -163.9%
16 257.0 218.1 806.0 15.1% -213.6%

The problem with the worst case is after the get-by-keyword fails, it performs a check on the key to see if it's a keyword, before trying to look it up as a string.

Anyway, if you think it's an acceptable solution, let me know and I'll create the PR. Or, if you have any suggestions on how to mitigate the worst case, I'm open to those too. I think it might be ok, because for the existing functionality it's negligible to improved, while supporting string key lookups incurs the extra cost, but is opt-in.

All tests run on a macbook pro with similar background conditions. Occasionally, criterium found outliers, in which case that test was rerun until there were no outliers.

yogthos commented 8 years ago

Awesome, thanks for doing a lot of legwork on the performance benchmarking. I think the results are acceptable and I would expect that most use cases will fall under the improved performance. I also agree that since the string keys are opt-in this isn't an issue in general.

So I'd be happy to merge the changes in and cut a new release. :+1: