Merge perf/dynamic-programming

PythonNut commented 8 years ago

See #75 and related discussion.

I'll look into writing all my changes up tomorrow. Expect to see it added here, and amended to the original commit. Also expect to see the commits squashed.

@lewang any idea why cl-letf* doesn't seem to be working in emacs24.3.1?

lewang commented 8 years ago

@PythonNut I fixed the tests and did some refactoring -- I now prefer explicit parameter passing over closing over variables from working with clojure.

PythonNut commented 8 years ago

@lewang thanks. I have a few more experiments to do (and some tests to write), but we seem on good track for a merge in the near future.

lewang commented 8 years ago

@PythonNut I added time limited test for optimization.

PythonNut commented 8 years ago

I always had the strange feeling that the dynamic programming wasn't quite working. Sure, the new version is much faster, but it still exhibited some O(2ⁿ) characteristics.

Turns out, it's because of the way I'm memoizing nil. Formerly, nil was indistinguishable from "no value in cache". As of a9f26b2, flx-score is behaving about O(nm) which is awesome. Even super long queries finish in milliseconds.

PythonNut commented 8 years ago

I think I've done everything I planned on doing. If you have no inhibitions, I'm planning on merging soon.

lewang commented 8 years ago

I would like to see more comments and documentation (in README or elsewhere) of your optimizations before merging. Let's try to avoid bus factor problems. :)

PythonNut commented 8 years ago

I've added a good amount of source comments and a docstring for flx-find-best-match. Is that sufficient?

EphramPerdition commented 8 years ago

Hi, I was looking forward to a massive speedup in flx and so I took the new cool for a spin:

(setq gc-cons-threshold 20000000) ; flx recommended value
(benchmark-run 10000
  (flx-score "abes-long-night-out" "alnu"))

; master 
(1.851651035 1 0.16587543100000346)

; #76
(3.9531513560000002 10 1.698554310999981)

(setq gc-cons-threshold 800000) ; emacs default

(benchmark-run 10000
  (flx-score "abes-long-night-out" "alnu"))
; master 
(2.189949227 3 0.48683958200000177)

; #76
(6.391774491 25 4.122379336999998)

Looks like the processing time is unchanged but the new version involved dramatically more garbage, so GC overhead makes this 2-3 times slower overall. Perhaps micro-benchmarking has been lying to you about real-world perf. I realize this gives better asymptotics but, constants matter too.

I was wondering whether a much faster flx score function could be used directly to pick the TopN over all candidates in helm. But, at 2 seconds for 10000 strings that won't work very well, limitations of working in elisp I suppose.

flx still excels when ranking a relatively small number of good candidates though. In that scenario, squashing the complexity corner cases is probably still a worthwhile thing to go for.

PythonNut commented 8 years ago

@EphramPerdition

This is because lexical-binding was enabled. (fixed in db1a328) It should exhibit similar performance characteristics for simple matches now (my benchmarks show no conclusive difference).

I was wondering whether a much faster flx score function could be used directly to pick the TopN over all candidates in helm. But, at 2 seconds for 10000 strings that won't work very well, limitations of working in elisp I suppose.

I'm, unfortunately, unaware of a way to dramatically speed up flx in the case you describe. Right now, the vast majority of the time is spent in flx-process-cache. To me, the heatmap code looks pretty dense and non-optimizable.

lewang commented 8 years ago

That's interesting. I thought lexical-binding was supposed to yield performance improvements. But yeah, we don't need to enable it.

@EphramPerdition The comparison is very close now for me. Can you confirm?

(setq gc-cons-threshold 20000000) ; flx recommended value
(benchmark-run 10000
  (flx-score "abes-long-night-out" "alnu"))

(0.9688869999999999 1 0.05320400000000092)

(benchmark-run 10000
  (flx-score-old "abes-long-night-out" "alnu"))

(0.9650080000000001 1 0.05469099999999827)

(setq gc-cons-threshold 800000)

(benchmark-run 10000
  (flx-score "abes-long-night-out" "alnu"))

(1.4284350000000001 8 0.38395399999999924)

(benchmark-run 10000
  (flx-score-old "abes-long-night-out" "alnu"))

(1.205603 6 0.27712799999999405)

EphramPerdition commented 8 years ago

Yes, the performance is back to baseline.

I had no idea lexical-binding involved such a perf penalty either. Good to know.

lewang commented 8 years ago

I asked on emacs-devel about the lexical-binding slowdown. @PythonNut Can you cut a new release branch with #76 and #77 and bump the version?

PythonNut commented 8 years ago

@lewang this is now part of #78.

lewang / flx

Merge perf/dynamic-programming #76