elves / elvish

Powerful scripting language & versatile interactive shell
https://elv.sh/

Recent performance/speed benchmarks? #1373

Open dumblob opened 3 years ago

dumblob commented 3 years ago

Today I've come across an older Advent of Code thread and was surprised by the slowness of Elvish. Is Elvish nowadays at least 10x faster?

Related: https://github.com/elves/elvish/issues/1355

dumblob commented 3 years ago

If it's faster than in 2019/2020, you could consider adding it to https://github.com/kostya/benchmarks/, which is as of now a state-of-the-art unbiased microbenchmark suite.

krader1961 commented 3 years ago

@dumblob, the focus has been on:

a) Adding needed features, such as the math, re, and str modules.

b) Increasing test coverage.

c) Improving the documentation.

d) Making changes that move the project closer to a 1.0.0 release.

Improving performance ranks last on the list of priorities because performance is generally good enough for an interactive shell and shell scripts. When that isn't true, the necessary changes are made. For example, until recently using the fzf program to select from the command history was annoyingly slow, so I implemented a set of changes that made fzf fast enough to be indistinguishable from the builtin history list mode, which is implemented entirely in Go. See issue #1135.

The project would definitely appreciate help from anyone motivated to profile the code when running realistic programs and improve its performance. Hint, hint. 😸
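For anyone motivated to try, a minimal starting point might be Go's built-in profiler. This is just a sketch; the package path and the presence of Go benchmarks under pkg/eval are assumptions on my part, not an agreed recipe:

```elvish
# Sketch (assumed paths): profile the evaluator's Go benchmarks, then
# inspect the hottest functions. These commands run the same from any shell.
go test -bench=. -cpuprofile=cpu.out ./pkg/eval
go tool pprof -top cpu.out
```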

dumblob commented 3 years ago

The priorities are understandable. I just wanted to get up to date 😉.

I'm too new to Elvish to write good scripts/apps in it, but I'll leave this open in case anyone wants to write a brainfuck interpreter in Elvish, etc., as listed in https://github.com/kostya/benchmarks/.

Btw, with interpreters I tend to compare against Python (which is notoriously hard to optimize), on the reasoning that transpilation to Python should not result in faster execution.

krader1961 commented 3 years ago

@xiaq I recently stumbled across an article that explains how Valgrind's cachegrind tool can be leveraged to capture useful performance metrics with very little noise (variability): https://pythonspeed.com/articles/consistent-benchmarking-in-ci/. Leveraging that tool would be a good basis for writing Elvish language benchmarks whose results can be compared over a long baseline.
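For concreteness, a run along these lines is the shape of what the article describes. bench.elv is a hypothetical benchmark script; the flags are standard Valgrind options:

```elvish
# Sketch: run an Elvish script under Cachegrind. The instruction counts it
# reports are nearly deterministic, unlike wall-clock time.
valgrind --tool=cachegrind --cachegrind-out-file=bench.cg elvish bench.elv
cg_annotate bench.cg
```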

krader1961 commented 2 years ago

I took a few minutes to install Valgrind on one of my Linux VMs, along with the cachegrind.py program mentioned in the document I linked to in my previous comment. I then benchmarked my Elvish program that compares `for _ [(range $n)]` with `range $n | each`. The short answer: the single metric produced by cachegrind.py closely matches the actual difference in run time when comparing the behavior of commit 64661d6b to the prior commit: 196600812 versus 576508958, a ratio of approximately 2.9.

I'm more convinced than ever that collecting execution data via Cachegrind is the best way to track how changes affect Elvish performance over a long baseline, and that its use should be integrated into the Elvish continuous integration ecosystem. Not to mention making it easy to run via a make target on a developer's local system, so that anyone can readily determine the impact of a change they are working on.
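The two constructs under comparison, reconstructed as a self-contained sketch (the actual benchmark script isn't shown here, and the value of $n is an arbitrary choice):

```elvish
var n = 100000

# Variant 1: capture the whole range in a list, then iterate with for.
time { for _ [(range $n)] { nop } }

# Variant 2: stream the range through a pipeline with each.
time { range $n | each {|_| nop } }
```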

krader1961 commented 2 years ago

Another data point is the time to get the first interactive prompt. Prior to commit 64661d6 it is 127 ms on the Linux VM I used for the prior measurements, with a cachegrind.py benchmark metric of 316144122. With that commit the time is 106 ms and the metric is 299074197. That is a best of three in each case, since there is more variability here than is typical for non-interactive benchmarks. The before/after ratio is ~0.83 for the elapsed time and ~0.95 for the cachegrind.py metric. Not a great correlation, but still a positive one.
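One crude way to approximate this kind of measurement, as a sketch only: it times a fresh process running a trivial command, which is a proxy for prompt latency rather than the real interactive path, and it assumes the -norc and -c flags behave as in current releases.

```elvish
# Sketch: time a fresh elvish process with rc.elv skipped, so that startup
# cost dominates the measurement. Best of three by hand, as above.
time { e:elvish -norc -c nop }
```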

See also issue #1355, which discusses whether, and how, to collect performance data for Elvish programs. I am now inclined to think we can, and should, do so using Valgrind's cachegrind tool and a set of Elvish benchmark programs; specifically, programs written in Elvish, separate from the Go benchmarks exercised via `go test -bench=. ./...`. Benchmarks written in Go certainly have value for this project, but they are mostly irrelevant to anyone evaluating whether Elvish is getting more performant over time. For most people the question is how programs written in Elvish perform, not whether a specific Go function inside Elvish has gotten faster or slower.

An obvious question is how to handle all the explicit tests of the Elvish language that are written in Go, with respect to tracking performance. It seems to me the answer is that they should be treated as basic correctness tests, not performance tests. For the purposes of this issue those Elvish tests, wrapped by Go tests, should be ignored. Instead, there should be a distinct set of Elvish programs that are executed under the control of `valgrind --tool=cachegrind`, roughly as sketched below.
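A minimal sketch of such a harness, assuming a benchmarks/ directory of .elv programs (the directory layout and output naming are my invention, not an agreed design):

```elvish
# Sketch: run every benchmark program under Cachegrind, writing one output
# file per benchmark so results can be compared across commits over time.
for bench [benchmarks/*.elv] {
  valgrind --tool=cachegrind --cachegrind-out-file=$bench'.cg' elvish $bench
}
```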