Improve performance - Githubissues

sifive / wit

Workspace Integration Tool

Apache License 2.0

23 stars 13 forks source link

Improve performance #171

Closed jackkoenig closed 5 years ago

jackkoenig commented 5 years ago

I noticed some low-hanging fruit performance issues, this PR fixes 2 of them:

passbyval was eating a ton of time, the code shows that only shallow clones are necessary
Some git commands were repeated, this caches the most common ones

There is a small risk of bugs here, I may have misunderstood the uses of passbyval and maybe my caching strategy is incorrect. In fact, I got the caching wrong the first time which is why the GitRepos now record "known commits" and will only cache things keyed by known commits.

In any case, this is a huge improvement in performance to common operations like wit status, benchmarking results to follow

jackkoenig commented 5 years ago

Benchmarking results on an internal SiFive package with lots of dependencies

“wit status”
Dual-core Macbook Pro
master: 9.07 s +/- 0.52
passbyval: 8.20 s +/- 0.86
caching: 6.68 s +/- 0.72

Linux Server
master: 2.92 s +/- 0.28
passbyval: 1.94 s +/- 0.15
caching: 1.78 s +/- 0.19

Caching didn't help as much as I had hoped on the Linux server, but helps a ton on my Macbook. To do much better than this, we need to avoid so many subprocesses. My guess is that integrating GitPython would be the best approach. Alternatively, we could make it possible to issue multiple git subprocesses at once and overlap the subprocess cost. I think the latter is kind of hard but probably doable.

jackkoenig commented 5 years ago

~~It appears that using the lru_cache utility erases performance gains:~~ See EDIT below

wit status on different large workspace than the above:

Linux server

commit	runtime
master	3.33 s +/- 0.20
passbyval	2.08 s +/- 0.24
manual caching	1.77 s +/- 0.16
use lru	2.37 s +/- 0.23
fix lru	1.72 s +/- 0.17

My MBP

commit	runtime
master	6.71 s +/- 0.17
passbyval	5.88 s +/- 0.12
manual caching	4.78 s +/- 0.1
use lru	6.03 s +/- 0.20
fix lru	4.80 s +/- 0.06

EDIT: So I didn't know what I was doing, but with @richardxia's help, the lru caching actually caches now and is the same speed as manual caching