sillsdev / chorus

End-user collaboration library via 3-way xml merging and hg dvcs under the hood
6 stars 26 forks source link

Speed up Send/Receive operations significantly #258

Closed rmunn closed 3 years ago

rmunn commented 3 years ago

The Send/Receive code does a lot of hg log -rREV --template "{node}" calls in order to learn the long hash of a revision number. This happens very often during a Send/Receive process: the Chorus logs are full of hg log -r0 operations. Thing is, every single call to hg log incurs a cost as a Python process has to be spun up every time this happens. And in any given repository, a specific revision number wil always refer to the same revision with the same long hash, so each long hash only has to be calculated once for any given revision number. So we'll cache the output of log -rREV operations and return the cached output if the same operation runs a second time. (Other operations like hg parents aren't safe to cache, as their result will change during a Send/Receive operation as new commits are added to the repo. Only log -rREV calls are 100% safe to cache).

This is PR #257 applied to master (#257 is applied to the lfmerge branch) as suggested in https://github.com/sillsdev/chorus/pull/257#pullrequestreview-734803492


This change is Reviewable

rmunn commented 3 years ago

I wasn't seeing any occurrences of the "Using cached result" message in the Chorus logs while TeamCity ran the unit tests, and I just found out from looking at the logs (and then the source) that HgRepository.Execute() adds a sneaky little space character at the end of the command, so command.EndsWith("foo") wasn't returning true because the command actually ended with "foo ". Sigh. THIS version should actually fire the speedup code.

On the plus side, this means I can compare the time it took TeamCity to run the unit tests before and after commit 05fa900 and I'll have a poor man's benchmark. I still want to run a proper benchmark, but that'll help me get initial speedup numbers.

rmunn commented 3 years ago

Preliminary results: the TeamCity build including unit tests took 1374 seconds before commit 05fa900 went in (meaning it wasn't actually running the optimizations) and the build after commit 05fa900 went in took 858 seconds. That's 62.5% of the previous value, which means that 1/3 of the build's time was being spent running hg log over and over.

Unfortunately, now four unit tests are failing because they relied on HgRepository.Identifier changing from "000000" to a non-zero value when commits are added, and my code was caching the zero result from the first time the property was accessed. I'll have to look into those tests and figure out whether it's safe to change their logic so that they expect the identifier value to be cached, or whether I should do something else.

First thought: add some logic to the caching to examine the output and, if (and only if) the output is an all-zeroes value, do not cache that value as it's going to change in the near future. That should, I think, take care of all four of the failing unit tests. But I'm running out of time to test that today.

rmunn commented 3 years ago

Bingo. That worked. One more commit and I'll feel ready to merge this. I wrote the check to use only a dozen zeroes, but I think I should compare against the full all-0 SHA value, which is encoded somewhere in the code, so I can use a reference instead of a magic string. Once that commit is merged and tests passing, I'll merge this.