Open epage opened 2 years ago
git-branchless originally cached merge-bases on-disk
Then switched to Eden. @arxanas is it all eden or is it also the hooks incrementally updating eden's database?
it now uses a data structure called the "segmented changelog" to represent its commit graph. See https://github.com/quark-zju/gitrevset/issues/1 and commit message at https://github.com/arxanas/git-branchless/commit/f6c540fea8392223d604c4994b081b603b3df850 for more details. It answers O(log(n)) merge-base queries with very good real performance (~1ms per query). That particular implementation is GPL-3-only, unfortunately.
@arxanas
- Improve performance of git sl after rebase. In large repos, it can take several seconds to compute.
- This could probably be done using the post-update hook. There should be something about the lattice merge-base structure which we can exploit to quickly calculate new merge-bases for all the visible commits, something like if mb(local, main) = x and mb(main, main') = x then mb(local, main') = x (but there are some edge-cases due to the fact that nodes can have multiple parents).
Invalidation:
git branchless gc
(which is triggered by the pre-auto-gc
hook), but as per above, I didn't need to do that.Eden DAG:
git branchless init
(as currently it just happens the first time you need the DAG).Yeah, Eden sounds nice; too bad its GPL. Thats weird for me to say as I used to be a fan but there are enough people / companies that restrict GPL that I don't want the work I do to be restricted and I'm unlikely to create the next major piece of software that will be abused for its license. This also means I'm participating in the viral nature of license biases.
Thats a good point that incremental updates wouldn't be a big deal, even without hooks (I also want to avoid hooks). With a graph data structure, it seems like it'd be relatively easy to identify and trim all roots that aren't known (ie cache invalidation). Might be interesting to play with this at a later time if there is an alternative viable graph that supports serialization.
@arxanas how does git-branchless do for initializing on Linux? I'm assuming thats much worse than gecko-dev.
@epage here's the timing results:
$ (cd linux && rm -rf .git/branchless/dag/ && sudo purge && /usr/bin/time git sl)
35.89 real 29.53 user 1.89 sys
$ (cd linux && git rev-list --count HEAD)
1078940
$ (cd gecko-dev && rm -rf .git/branchless/dag/ && sudo purge && /usr/bin/time git sl)
21.52 real 17.32 user 1.35 sys
$ (cd gecko-dev && git rev-list --count HEAD)
787973
That's a little bit longer than you would expect just based on commit numbers (21.52s * 1078940 commits / 787973 commits = 29.47s
, but it actually takes 35s
). I'm guessing that it's somehow pessimized by linux
's use of many merge commits in contrast to gecko-dev
's avoidance of merge commits.
Please complete the following tasks
Description
In #216, we found that
git stack
can be slow calculating protected bases when some of them are far from each other or the topic branches.Version
v0.8.0
Steps to reproduce
Actual Behaviour
Takes 6s
Expected Behaviour
Should take <1s
Debug Output
No response