libgit2 / pygit2

Python bindings for libgit2
https://www.pygit2.org/

Walker? #1055

Open RossBoylan opened 3 years ago

RossBoylan commented 3 years ago

https://www.pygit2.org/commit_log.html describes a Walker object, but does not explain how to get one or how to use it. Are we to infer that the iterator produced by the call to Repository.walk() immediately above is an object of type Walker? When I debug, that seems to be the case.

For example, if heads is a list of heads and w a Walker, will

for h in heads:
    w.push(h)
    for cmt in w:
        do_stuff(cmt)

call do_stuff exactly once (i.e., the walker remembers and ignores commits already visited on later calls) for each commit in my repository that is reachable from any head?

RossBoylan commented 3 years ago

The answer to my last question appears to be no: the sample code will visit the same commits repeatedly. However, by keeping a list of the heads (actually, their associated ids) that were previously pushed, and calling hide() with each of those ids after the push, I was able to avoid visiting nodes more than once, I think.

Here's a fragment of code illustrating what seemed to work. It is not self-contained:

n = 0
traverser = Traverser()
wlk = None
for branch in repo.branches:
    traverser.newBranch(branch)
    # repo.lookup_branch(branch) returns None!
    commitid = repo.branches[branch].resolve().target
    if wlk is None:
        # pygit2.GIT_SORT_REVERSE makes traversal go forward in time
        wlk = repo.walk(commitid, pygit2.GIT_SORT_TIME | pygit2.GIT_SORT_REVERSE)
        seen = [ commitid ]
    else:
        wlk.push(commitid)
        for seenid in seen:
            wlk.hide(seenid)
        seen.append(commitid)
    for commit in wlk:
        n += 1
        traverser.newCommit(commit)