Closed Breee closed 3 years ago
Hi Breee,
Yes, this is something that can happen for many repositories. So far, I have mined almost 500 repositories with git2net. I have not yet found a case where git2net truly got stuck in the sense that it ran into an endless loop. However, I have come across some commits that took unreasonably long.
The good news is that all other commits are already stored in the resulting database. However, the unfinished commits might take a while depending on their content. Most likely, the commits are very large, i.e. containing many individual modifications or very large individual modifications.
The git2net tutorial (https://github.com/gotec/git2net/blob/master/TUTORIAL.ipynb) provides some pointers on how you can deal with them. First, you can look at the metadata of the remaining commits using git2net.mining_state_summary(git_repo_dir, sqlite_db_file). This also outputs the commit hash of the remaining commits so you can look them up on GitHub.
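A minimal sketch of that lookup step, for illustration only. The call to git2net.mining_state_summary is the one named above; the helper names commit_url and show_remaining_commits are hypothetical, and the URL pattern is the standard GitHub commit page layout.

```python
def commit_url(owner, repo, commit_hash):
    """Build the GitHub web URL for a single commit (hypothetical helper)."""
    return f"https://github.com/{owner}/{repo}/commit/{commit_hash}"


def show_remaining_commits(git_repo_dir, sqlite_db_file):
    """Output git2net's summary of the mining state for a repository."""
    # Imported here so commit_url stays usable without git2net installed.
    import git2net

    # Call as quoted in the reply above; it reports the commits still to be mined.
    git2net.mining_state_summary(git_repo_dir, sqlite_db_file)
```

Each hash reported by the summary can then be passed to commit_url together with the repository's owner and name to get a link to open in the browser.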
In most cases, I found that the commits contained either undetected binary files or were full imports of other projects. Usually, that justified excluding them from my analyses as they do not represent typical development behaviour. However, your use case for git2net might differ from mine :)
If you want to exclude them, you can set the maximum number of modifications that git2net allows per commit (max_modifications). Alternatively, you can skip commits that take longer than a specified time (timeout).
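As a rough sketch, the two options might be passed like this. It assumes that max_modifications and timeout are keyword arguments of git2net.mine_git_repo, that timeout is given in seconds, that 0 means "no limit", and that re-running the mining on an existing database continues with the remaining commits. The helper names are hypothetical, so please check the tutorial for the exact signature.

```python
def mining_limits(max_modifications=0, timeout=0):
    """Collect the two limits as keyword arguments (hypothetical helper).

    Assumption: 0 means "no limit" for both options.
    """
    if max_modifications < 0 or timeout < 0:
        raise ValueError("limits must be non-negative")
    return {"max_modifications": max_modifications, "timeout": timeout}


def resume_mining(git_repo_dir, sqlite_db_file, **limits):
    """Re-run the mining with limits so that oversized commits are skipped."""
    # Imported here so mining_limits stays usable without git2net installed.
    import git2net

    # Assumption: mine_git_repo accepts these keyword arguments.
    git2net.mine_git_repo(git_repo_dir, sqlite_db_file, **mining_limits(**limits))
```

For example, resume_mining(git_repo_dir, sqlite_db_file, max_modifications=1000, timeout=600) would skip commits with more than 1000 modifications or more than ten minutes of processing time, under the assumptions above. Both thresholds are made-up values, not recommendations.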
If you want to mine them, I'm afraid you'll have to wait until they're done. I have already tried to optimise these cases but found that the performance is limited by the runtime of git blame, especially in these cases.
Best, Christoph
Hey Christoph,
Thanks for the detailed information. After waiting a night, the mining completed successfully!
Greetings,
We've run into an issue when mining several repositories. For example, git2net hangs when mining the commits. 433 commits were processed pretty quickly, but the last 2 remaining ones were not.
I've waited for about an hour, and it is still not moving further.
When I cancel the procedure I get:
Do you have any idea why that can happen? Do we just have to wait much longer, or can it truly get stuck?