gotec / git2net

An Open Source Python package for the extraction of fine-grained and time-stamped co-editing networks from git repositories.
https://git2net.readthedocs.io
GNU Affero General Public License v3.0

Add paper workflow as an example in the repository #29

Closed: bkmgit closed this issue 2 years ago

bkmgit commented 2 years ago

Is your feature request related to a problem? Please describe.
The notebook reproducing the paper results, available at https://zenodo.org/record/2587483#.XK4LPENoSCg, is very nice, but may be out of date with the current version of the library.

Describe the solution you'd like
Add the examples to the current repository to supplement the Tutorial, and perhaps add tests to ensure they work with updated dependencies.

Describe alternatives you've considered
None so far.

Additional context
Happy to help with a pull request on this.

gotec commented 2 years ago

Hi Benson,

I agree that the notebook on Zenodo is out of date with the current library, as I have spent a lot of time working on git2net since releasing the paper. git2net now has a number of new features (e.g. processing merge commits), and I have fixed several bugs, notably one related to the options used to call git blame. As a result, the latest version of git2net obtains additional co-edits from merges and sometimes different co-edits from other commits. In addition, the igraph repository has received many new commits, which would change the mined data, or at least the latest version of it, if the repository were cloned again today. I was expecting significant changes with the further development of the tool, so I have so far kept the reproducibility package for the original paper and the current version of git2net as two separate things.

That said, I agree with you that the current tutorial could be nicely extended with an additional section at the end showing things one could do with git2net, e.g. examples of time-series analysis and code ownership. This could be very beneficial for new users trying to get started.

Is this what you were looking for? Or were you hoping to reproduce the results from the paper in the tutorial exactly?

In the first case, I would be happy to accept a pull request if you have the time to create one. However, I'd suggest continuing with the git2net repository (vs the igraph repository used in the reproducibility package) for the examples, as this is the running example throughout the tutorial.

Cheers, Christoph
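
For readers unfamiliar with the code-ownership idea mentioned in the comment above, here is a minimal, library-independent sketch. It does not use the git2net API: the `blame_records` rows and the `ownership_shares` helper are hypothetical and exist only for illustration. It assumes ownership is measured as the share of a file's lines last edited by each author, which is one common definition and not necessarily the one used in the paper or the tutorials.

```python
from collections import Counter

# Hypothetical blame-style records: (file, line number, author of the last edit).
# These rows are made up; real data would come from a mined repository.
blame_records = [
    ("a.py", 1, "alice"),
    ("a.py", 2, "alice"),
    ("a.py", 3, "bob"),
    ("b.py", 1, "bob"),
]


def ownership_shares(records):
    """Return, per file, each author's share of the lines they last edited."""
    counts = {}
    for path, _line, author in records:
        counts.setdefault(path, Counter())[author] += 1
    return {
        path: {author: n / sum(c.values()) for author, n in c.items()}
        for path, c in counts.items()
    }


shares = ownership_shares(blame_records)
for path, per_author in shares.items():
    print(path, per_author)
```

Other definitions (for example, weighting by characters instead of lines) would only change how the counts are accumulated.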

bkmgit commented 2 years ago

My interest is in seeing what the tool can do, and the paper gives a more extensive overview than the example. I will create a pull request over the next week or so.

gotec commented 2 years ago

Perfect, then we completely agree. Thanks a lot for your effort. I am looking forward to the pull request :)

Cheers, Christoph

bkmgit commented 2 years ago

Added some of the content from the paper that worked without requiring too many changes. Thanks for the examples.

gotec commented 2 years ago

Thanks to your interest, I've made the time to significantly extend the tutorials for git2net and move them to a separate repository. The tutorials now also contain the remaining examples from the original reproducibility package. I have also brought back the integration of Binder that you proposed in the pull request.

I hope this resolves your issue and ultimately gives everyone a better experience getting started with git2net.

Let me know if you find any further issues with them; otherwise, feel free to close this issue.

Cheers, Christoph