mauricioaniche / repodriller

A tool to support researchers conducting mining software repositories (MSR) studies

(Before looking into RepoDriller, I suggest you check out PyDriller, a Python version of RepoDriller, which is now faster and easier to use! I am keeping this repo here for historical purposes, but I don't plan to update it anymore!)

RepoDriller

RepoDriller is a Java framework that helps developers mine software repositories. With it, you can easily extract information from any Git repository, such as commits, developers, modifications, diffs, and source code, and quickly export it to CSV files.
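To make the "extract per-commit data, export CSV" idea concrete, here is a minimal, self-contained sketch in plain Java. It does not use RepoDriller's API: the class `CommitCsvSketch`, its methods, and the sample commit data are all hypothetical, and it assumes RFC 4180 style quoting for the output. See the manual folder and the examples for the framework's real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: this is NOT part of RepoDriller.
// It shows the general shape of turning per-commit data into CSV rows.
class CommitCsvSketch {

    // Quote a field only when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled (RFC 4180 style, an assumption of this sketch).
    static String escape(String field) {
        boolean needsQuoting = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped fields of one record into a single CSV row.
    static String toCsvRow(String... fields) {
        List<String> escaped = new ArrayList<>();
        for (String field : fields) {
            escaped.add(escape(field));
        }
        return String.join(",", escaped);
    }

    public static void main(String[] args) {
        // Made-up sample data: {commit hash, committer name}.
        String[][] commits = {
            {"a1b2c3", "Alice"},
            {"d4e5f6", "Bob, Jr."}
        };
        for (String[] commit : commits) {
            System.out.println(toCsvRow(commit));
        }
    }
}
```

In a real study, the framework walks the repository history and hands each commit to your code, so a row-building step like this is the only part you would normally write yourself.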

Take a look at our manual folder and our many examples, or talk to us on our mailing list.

Advice to researchers

Difficulties in mining git

You should read this paper:

FAQs

Why use an MSR framework?

There's no question that Mining Software Repositories (MSR) studies benefit from automation. The datasets are too large to analyze manually.

So the choice is whether to use an MSR framework or to write your own scripts. An MSR framework offers two benefits:

How is RepoDriller different from other MSR frameworks?

RepoDriller is a minimalist's MSR framework, a lightweight tool for flexible analysis.

Here's how it compares to some other MSR frameworks and tools:

How do I cite RepoDriller?

For now, cite the repository.

Is there a discussion forum?

You can subscribe to our mailing list: https://groups.google.com/forum/#!forum/repodriller.

How do I contribute?

Required: Git, Maven.

git clone https://github.com/mauricioaniche/repodriller.git
cd repodriller/test-repos
unzip \*.zip

Then, you can:

License

This software is licensed under the Apache 2.0 License.