chains-project / bump

A dataset of reproducible breaking dependency updates, SANER 2024 (https://doi.org/10.1109/SANER60148.2024.00024)
MIT License

BUMP Breaking Updates

Overview

Bump is a benchmark of breaking dependency updates. It can be downloaded from Zenodo. A breaking update is defined as a pair of commits for a Java project, which we designate as the pre-commit and the breaking-commit, typically performed by bots such as Dependabot and Renovate. When we build the project at the pre-commit, compilation and test execution succeed, while the build at the breaking-commit fails. Each breaking-commit is a one-line change in the Maven pom file.

If you use Bump, please cite:

@inproceedings{bump2024,
 title = {BUMP: A Benchmark of Reproducible Breaking Dependency Updates},
 booktitle = {Proceedings of SANER},
 year = {2024},
 doi = {10.1109/SANER60148.2024.00024},
 author = {Frank Reyes and Yogya Gamage and Gabriel Skoglund and Benoit Baudry and Martin Monperrus},
 url = {http://arxiv.org/pdf/2401.09906},
}

Download BUMP

All breaking updates in Bump are stored within Docker images. They can be downloaded from Zenodo.
To download the Zenodo tar file and load the associated Docker images, use the following commands:
⚠️ Warning: You need a minimum of 250 GB of free disk space to load the images.

$ wget https://zenodo.org/records/10041883/files/bump.tar.gz
$ docker load -i bump.tar.gz # this loads 1142 images
$ docker images | wc -l
1142
# running a breaking commit
# docker run ghcr.io/chains-project/breaking-updates:<tag>{-pre,-breaking}
$ docker run ghcr.io/chains-project/breaking-updates:5769bdad76925da568294cb8a40e7d4469699ac3-breaking
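
The tag pattern shown in the comment above can also be expanded programmatically. The following is a minimal sketch, not part of Bump itself, that builds both image names for a breaking-commit SHA and runs them with `docker run`. The helper names `image_name` and `reproduce` are hypothetical, and the sketch assumes that the container's exit code reflects the build outcome, which is not stated in this README.

```python
import subprocess

# Image repository taken from the commands above.
REPOSITORY = "ghcr.io/chains-project/breaking-updates"


def image_name(sha: str, variant: str) -> str:
    """Return the image reference for a commit; variant is 'pre' or 'breaking'."""
    if variant not in ("pre", "breaking"):
        raise ValueError(f"unknown variant: {variant!r}")
    return f"{REPOSITORY}:{sha}-{variant}"


def reproduce(sha: str) -> dict:
    """Run both images for one breaking update and collect their exit codes.

    Assumption: a non-zero exit code means the build in the container failed.
    """
    results = {}
    for variant in ("pre", "breaking"):
        completed = subprocess.run(
            ["docker", "run", "--rm", image_name(sha, variant)]
        )
        results[variant] = completed.returncode
    return results
```

With the images loaded, calling `reproduce("5769bdad76925da568294cb8a40e7d4469699ac3")` would run the pre-commit image and then the breaking-commit image for the example above.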

Data format

Gathered data can be found as JSON files in the data folder. There are 3 sub-folders inside the data folder.

The JSON files in our benchmark of breaking dependency updates have the following JSON data format.

{
    "url": "<github pr url>",
    "project": "<github_project>",
    "projectOrganisation": "<github_project_organisation>",
    "breakingCommit": "<sha>",
    "prAuthor": "{human|bot}",
    "preCommitAuthor": "{human|bot}",
    "breakingCommitAuthor": "{human|bot}",
    "updatedDependency": {
      "dependencyGroupID": "<group id>",
      "dependencyArtifactID": "<artifact id>",
      "previousVersion": "<label indicating the previous version of the dependency>",
      "newVersion": "<label indicating the new version of the dependency>",
      "dependencyScope": "{compile|provided|runtime|system|import}",
      "versionUpdateType": "{major|minor|patch|other}",
      "githubCompareLink": "<the github comparison link for the previous and breaking tag releases of the updated dependency if it exists>",
      "mavenSourceLinkPre": "<maven source jar link for the previous release of the updated dependency if it exists>",
      "mavenSourceLinkBreaking": "<maven source jar link for the breaking release of the updated dependency if it exists>",
      "updatedFileType": "{pom|jar}",
      "dependencySection": "{dependencies|dependencyManagement|buildPlugins|buildPluginManagement|profileBuildPlugins}"
    },
    "preCommitReproductionCommand": "<the command to compile and run tests without the breaking update commit>",
    "breakingUpdateReproductionCommand": "<the command to compile and run tests with the breaking update commit>",
    "javaVersionUsedForReproduction": "<the java version used for reproduction>",
    "failureCategory": "<the category of the root cause of the reproduction failure>"
}
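
As an illustration of how these records might be consumed, here is a minimal sketch that loads every JSON file in one folder and counts the records per `failureCategory`. The function name `count_failure_categories` and the folder path passed to it are hypothetical; only the field name comes from the format above.

```python
import json
from collections import Counter
from pathlib import Path


def count_failure_categories(folder: str) -> Counter:
    """Count records per failureCategory across all *.json files in folder."""
    counts = Counter()
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            record = json.load(handle)
        # Records without the field are grouped under "UNKNOWN" (an assumption
        # of this sketch, not a category defined by the benchmark).
        counts[record.get("failureCategory", "UNKNOWN")] += 1
    return counts
```

The returned `Counter` can be printed directly or passed to `most_common()` to list the most frequent categories first.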

Workflow

The data gathering workflow is as follows:

Tools

The BreakingUpdateMiner

In order to gather breaking dependency updates from GitHub, a tool called the BreakingUpdateMiner is available.
You can build this tool locally using mvn package with Java 17. You can then run the tool and print usage information with the command:

java -jar target/BreakingUpdateMiner.jar --help 

The BreakingUpdateReproducer

In order to perform local reproduction once potential breaking updates have been found by the miner, a tool called the BreakingUpdateReproducer is available. You can build this tool locally using mvn package with Java 17. You can then run the tool and print usage information with the command:

java -jar target/BreakingUpdateReproducer.jar --help