chains-project / bump

A dataset of reproducible breaking dependency updates, SANER 2024 (https://doi.org/10.1109/SANER60148.2024.00024)
MIT License

Add GitHub Compare Links via a manual heuristic #186

Closed LukvonStrom closed 4 months ago

LukvonStrom commented 5 months ago

Dear Bump authors,

Thank you for this great dataset! While looking through it, I identified a few repositories that were missing comparison URLs, even though the repositories are available on GitHub with enough tags. I believe most of these cases are caused by the logic in https://github.com/chains-project/bump/blob/main/src/main/java/reproducer/DependencyRefLinkFinder.java#L47-L50, which does not handle cases where the repository is named differently from the artifact or where the artifact lives in a monorepo.

As I was not able to come up with a clever heuristic on the spot, I instead hand-curated a small dataset that maps these packages to the correct GitHub repository. I applied the changes with a Python script, which I have attached as well.
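For illustration, the mapping-based approach can be sketched roughly as follows. This is not the attached script, only a minimal sketch: the artifact coordinates, repository names, and tag names are hypothetical placeholders, and it assumes that the old and new versions correspond directly to tag names in the repository. The URL pattern is GitHub's standard `compare/<base>...<head>` form.

```python
from typing import Optional

# Hypothetical hand-curated mapping from Maven coordinates to GitHub repositories.
# Two artifacts pointing at the same repository models the monorepo case.
ARTIFACT_TO_REPO = {
    "com.example:widget-core": "example-org/widgets",
    "com.example:widget-api": "example-org/widgets",
}


def compare_url(artifact: str, old_tag: str, new_tag: str) -> Optional[str]:
    """Build a GitHub compare URL, or return None if the artifact is not mapped."""
    repo = ARTIFACT_TO_REPO.get(artifact)
    if repo is None:
        return None
    # Assumption: old_tag and new_tag exist as tags in the mapped repository.
    return f"https://github.com/{repo}/compare/{old_tag}...{new_tag}"


if __name__ == "__main__":
    print(compare_url("com.example:widget-core", "v1.0.0", "v1.1.0"))
    print(compare_url("com.example:unknown", "v1.0.0", "v1.1.0"))
```

A lookup table like this trades generality for correctness: it only covers the packages that were checked by hand, but it cannot guess a wrong repository for the ones it does cover.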