This repository contains Maven repository data extracted in 2018 by Benelallam et al. The repo contains the Maven artifacts (Source artifacts), their dependencies (Target artifacts), and their release dates.
To illustrate the growth of the Maven repository over the years, we include three datasets: GAV1 (GAV1.tsv), GAV2 (GAV2.tsv), and GAV3 (GAV3.tsv). Each dataset lists Maven artifacts, their dependencies, and release dates.
To get a truer sense of the repository's growth, we also aggregate GAV by ignoring the version (GA) and by ignoring both the version and the artifact id (G). Two datasets show this. GA.tsv contains Maven artifacts (without versions), their dependencies, and release dates.
Schema_G.tsv, Schema_GA.tsv, and Schema_GAV.tsv outline the schema of each dataset.
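As a rough illustration of how the GAV, GA, and G levels relate, the sketch below collapses Maven coordinates by dropping the version and then the artifact id. It assumes coordinates are colon-separated `group:artifact:version` strings and that each collapsed pair keeps its earliest release date. The function names, the earliest-date rule, and the sample rows are hypothetical and are not taken from the repository's scripts or schema files.

```python
def to_ga(gav: str) -> str:
    """Drop the version: 'group:artifact:version' -> 'group:artifact'."""
    group, artifact, _version = gav.split(":", 2)
    return f"{group}:{artifact}"


def to_g(gav: str) -> str:
    """Drop artifact id and version: 'group:artifact:version' -> 'group'."""
    return gav.split(":", 1)[0]


def aggregate(edges, collapse):
    """Collapse (source, target, release_date) rows.

    Assumption: each distinct collapsed (source, target) pair keeps the
    earliest release date seen. Dates are ISO strings, so plain string
    comparison orders them chronologically.
    """
    earliest = {}
    for source, target, date in edges:
        key = (collapse(source), collapse(target))
        if key not in earliest or date < earliest[key]:
            earliest[key] = date
    return earliest


# Hypothetical sample rows, not real Maven data.
gav_edges = [
    ("org.example:app:1.0", "org.example:lib:1.0", "2015-01-10"),
    ("org.example:app:1.1", "org.example:lib:1.0", "2016-03-02"),
    ("org.example:app:1.1", "com.other:util:2.0", "2016-03-02"),
]
ga_edges = aggregate(gav_edges, to_ga)
g_edges = aggregate(gav_edges, to_g)
```

In this toy input, three GAV-level rows shrink to two GA-level pairs and two G-level pairs, which is why growth measured at the GA or G level can look quite different from growth measured per version.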
Steps to replicate:
Use Python version 3.*, tested with 3.12.2
Install git (any version)
Run the commands below to install Git LFS and pull the links_all and release_all datasets:
git lfs install
git lfs pull
Note: To install dependencies on macOS, you may need to use the 'brew' command.
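If `git lfs pull` has not been run, the large data files are present only as small text pointer files, and the scripts would read those instead of real data. Git LFS pointer files begin with the line `version https://git-lfs.github.com/spec/v1`, so a quick check is possible. The following is a minimal sketch; the helper names are hypothetical and the directory to scan is up to you.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path) -> bool:
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def unpulled_files(data_dir):
    """List files under data_dir that are still LFS pointers, sorted by path."""
    return sorted(
        str(path)
        for path in Path(data_dir).rglob("*")
        if path.is_file() and is_lfs_pointer(path)
    )
```

An empty list from `unpulled_files` suggests the datasets were pulled; any path it returns still needs `git lfs pull`.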
Script | Input | Output |
---|---|---|
Project/datawrang.py | TODO: Zenodo dataset | Project/data/GAV |
Script | Input | Output |
---|---|---|
Project/aggregate_ga.py , Project/aggregate_g.py | Project/data/GAV | Project/data/GA.csv , Project/data/G.csv , Project/plot |
Tests | test/test_data | test/test_aggregate_ga.py , test/test_aggregate_g.py |
Script | Input | Output |
---|---|---|
RemoveLoops_Cycles.py | TODO: Zenodo dataset | Project/data/cleaned_data |
SCC.py | Project/data/cleaned_data | Project/data/condensed_dag |
Transitive_deps.py | Project/data/condensed_dag | Project/data/transitive_dependencies |
datawrang_trans.py | Project/data/transitive_dependencies | Project/data/GAV |
Project/aggregate_ga.py , Project/aggregate_g.py | Project/data/GAV | Project/data/GA.csv , Project/data/G.csv , Project/plot |
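The first three rows of the table above describe a graph pipeline: clean the dependency graph, condense it into a DAG, then compute transitive dependencies. The sketch below walks a toy graph through those three stages. It is only an illustration under stated assumptions: cleaning is reduced to dropping self-loops, strongly connected components are found with Kosaraju's algorithm, and all function names are hypothetical. The repository's scripts may use different methods.

```python
from collections import defaultdict


def remove_self_loops(edges):
    """Drop edges whose source and target are the same node."""
    return [(s, t) for s, t in edges if s != t]


def strongly_connected_components(edges):
    """Kosaraju's algorithm; returns a dict mapping node -> component id."""
    graph, reverse, nodes = defaultdict(list), defaultdict(list), set()
    for s, t in edges:
        graph[s].append(t)
        reverse[t].append(s)
        nodes.update((s, t))

    # First pass: record nodes in order of DFS completion (iterative DFS).
    order, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all neighbours explored, node is finished
                order.append(node)
                stack.pop()

    # Second pass: sweep the reversed graph in reverse finishing order.
    component = {}
    for start in reversed(order):
        if start in component:
            continue
        component[start] = start
        stack = [start]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in component:
                    component[prev] = start
                    stack.append(prev)
    return component


def condense(edges, component):
    """Build the condensed DAG: one node per component, no internal edges."""
    return {(component[s], component[t]) for s, t in edges
            if component[s] != component[t]}


def transitive_dependencies(dag_edges):
    """Map every DAG node to the set of nodes reachable from it.

    Uses memoised recursion, so very deep graphs would need an
    iterative version to stay within Python's recursion limit.
    """
    graph = defaultdict(set)
    for s, t in dag_edges:
        graph[s].add(t)
    reach = {}

    def visit(node):
        if node not in reach:
            reach[node] = set()
            for nxt in graph.get(node, ()):
                reach[node] |= {nxt} | visit(nxt)
        return reach[node]

    for s, t in dag_edges:
        visit(s)
        visit(t)
    return reach


# Toy graph: a self-loop on "a" and a cycle between "b" and "c".
raw_edges = [("a", "a"), ("a", "b"), ("b", "c"), ("c", "b"), ("c", "d")]
edges = remove_self_loops(raw_edges)
component = strongly_connected_components(edges)
dag = condense(edges, component)
closure = transitive_dependencies(dag)
```

Condensing before computing reachability means every member of a cycle shares one entry in `closure`, which keeps the transitive step on an acyclic graph.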
Ensure Python is installed:
Open Command Prompt: press Win + R, type cmd, and press Enter.
python --version
Clone the repository:
git clone https://github.com/nkiru-ede/MavenNetworkStudy.git
Navigate to the repository directory:
cd MavenNetworkStudy\Project
Install dependencies:
pip install -r requirements.txt
or
pip3 install -r requirements.txt
Run the scripts:
python datawrang.py
python aggregate_ga.py
python aggregate_g.py
Run the tests:
cd test
python test_aggregate_ga.py
python test_aggregate_g.py
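The scripts above need to run in a fixed order, since each one reads the previous one's output. A small runner can enforce that order and stop at the first failure. This is a hedged sketch, not part of the repository: `run_pipeline` and `PIPELINE` are hypothetical names, and only the script file names come from the steps above. It uses `sys.executable` so the same interpreter that has the dependencies installed runs every step.

```python
import subprocess
import sys
from pathlib import Path

# Script order taken from the steps above.
PIPELINE = ("datawrang.py", "aggregate_ga.py", "aggregate_g.py")


def run_pipeline(project_dir, scripts=PIPELINE):
    """Run each script from project_dir in order; stop at the first failure.

    Raises FileNotFoundError if a script is missing and
    subprocess.CalledProcessError if a script exits with a non-zero status.
    Returns the list of scripts that completed.
    """
    completed = []
    for script in scripts:
        path = Path(project_dir) / script
        if not path.is_file():
            raise FileNotFoundError(f"missing script: {path}")
        # check=True turns a non-zero exit code into an exception.
        subprocess.run([sys.executable, script], cwd=project_dir, check=True)
        completed.append(script)
    return completed
```

Calling `run_pipeline("Project")` from the repository root would run the three steps in sequence, assuming the scripts are located there as the tables indicate.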