nkiru-ede / MavenNetworkStudy

add raw and normalised data #1 #2

Open nkiru-ede opened 4 months ago

nkiru-ede commented 4 months ago

Add a `/data` folder with two subfolders: `/data-raw`, containing the data from [Benelallam19], and `/normalised`, with `.tsv` files for the GAV / GA / V graphs as discussed in the paper and on the board, plus a `schema.tsv` file describing the columns. Each row defines one dependency (edge) with 10 columns, 5 for the source and 5 for the target:

  - groupId (string)
  - artifactId (string)
  - version (string)
  - release-date (date)
  - release-year (date)

Note: this is not normalised but simple to process.

Also add scripts to transform raw data into normalised data.
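A minimal sketch of what such a transformation script could look like. It assumes that the raw data yields each dependency as a pair of `groupId:artifactId:version` coordinates with an ISO-8601 release date for each side. That input layout, the function names and the `source-` / `target-` column prefixes are assumptions for illustration, not the actual format of [Benelallam19].

```python
import csv
import io

# Hypothetical column layout: the five attributes listed above, once per endpoint.
FIELDS = ["groupId", "artifactId", "version", "release-date", "release-year"]
HEADER = [f"{side}-{name}" for side in ("source", "target") for name in FIELDS]


def split_gav(coordinate, release_date):
    """Turn 'group:artifact:version' plus an ISO date into the five columns."""
    group_id, artifact_id, version = coordinate.split(":")
    # The release year is the first four characters of an ISO-8601 date.
    return [group_id, artifact_id, version, release_date, release_date[:4]]


def normalise(raw_edges, out):
    """Write one TSV row per dependency edge to the file-like object `out`.

    raw_edges: iterable of (source_gav, source_date, target_gav, target_date).
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for source_gav, source_date, target_gav, target_date in raw_edges:
        writer.writerow(
            split_gav(source_gav, source_date) + split_gav(target_gav, target_date)
        )


def normalise_to_string(raw_edges):
    """Convenience wrapper that returns the TSV content as a string."""
    buffer = io.StringIO()
    normalise(raw_edges, buffer)
    return buffer.getvalue()
```

Writing to any file-like object keeps the core logic independent of file paths, so the same function can be called from a command-line wrapper or from a test.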

jensdietrich commented 4 months ago

A few questions to clarify (in readme):

  1. Why are the scripts so large? I think they should only be a few lines of code (Python, R, ...?).
  2. How are the time stamps for GA / G calculated? From the last version or the first version? Please document. Are there use cases for having both?
  3. Please add a command to the readme showing how to reproduce this, something like "run script .. with input file .. this will produce output file .. ".
  4. Are there unit tests for those scripts?
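To make questions 2 and 4 concrete, here is a hedged sketch of one possible aggregation from GAV to GA level, with a small unit test. It is not how the repository's scripts work; it assumes rows of the form (groupId, artifactId, version, release-date) with dates as `YYYY-MM-DD` strings, and it keeps both the first and the last release date so that either choice can be documented. The function name `ga_release_dates` is hypothetical.

```python
import unittest
from collections import defaultdict


def ga_release_dates(gav_rows):
    """Collapse GAV rows to GA level, keeping first and last release date.

    gav_rows: iterable of (groupId, artifactId, version, release_date) tuples.
    Dates are assumed to be 'YYYY-MM-DD' strings, so string order is date order.
    Returns {(groupId, artifactId): (first_date, last_date)}.
    """
    dates = defaultdict(list)
    for group_id, artifact_id, _version, release_date in gav_rows:
        dates[(group_id, artifact_id)].append(release_date)
    return {ga: (min(ds), max(ds)) for ga, ds in dates.items()}


class GaReleaseDatesTest(unittest.TestCase):
    def test_first_and_last_version_dates(self):
        # Made-up rows, used only to exercise the function.
        rows = [
            ("g", "a", "2.0", "2016-01-10"),
            ("g", "a", "1.0", "2012-05-01"),
        ]
        self.assertEqual(
            ga_release_dates(rows), {("g", "a"): ("2012-05-01", "2016-01-10")}
        )
```

Assuming the code is saved in a file of your choosing, the test can be run with `python -m unittest <file>`.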