TSELab / guac-alytics

A series of tools and resources to better understand the risk profile of open source software ecosystems
Apache License 2.0
2 stars 0 forks source link

Describe a Description of the Database #12

Closed SahithiKasim closed 1 year ago

SahithiKasim commented 1 year ago

Source table

The table consists of 4 columns:\ source_id - auto-incremented primary key for source \ source_name - name of the source package\ version - version of the source\ location - location from where the source is developed.

Buildinfo table

The table consists of 8 columns:\ buildinfo_id - auto-incremented primary key for builds\ source_id - corresponding source_id from source table\ type - the architecture for which the package was built\ build_origin - the origin of the build\ build_architecture - architecture used to perform the build\ build_date - the date the package was built or last edited \ build_path - path of the directory in which the package has been built\ environment - information needed to reproduce that build.

Binary table

binary_id - auto-incremented primary key for binaries\ package - binaries or dependent packages on source\ version - version of the package\ architecture - the architecture of the package.

Dependency table

buildinfo_id - the ids from buildinfo_table\ binary_id - the corresponding binaries to the build from the binary_table.

Output table

buildinfo_id - the ids from buildinfo_table\ binary_id - the corresponding binaries to the build from the binary_table\ checksum_md5 - list of files with a md5 checksum and size for each one\ checksum_sha1 - list of files with a sha1 checksum and size for each one\ checksum_sha256 - list of files with a sha256 checksum and size for each.

Popularity table

The table consists of 10 columns:\ name - is the name of the Debian package\ date - is on which vote value is updated\ inst - is the number of people who installed this package\ vote - is the number of people who use this package regularly\ old - is the number of people who installed, but don't use this package regularly\ recent - is the number of people who upgraded this package recently\ no-files - is the number of people whose entry didn't contain enough information\ maintainer - is the one who has created the package\ inst_norm - is where inst is multiplied with amd64 normalized value\ vote_norm - is where vote is multiplied with amd64 normalized value.

Maintainers table

The table consists of 7 columns:\ name - is the name of the maintainer\ package - are the packages created by the particular maintainer\ inst - is the number of people who installed this package\ vote - is the number of people who use this package regularly\ old - is the number of people who installed, but don't use this package regularly\ recent - is the number of people who upgraded this package recently\ no-files - is the number of people whose entry didn't contain enough information.

The database has the following schema:\ source_table:

source_id (Integer) \ source_name (varchar) version (varchar) location (varchar)
1 maxima 5.42.0-1
buildinfo_table: buildinfo_id (Integer) \ source_id (Integer) \ type (varchar) build_origin (varchar) build_architecture (varchar) build_date (datetime) build_path (varchar) environment (varchar)
1 1 mips Debian mips 2018-10-05T02:46:09+00:00 /build/maxima-ffBduW/maxima-5.42.0 DEB_BUILD_OPTIONS="parallel=2" LC_ALL="POSIX" SOURCE_DATE_EPOCH="1538247291"
binary_table: binary_id (Integer) \ package (varchar) version (varchar) architecture (varchar)
1 maxima 5.42.0-1 mips
dependency_table: buildinfo_id (Integer) \ build_id (Integer) \
1 23
output_table: buildinfo_id (Integer) \ build_id (Integer) \ checksum_md5 (varchar) checksum_sha1 (varchar) checksum_sha256 (varchar)
1 1 ['c671904988b053efb0e49405ad82511e 5736524 maxima_5.42.0-1_mips.deb', '6477b5fca4f2bfc6d09aae67f1efc9ca 485988 xmaxima_5.42.0-1_mips.deb'] ['50a417d7b6642250947730b23f173b08e00425dc 5736524 maxima_5.42.0-1_mips.deb', 'f8caa8d98ecfed3717738e0f4ada053b3683e7a5 485988 xmaxima_5.42.0-1_mips.deb'] ['d67b0a3b43f8c8cad5ff9b4e4c0120ad7c50021762d9a20560ed785ea0ab2eef 5736524 maxima_5.42.0-1_mips.deb', 'e065f3f443cecc14df0cc55a1df4be547f073dc5422a159cf9d69f97f04ef01d 485988 xmaxima_5.42.0-1_mips.deb']

popularity_table:

name (text) date (date) inst (integer) vote (integer) old (integer) recent (integer) no-files (integer) maintainer (text) inst_norm (varchar) vote_norm (varchar)
dpkg 02/09/2023 209081 192500 2847 13700 34 Dpkg Developers 195484.300615492 179981.575889164
maintainer: name (text) package (varchar) inst (integer) vote (integer) old (integer) recent (integer) no-files (integer)
Debian Gnome Maintainers libvaladoc-0.56-dev 25913899 8685994 8352785 2749012 6126108
SahithiKasim commented 1 year ago

Please check this description on the database @SantiagoTorres @sbrunswi and tell me required changes!

SahithiKasim commented 1 year ago

Updated in the readme