chaoss / augur

Python library and web service for Open Source Software Health and Sustainability metrics & data collection. You can find our documentation and new contributor information easily here: https://oss-augur.readthedocs.io/en/main/ and learn more about Augur at our website https://augurlabs.io
MIT License

GSoC Idea: Develop a Shared Data Resource Focused on Dependencies, Risk and Vulnerabilities in Open Source Software #1181

Closed sgoggins closed 1 year ago

sgoggins commented 3 years ago

The aim of this work is to understand the code-based dependencies embedded within a piece of open source software. This metric explicitly excludes infrastructure-focused dependencies, such as databases and operating systems, which are defined in the “upstream infrastructure dependencies” metric.

Objectives

Implementation: The expectation is that this would be implemented using existing tools that examine package manager data for the languages in use (e.g., package.json for JavaScript npm, pyproject.toml / requirements.txt for Python, Gemfile / Gemfile.lock for Ruby). In other words, dependencies will be analyzed using the project’s dependency files.

Note: C/C++ projects generally do not have a standard language-level package manager, so there may be no single dependency file to scan. Things also get more complex with multiple languages, insofar as several language-specific dependency files will need to be scanned.
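As a rough illustration of the dependency-file approach described above, here is a minimal Python sketch that walks a repository and collects dependency names from package.json and requirements.txt. It is not Augur code: the function names (`parse_package_json`, `parse_requirements_txt`, `scan_repo`) and the `PARSERS` mapping are hypothetical, only two file types are covered, and the requirements.txt parsing is deliberately simplified (it ignores option lines and does not handle URL-style requirements).

```python
import json
import re
from pathlib import Path


def parse_package_json(text):
    """Return sorted dependency names declared in a package.json document."""
    data = json.loads(text)
    names = set()
    for key in ("dependencies", "devDependencies"):
        # A missing or null section is treated as empty.
        names.update((data.get(key) or {}).keys())
    return sorted(names)


def parse_requirements_txt(text):
    """Return sorted, lower-cased package names from a requirements.txt file.

    Simplified on purpose: comments and option lines (e.g. "-r other.txt")
    are skipped, and everything after the first version specifier, extras
    bracket, or environment marker is dropped.
    """
    names = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        if name:
            names.add(name.lower())
    return sorted(names)


# Hypothetical mapping: dependency file name -> (ecosystem label, parser).
PARSERS = {
    "package.json": ("npm", parse_package_json),
    "requirements.txt": ("pypi", parse_requirements_txt),
}


def scan_repo(root):
    """Collect dependency names per ecosystem from every known file under root."""
    found = {}
    for path in Path(root).rglob("*"):
        # Skip vendored npm packages so only the project's own files count.
        if "node_modules" in path.parts:
            continue
        if path.is_file() and path.name in PARSERS:
            ecosystem, parser = PARSERS[path.name]
            text = path.read_text(encoding="utf-8")
            found.setdefault(ecosystem, set()).update(parser(text))
    return {eco: sorted(names) for eco, names in found.items()}
```

A multi-language project simply produces one entry per ecosystem in the result of `scan_repo`, which is where the extra complexity noted above shows up: each additional language needs its own parser registered in the mapping.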

Micro-tasks and place for questions [will add link later]

Augur would be the tool that this is ultimately implemented in, although only as an accessed, shared data resource that includes information from other tools, including:

Resource Link

Microtasks

To become familiar with Augur, you can start by reading some documentation. You can find useful information in the links below.

Once you're familiar with Augur, you can have a look at the following microtasks.

Dhruv-Sachdev1313 commented 3 years ago

Hello @sgoggins, I have a small question about this project's objective. Is this project about developing software that identifies dependencies and vulnerabilities, or about using existing tools to build a database for the same? If you could explain it a little, that would be really helpful.

Dhruv-Sachdev1313 commented 3 years ago

Hello @sgoggins, do implicit dependencies here mean dependencies that are not specified directly in the package manager data? If so, how do we find out about those dependencies?

sgoggins commented 3 years ago

@Dhruv-Sachdev1313: Simply put, there is more here than we can do in a summer. We will be guided by the Risk working group on what we choose. I think playing with the Libyear implementations against some Augur data is a really good start!
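For readers new to Libyear: the metric is commonly described as the time, in years, between the release of the dependency version a project uses and the release of that dependency's latest version, summed over all dependencies. A minimal Python sketch of that arithmetic follows. It is not one of the existing Libyear implementations; the function names and the 365-day year are assumptions, and the release dates would in practice come from a package registry.

```python
from datetime import date

DAYS_PER_YEAR = 365  # assumption: a simple 365-day year


def libyears(used_release, latest_release):
    """Years between the release in use and the latest release (never negative)."""
    lag_days = (latest_release - used_release).days
    return max(lag_days, 0) / DAYS_PER_YEAR


def total_libyears(dependencies):
    """Sum the libyear lag over a mapping: name -> (used_release, latest_release)."""
    return sum(libyears(used, latest) for used, latest in dependencies.values())


# Made-up dependency names and dates, purely for illustration.
example = {
    "lib-a": (date(2021, 1, 1), date(2022, 1, 1)),  # 365 days behind
    "lib-b": (date(2022, 6, 1), date(2022, 6, 1)),  # already on the latest release
}
print(f"{total_libyears(example):.2f} libyears")
```

A project whose dependencies are all current scores zero, and the total grows as any dependency falls further behind its latest release.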

sgoggins commented 3 years ago

There will be some design discussion at the start of this task for sure!!

sgoggins commented 3 years ago

This spreadsheet of the Risk working group's minimum viable metrics is a good place to start to understand what we will build first. https://docs.google.com/spreadsheets/d/1hNCtgHkA3uwB4OrUS0gXkXpOwWsJMbIBMzek3j_E3mw/edit#gid=855606942

ADI10HERO commented 3 years ago

@sgoggins Is it like creating GitHub's Dependabot? And if it is, what differences are we looking for? Why not just use it? I hope it's not too stupid a question 😅

sgoggins commented 3 years ago

@ADI10HERO I think any implementation of any small part of a tool that evaluates risk is going to be sufficient. This spreadsheet has a list of other tools to explore:

https://docs.google.com/spreadsheets/d/1hNCtgHkA3uwB4OrUS0gXkXpOwWsJMbIBMzek3j_E3mw/edit#gid=855606942

Risk working group information is located here: https://github.com/chaoss/wg-risk

sgoggins commented 3 years ago

I am leaving this issue open for the convenience of potential GSoC Students for the time being.