christophetd opened this issue 3 months ago
Hi @christophetd that's super cool, thank you for considering contributing to OSV.dev. I'm interested in how your data set compares and contrasts with that of the OpenSSF's Malicious Packages project?
The onboarding process is a little bespoke and toilsome at the moment, but it's something we're continuously improving and streamlining with each new data source onboarded. I would like to get it to the point of being much more checklist/cookbook driven than it currently is. My detailed response here is an experiment in further process improvement and seeks to address some recent actionable feedback received from another data source onboarding. Your actionable feedback is also very welcome.
In a nutshell:
Known onboarding rough edges:
- the source{,_test}.yaml files (hopefully the example PRs plus other existing entries will make this reasonably self-evident). Specifically, FYI, the value for type corresponds with those defined at https://github.com/google/osv.dev/blob/381f459de12e181447731beee9ba4b06a513c586/osv/models.py#L783-L787

Hi @christophetd - I work on the Malicious Packages repository.
Your dataset could be included in the Malicious Packages dataset, and I would be happy to work with you on doing that.
One idea I had is that we could add a GitHub Action to your repository that walks the packages and transforms them into OSV records. We could then call that action from a workflow inside Malicious Packages to ingest the reports.
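For concreteness, a rough sketch of the per-package transformation such an action might perform. This is not an agreed design: the function, the field choices, and the placeholder MAL- ID scheme are assumptions (the Malicious Packages project uses MAL- prefixed IDs, but real ID allocation would be agreed during onboarding); only the `affected`/`package` structure comes from the OSV schema itself.

```python
import json
from datetime import datetime, timezone

def to_osv(ecosystem: str, name: str, versions: list[str]) -> dict:
    """Convert one malicious-package entry into a minimal OSV record.

    The ID below is a placeholder scheme, not the real MAL- allocation.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "schema_version": "1.6.0",
        "id": f"MAL-0000-{ecosystem.lower()}-{name}",  # placeholder ID scheme
        "modified": now,
        "summary": f"Malicious code in {name} ({ecosystem})",
        "affected": [
            {
                "package": {"ecosystem": ecosystem, "name": name},
                "versions": versions,
            }
        ],
    }

# Example: one hypothetical npm entry from the dataset.
record = to_osv("npm", "example-package", ["1.0.0"])
print(json.dumps(record, indent=2))
```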
Let me know what you think.
Caleb
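The ingestion side of Caleb's idea might look roughly like the workflow below. Every name here is an assumption: the action path, schedule, and steps are illustrative placeholders, not anything that exists in either repository.

```yaml
# Hypothetical workflow inside the Malicious Packages repo -- all names
# and paths are assumptions, sketching the "call the action" step only.
name: Ingest Datadog malicious package reports
on:
  schedule:
    - cron: "0 6 * * *" # daily
  workflow_dispatch: {}
jobs:
  ingest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Transform dataset entries into OSV reports
        # Hypothetical composite action exported by the dataset repo.
        uses: DataDog/malicious-software-packages-dataset/.github/actions/export-osv@main
      - name: Open a PR with the new reports
        run: echo "PR creation step omitted -- sketch only"
```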
@calebbrown @christophetd have you connected off-issue to determine the most appropriate integration point for these advisories?
Hi there!
I'm part of Datadog, where we publish and maintain a dataset of human-confirmed malicious npm and PyPI packages: https://github.com/DataDog/malicious-software-packages-dataset/
Eager to discuss what the best way to bring that into OSV.dev would be.