fkie-cad / nvd-json-data-feeds

Community reconstruction of the legacy JSON NVD Data Feeds. This project uses and redistributes data from the NVD API but is neither endorsed nor certified by the NVD.

Question: release auto-update script source code #1

Open henrirosten opened 1 year ago

henrirosten commented 1 year ago

Would it be possible to share the script you use to query NVD API and auto-update the data on this repository?

rhelmke commented 1 year ago

At some point we can certainly release the code that auto-updates this repo. However, we would like to make it a little bit more robust first :-). Please give us some time for that.

rhelmke commented 1 year ago

I'm glad to re-open this issue once we're ready!

henrirosten commented 1 year ago

Thanks, and also thanks for the detailed explanation in https://github.com/fkie-cad/nvd-json-data-feeds/issues/2.

The reason I'm asking for the script source is to be able to re-generate the data locally, or perhaps mirror the data in another repository, in case you stop caching the NVD data on this repo for one reason or another.

Anyway, thanks for the great work you are doing here!

yann-morin-1998 commented 7 months ago

@rhelmke Sorry to chime in on this old issue: has there been any progress in making the mirroring scripts available?

Additionally, we would also be very much interested in the scripts that aggregate the individual CVEs into the daily feeds. Those feeds are short-lived: they are replaced daily, so reproducible builds are not possible.

For example, in our project, Buildroot, we are tracking a regression in our tooling that occurred around 2024-02-07. Unfortunately, we can't validate when the issue actually happened, because the CVE feed from that day is no longer available. Since this is a git tree, we could easily reconstruct the feed from the individual entries, if the scripts were available.

rhelmke commented 7 months ago

Hello @yann-morin-1998,

unfortunately there is still no release timeline for the software stack driving this repo. We are currently occupied with a lot of other projects and wouldn't be able to allocate the required resources at this time - I'm sorry.

Either way, the packaging code wouldn't help you guys reconstruct any daily packages from this repo's history. This is because the code also uses our OpenSearch backend and is not a standalone script.

However, we certainly see and understand the issues you guys are facing in terms of reproducibility. In fact, the idea of providing companion scripts that can reliably reconstruct historical packages has been around for a while. I assume that we could provide such a script and verify its correctness in a manageable amount of time. Give us maybe a week and we'll see what we can do :-).

rhelmke commented 7 months ago

On another note, we also thought about not wiping historical release packages, but decided against it because it would create a lot of duplicate data to host. It is also unnecessary, considering that a companion script could use the git history for reconstruction.

yann-morin-1998 commented 7 months ago

@rhelmke Thanks for the feedback, and thanks for considering our request. That's very much appreciated.

How open are you to contributions? I have been playing with a little Python script here that walks the individual CVE directories in the repository and generates reproducible yearly archives. It's working now, and just needs a little eye-candy. Shall I open a PR?
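
For readers following along, here is a minimal sketch of the general idea (walk the CVE files, group them by year, write a byte-for-byte reproducible archive). It is not the script mentioned above. The file-name pattern `CVE-<year>-<id>.json`, the function names `group_by_year` and `write_archive`, and the choice of an uncompressed tar with fixed metadata are all assumptions made for illustration.

```python
import io
import re
import tarfile
from collections import defaultdict
from pathlib import Path

# Assumed file-name pattern for individual entries, e.g. CVE-2024-0001.json.
CVE_FILE = re.compile(r"^CVE-(\d{4})-\d+\.json$")


def group_by_year(repo_root):
    """Walk repo_root recursively and group CVE JSON files by year."""
    groups = defaultdict(list)
    for path in Path(repo_root).rglob("CVE-*.json"):
        match = CVE_FILE.match(path.name)
        if match:
            groups[match.group(1)].append(path)
    # Sort each group so the archive member order never depends on
    # the order in which the file system lists directory entries.
    return {year: sorted(paths, key=lambda p: p.name)
            for year, paths in groups.items()}


def write_archive(paths, out_path, mtime=0):
    """Write an uncompressed tar whose bytes depend only on file contents."""
    with tarfile.open(out_path, "w", format=tarfile.GNU_FORMAT) as tar:
        for path in paths:
            data = path.read_bytes()
            info = tarfile.TarInfo(name=path.name)
            info.size = len(data)
            # Fixed metadata: no timestamps, owners, or modes from the host.
            info.mtime = mtime
            info.mode = 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
```

Calling `write_archive(group_by_year(".")["2024"], "CVE-2024.tar")` twice on the same checkout should yield identical files, because nothing in the archive depends on timestamps, ownership, or directory listing order. Compression is left out on purpose, since compressor versions and settings can change the output bytes.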

rhelmke commented 7 months ago

@yann-morin-1998 thank you very much! We're of course open to PRs and would really appreciate it. But I'm not quite sure if this is the right repository for it. I thought about a tool that would take an ISO date as input, automatically clone the repo, check out the correct commit, and then recreate the packages. It might be better to move the script to another repo so that it does not have to sit in the working file tree.

Let me talk to a colleague of mine, he might be able to quickly throw together a python package for that. I'll (or he'll) let you know how he'd like to proceed :-).
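
The "ISO date in, matching commit checked out" step of the tool described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the maintainers' tool: it assumes the repository is already cloned into `repo_dir`, that the branch is called `main`, and that "the correct commit" means the last commit on or before the end of that day in UTC. The helper names are hypothetical. `git rev-list -1 --before=<date> <branch>` is a standard git invocation that prints the newest commit older than the given date.

```python
import subprocess
from datetime import date


def end_of_day_utc(iso_date):
    """Validate an ISO date (YYYY-MM-DD) and return a cutoff git can parse."""
    day = date.fromisoformat(iso_date)  # raises ValueError on bad input
    # Assumption: the cutoff is the last second of that day in UTC.
    return day.strftime("%Y-%m-%d") + " 23:59:59 +0000"


def rev_list_command(iso_date, branch="main"):
    """Build the git command that finds the last commit before the cutoff."""
    return ["git", "rev-list", "-1",
            "--before=" + end_of_day_utc(iso_date), branch]


def checkout_for_date(repo_dir, iso_date, branch="main"):
    """Check out the last commit on `branch` made on or before iso_date."""
    result = subprocess.run(rev_list_command(iso_date, branch),
                            cwd=repo_dir, check=True,
                            capture_output=True, text=True)
    commit = result.stdout.strip()
    if not commit:
        raise LookupError("no commit found on or before " + iso_date)
    # Detached HEAD: we only want to read the tree, not to work on a branch.
    subprocess.run(["git", "checkout", "--detach", commit],
                   cwd=repo_dir, check=True)
    return commit
```

With the working tree at that commit, the packages for that day could then be regenerated from the individual entries. Note that `--before` filters on the commit date, so the result is only as accurate as the timestamps recorded in the repository's history.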