CPAN-Security / cpan-advisory-database

CVEs on triage #8

Closed garu closed 9 months ago

garu commented 9 months ago

This PR merges the results of running cpansec-admin cvescan on this repository. It adds all CVE entries from the beginning of time to today that match "perl" or "cpan".

It also updates the README and CONTRIBUTING files to reflect the current pipeline.

Note for reviewers: you don't have to review each YAML file on triage; they were created automatically. Just look at the README and CONTRIBUTING, then at the general structure of the YAML files, and let me know if you feel something should be changed.

stigtsp commented 9 months ago

Looks like a good tool! Some questions:

garu commented 9 months ago

@stigtsp both "triage" and "publish" commands take an optional list of items to work on instead of the whole folder. So you can do something like cpansec-admin triage 2020* and it should DWYW.

About ids, today the cpansec-admin publish code would create an id using the current year, but I think you raise a fairly good point. Ideally, since we are a new index, we should date issues from here onwards. But because we want to document past issues, I think we should, as an exception, date them to their CVE dates. An alternative would be to make an explicit "legacy" id like CPANSEC-LEGACY-00001, but honestly I think it just adds more overhead to the whole thing.

I'll bring this to the chat so more people can chime in if they want.

sjn commented 9 months ago

Yeah, a CPAN "legacy" ID makes little sense. Instead, make sure that the ID includes a date component (e.g. year, as you suggest, or maybe include week number), and a serial number. This means past issues just get an appropriate date as part of their ID, and aren't tainted by the "legacy" concept (which isn't a useful/constructive term to bring into such a topic anyway).

stigtsp commented 9 months ago

It's also an option to "start from now" imho. Only assigning identifiers to vulnerabilities that are unfixed or new. Assigning CPANSEC-2023-XXXX to a vuln that was fixed in 1999 seems weird.

sjn commented 9 months ago

CPANSEC-1999-XXXX for 1999 bugs. The real question is whether or not old vulnerabilities like these are still out in the wild somewhere. While people work on figuring this out (code archaeology) in their legacy code base, the bugs still have to be identified uniquely.
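A minimal sketch of what a year-scoped id could look like in practice, assuming the year is taken from the CVE id and the serial is a zero-padded counter within that year. The function name next_cpansec_id, the four-digit padding, and the regular expressions are illustrative assumptions, not what cpansec-admin actually does:

```python
import re

# Assumed shapes: CVE-YYYY-NNNN... and CPANSEC-YYYY-NNNN (hypothetical format).
CVE_RE = re.compile(r"^CVE-(\d{4})-\d{4,}$")
CPANSEC_RE = re.compile(r"^CPANSEC-(\d{4})-(\d{4,})$")


def next_cpansec_id(cve_id, existing_ids):
    """Return the next free CPANSEC id for the year embedded in cve_id."""
    match = CVE_RE.match(cve_id)
    if match is None:
        raise ValueError(f"not a CVE id: {cve_id!r}")
    year = match.group(1)

    # Collect the serial numbers already used within the same year group.
    serials = []
    for existing in existing_ids:
        m = CPANSEC_RE.match(existing)
        if m and m.group(1) == year:
            serials.append(int(m.group(2)))

    # Bump only this year's counter; other years are unaffected.
    return f"CPANSEC-{year}-{max(serials, default=0) + 1:04d}"
```

With this scheme a bug from 1999 gets a 1999 id no matter when it is added to the database, and each year has its own serial sequence.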

stigtsp commented 9 months ago

One option is to drop the year from the identifier, like Arch does for example. This avoids the problem in my opinion.

Like: CPANSEC-12345

garu commented 9 months ago

Right, but no year makes it harder to reason over them. Quick, how many reports did we have in 2020?

We'd need to open every file, then parse and count a custom key (the current OSV format only has a "published" key, which refers to our own report's publish date, so it will be 2023 onwards). We can't even slice the numbers (e.g. open only from CPANSEC-1034 to CPANSEC-3992), because tomorrow we'll find another issue first reported in 2013 and it will get a bigger id. By keeping the year, we'd just bump the "2013-NNNN" group.

Makes sense?
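To illustrate the point, here is a rough sketch of answering "how many reports per year" from the ids alone, without opening any file. It assumes ids shaped like CPANSEC-YYYY-NNNN; the function name reports_per_year and the regular expression are hypothetical:

```python
import re
from collections import Counter

# Assumed id shape: CPANSEC-YYYY-NNNN (hypothetical format).
ID_RE = re.compile(r"^CPANSEC-(\d{4})-\d+$")


def reports_per_year(ids):
    """Count reports per year using only the year embedded in each id."""
    counts = Counter()
    for report_id in ids:
        match = ID_RE.match(report_id)
        if match:  # ids without a year component are skipped
            counts[int(match.group(1))] += 1
    return counts


ids = ["CPANSEC-2020-0001", "CPANSEC-2020-0002", "CPANSEC-2013-0001"]
print(reports_per_year(ids)[2020])
```

With a yearless id such as CPANSEC-12345 there is nothing to match on, so the same question would need a pass over every file's contents.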

giterlizzi commented 9 months ago

It would be useful to include the year the CVE (or vulnerability) was detected, or the year present in the CVE prefix. The CPANSA-year prefix would then be followed by a progressive number.