Closed stigtsp closed 8 months ago
It's possible, but let me think about how that might work.
Can you explain why you are keeping old IDs around so I can understand your use case? Are you ignoring some IDs because they aren't being fixed, for example?
My specific use case is providing references to a record when requesting a CVE number. Since the CPANSA identifier is not stable, I need to provide links to the repo pinned by commit hash like this:
https://github.com/briandfoy/cpan-security-advisory/blob/9374f98bef51e1ae887f293234050551c079776f/cpansa/CPANSA-Plack-Middleware-XSRFBlock.yml#L2-L15
There would be other similar use cases as well, such as someone referring to a vulnerability in their workflow or discussion.
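For illustration, a pinned link like the one above can be assembled from its parts. This is only a sketch: the helper name `github_permalink` and its parameters are hypothetical, and the URL layout is simply the one visible in the link quoted above.

```python
def github_permalink(owner, repo, commit, path, first_line, last_line=None):
    """Build a link to a file at a fixed commit, optionally with a line range.

    Hypothetical helper: it only reproduces the URL layout of the link
    quoted in this thread.
    """
    url = f"https://github.com/{owner}/{repo}/blob/{commit}/{path}#L{first_line}"
    # Append the end of the range only when it differs from the start.
    if last_line is not None and last_line != first_line:
        url += f"-L{last_line}"
    return url


link = github_permalink(
    "briandfoy",
    "cpan-security-advisory",
    "9374f98bef51e1ae887f293234050551c079776f",
    "cpansa/CPANSA-Plack-Middleware-XSRFBlock.yml",
    2,
    15,
)
print(link)
```

Because the commit hash is part of the URL, the link keeps pointing at the same lines even if the file is later edited, renamed, or its `id` changes.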
Why would you refer to this repo to request a CVE instead of using the primary sources in the links portion of a report in this repo? I'd say the same thing about anyone who wants to reference a vulnerability: use the primary sources instead of something someone has to click through to get to the primary sources.
That doesn't mean that we can't have a stable ID for other reasons though.
> Why would you refer to this repo to request a CVE instead of using the primary sources..
I'm also using the primary sources, but CPANSA entries are still relevant references for context.
For the time being, we could have an optional "moved_from" key to indicate that the reference was renamed.
But I don't see a reason to rename the CPANSA entry just because we learn the CVE number later.
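A rough sketch of how a consumer might use such a key to resolve an old reference. Everything here is an assumption for illustration: the `moved_from` key is only proposed above, its shape (a single string or a list of strings) is a guess, and the record dictionaries stand in for the parsed report files.

```python
def build_alias_map(records):
    """Map every former id listed under the optional moved_from key to the current id."""
    aliases = {}
    for record in records:
        moved_from = record.get("moved_from") or []
        # Accept either a single string or a list of strings (assumed shapes).
        if isinstance(moved_from, str):
            moved_from = [moved_from]
        for old_id in moved_from:
            aliases[old_id] = record["id"]
    return aliases


def resolve_id(report_id, records):
    """Return the current id for report_id, or None if it is unknown."""
    if any(record["id"] == report_id for record in records):
        return report_id
    return build_alias_map(records).get(report_id)


# Hypothetical record, using the rename discussed in this issue as the example.
renamed_records = [
    {
        "id": "CPANSA-CPAN-2023-31484",
        "moved_from": ["CPANSA-CPAN-2023-01"],
        "cves": ["CVE-2023-31484"],
    },
]

print(resolve_id("CPANSA-CPAN-2023-01", renamed_records))
```

With an alias map like this, a reference made before the CVE was assigned can still be followed to the renamed record.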
I imagine in the future that the CPAN Security people may assign a reference number, and we may want to use those.
> I imagine in the future that the CPAN Security people may assign a reference number, and we may want to use those.
Our initial plan was to set up our own CPANSEC-XXXX-YYYY namespace, but I've suggested using CVEs instead to avoid the complexity.
I've been thinking about this. Even if there were a stable ID, how would you link to that through this repo in a way that's different than what you are doing right now? The files are still going to change and possibly move around, especially considering that the data format right now needs some updates. I think you are back to the same problem.
I've been renaming the IDs with the CVE because I think that's the most useful thing for the end user. They see the report ID in the output, and it's the info that gives them the shortest path to what they probably want to look at or use in other communication. It doesn't have to be that way, but I think it's better than carrying through some other identifier special to this database.
And, at the moment, I'm thinking through if anything should depend on the primary key of a record. Typically that's not good practice, but I haven't come up with an answer for that.
Suggestion: let's look at this from a database normalization perspective. How about having another table/file with CVEs as primary keys, each referring to a CPANSA id? That way we get a stable primary key for the CPANSA id space (which we'll need in order to predictably be able to refer to them), and we make it easier to find these if one only has a CVE id available...
> [..] Even if there were a stable ID, how would you link to that through this repo in a way that's different than what you are doing right now? The files are still going to change and possibly move around, especially considering that the data format right now needs some updates. I think you are back to the same problem.
As long as there is some stable primary key for the records, (hyper) linking to them would be a solvable separate problem, imho.
There are already some tools that consume the data, like Test::CVE, and I think setting up a website that provides stable links to these vulns might also be a good idea. We can probably do that.
I find it useful to be able to reference e.g. "CPANSA-Foobar-1" instead of "that Foobar vulnerability from yesterday affecting $something" in discussions as well.
Note: We have a feed set up on https://cpan-security.github.io/cpansa-feed/ that Test::CVE uses.
> As long as there is some stable primary key for the records, (hyper) linking to them would be a solvable separate problem, imho.
"would be"? It sounds like there is no stable way that you are using right now. That was what I was asking about. This started because you did not like deep links into a particular commit where the URL had to refer to line numbers. I don't see how you'd be able to deep link into this repository without doing that, even if there were stable identifiers.
> I find it useful to be able to reference e.g. "CPANSA-Foobar-1" instead of "that Foobar vulnerability from yesterday affecting $something" in discussions as well.
I think this is a bit specious. If I were in a discussion and wanted to refer to something, my first thought would not be to refer to the relative date. I'd say something like "the buffer overflow in foo() (CPANSA-Foobar-1)". Maybe that ID changes, but it's still easy for someone to find the report based on its content. It's also a much better reference when everyone has forgotten the ID, no matter what it is, and has forgotten what the report was about.
> Suggestion: let's look at this from a database normalization perspective. How about having another table/file with CVEs as primary keys, each referring to a CPANSA id? That way we get a stable primary key for the CPANSA id space (which we'll need in order to predictably be able to refer to them), and we make it easier to find these if one only has a CVE id available...
If we were going to stick to stable IDs, there would still be a `cves` array in each record. Something else can invert that, but that would be a generated table. The best way to go here is simple files the humans can ingest and work on. Higher order uses come from digesting these files as a data source.
We don't need to make this more complicated because that makes it harder for casual contributions.
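As a minimal sketch of that kind of generated table, the `cves` arrays can be inverted into a CVE-to-report lookup. The `id` and `cves` keys are the ones named in this thread; the in-memory list of records and the function name `build_cve_index` are assumptions for illustration, not the repository's actual tooling.

```python
from collections import defaultdict

# Hypothetical parsed records; the real data lives in the report files.
sample_records = [
    {"id": "CPANSA-CPAN-2023-01", "cves": ["CVE-2023-31484"]},
    {"id": "CPANSA-Foobar-1", "cves": []},
]


def build_cve_index(records):
    """Invert the per-record cves arrays into a CVE -> list of report ids table."""
    index = defaultdict(list)
    for record in records:
        # A record may have no cves key, or an empty list, before a CVE is assigned.
        for cve in record.get("cves") or []:
            index[cve].append(record["id"])
    return dict(index)


cve_index = build_cve_index(sample_records)
print(cve_index)
```

The hand-edited files stay the single source of truth, and the lookup can be regenerated whenever they change.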
Going forward, such as in #137, we will not update any of the values in the `id` keys.
However, it is still not a good idea to link into this repo, as in https://nvd.nist.gov/vuln/detail/CVE-2022-48623. As I read through CVEs, I'm inundated with many, many links, and most of them are just references to each other with almost no original content. Don't add to that problem with another set of circular references.
> Going forward, such as in #137, we will not update any of the values in the `id` keys.
Thanks! :grinning:
> However, it is still not a good idea to link into this repo, as in https://nvd.nist.gov/vuln/detail/CVE-2022-48623. As I read through CVEs, I'm inundated with many, many links, and most of them are just references to each other with almost no original content. Don't add to that problem with another set of circular references.
Something for your consideration:
1) As long as CPANSA has any kind of authoritative information in it (which I assume it has, and will continue to have in the foreseeable future), then external entities need to be able to refer to it directly, using stable URLs and stable IDs. With your promise above, I think we're good! :partying_face:
2) Linking to a specific commit in the CPANSA repo is actually meaningful when there's a need to refer to specific line number(s) in a file as they were at a specific time (assuming the file may later have lines added above the ones being referred to). So I guess this should be expected?
3) Circular references aren't really an issue here, are they? As long as we have stable endpoints to link to and we're clear about what their meaning and purpose are, that ought to be enough to let users get an idea of whether the vulnerability is relevant for them?
In any case, thanks! :-D
Example: in https://github.com/briandfoy/cpan-security-advisory/commit/3b63755027f2016a7d3c96b8c1a8270f3c97d17d, `CPANSA-CPAN-2023-01` was changed to `CPANSA-CPAN-2023-31484` after `CVE-2023-31484` was added to the `cves` list. This makes it hard to provide a stable reference to the record pre-CVE. Is it possible to keep the previous reference as an alias, or not change the id?