Open hogo6002 opened 3 months ago
As someone that does not have a view of the whole osv.dev machinery, I think the main question is, why would you want to combine CVE entries?
why would you want to combine CVE entries?
The initial reason was that Alpine SecDB uses CVE IDs to publish their vulnerabilities. So, we created a tool to combine their data into our existing CVEs, as long as they use the same vulnerability ID. Then, we wanted to add Debian security tracker data into our OSV to better support container scanning, so we continued to combine everything into our CVE entries to keep it consistent, as the Debian security tracker also uses CVE IDs.
However, the problem now is that we continue to combine more and more data into a single record, increasing the difficulty of our maintenance. So, we are considering some better changes to this approach.
From your first message, the thing that is not clear to me is, if you are not going to combine CVEs anymore, then why would you need to still add the prefixes (e.g. Debian-CVE-2024-0001)? Aren't those completely separate buckets?
For OSV, ID prefixes must be unique, and be associated with a home database. Debian security tracker, Alpine etc didn't have a unique ID before, so we were combining them into a single CVE record. If we don't combine them, then they would need their own prefix.
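A minimal sketch of what "their own prefix" could look like in practice. The prefix table, the function names, and the input shape are assumptions made for illustration; they are not OSV's actual configuration or tooling.

```python
# Hypothetical mapping from home database to ID prefix (illustrative values only).
SOURCE_PREFIXES = {"debian": "DEBIAN", "alpine": "ALPINE"}


def source_specific_id(source: str, cve_id: str) -> str:
    """Build a per-source record ID such as DEBIAN-CVE-2024-0001."""
    return f"{SOURCE_PREFIXES[source]}-{cve_id}"


def split_by_source(cve_id: str, entries: list) -> dict:
    """Turn (source, affected) pairs that used to be merged into one CVE
    record into one record per home database, keyed by the prefixed ID."""
    records = {}
    for source, affected in entries:
        record_id = source_specific_id(source, cve_id)
        record = records.setdefault(record_id, {"id": record_id, "affected": []})
        record["affected"].append(affected)
    return records


if __name__ == "__main__":
    entries = [
        ("debian", {"package": {"ecosystem": "Debian", "name": "curl"}}),
        ("alpine", {"package": {"ecosystem": "Alpine", "name": "curl"}}),
    ]
    for record_id in sorted(split_by_source("CVE-2024-0001", entries)):
        print(record_id)
```

Each home database ends up owning its own record, so nothing needs to be merged into the shared CVE entry.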
Thanks for clarifying it. Would the prefix also be necessary for the filename itself, or could we have a filename CVE-2024-0001.json and inside of it id = "UBTU-CVE-2024-0001"?
We do prefer the filename to match the ID name to keep it consistent for other ecosystems. Are there any concerns for Ubuntu in changing the filename?
No concerns, just confirming what needs to be done as it will be a mass operation.
The only thing to confirm is which should be the prefix: Ubuntu-, UBUNTU-, or UBTU-?
Thanks for confirming! We prefer UBUNTU- to align with our current naming schema. Same as https://osv.dev/vulnerability/CURL-CVE-2024-7264
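Since this will be a mass operation, the rename could be sketched roughly as below. The helper names and the directory layout in the example are hypothetical; only the UBUNTU- prefix and the "filename matches ID" preference come from this thread.

```python
from pathlib import PurePosixPath

PREFIX = "UBUNTU-"


def prefixed_id(cve_id: str) -> str:
    """Add the UBUNTU- prefix unless the ID already carries it."""
    return cve_id if cve_id.startswith(PREFIX) else PREFIX + cve_id


def record_filename(record_id: str) -> str:
    """Name the file after the record ID, as preferred for consistency."""
    return f"{record_id}.json"


def filename_matches_id(path: str, record: dict) -> bool:
    """Check that a record's file name (without .json) equals its id."""
    return PurePosixPath(path).stem == record["id"]


if __name__ == "__main__":
    new_id = prefixed_id("CVE-2024-0001")
    print(new_id, record_filename(new_id))
    # Hypothetical path, used only to exercise the check.
    print(filename_matches_id("osv/cve/2024/CVE-2024-0001.json", {"id": new_id}))
```

A check like `filename_matches_id` can be run over the whole tree after the rename to catch files that were missed.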
Hey @dodys, I have two questions about Ubuntu CVE OSV records (ubuntu-security-notices/osv).
First, I'm curious why some affected packages lack the ubuntu_priority field. This field is from the vulnerability level and should be consistent across all affected packages. But I've noticed that some packages are missing this field while others within the same vulnerability have it. Example: https://osv.dev/vulnerability/CVE-2024-21131
Second question is about the status field. I can't find it in the Ubuntu OSV records, not sure if it's an important one. But I'm wondering if there's a filtering mechanism that determines which records Ubuntu sends to OSV (like not sending all records with a status of DNE or not-affected?).
Indeed, for now we only added the ecosystem_specific part (including ubuntu_priority) when the CVE was fixed. In the next batch of updates to our scripts, we plan to add ecosystem_specific to all affected packages.
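Until that update lands, a consumer could spot the gaps with a small check like the one below. It is only a sketch: it assumes ubuntu_priority sits under each affected entry's ecosystem_specific object, as described above, and the function name and sample record are made up for illustration.

```python
def packages_missing_priority(record: dict) -> list:
    """Return the names of affected packages whose ecosystem_specific
    block is absent or lacks ubuntu_priority."""
    missing = []
    for affected in record.get("affected", []):
        specific = affected.get("ecosystem_specific") or {}
        if "ubuntu_priority" not in specific:
            missing.append(affected.get("package", {}).get("name", "<unknown>"))
    return missing


if __name__ == "__main__":
    # Invented sample record: one fixed package, one not yet fixed.
    sample = {
        "id": "UBUNTU-CVE-2024-0001",
        "affected": [
            {"package": {"name": "pkg-a"},
             "ecosystem_specific": {"ubuntu_priority": "medium"}},
            {"package": {"name": "pkg-b"}},
        ],
    }
    print(packages_missing_priority(sample))
```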
Since OSV has the affected field, which is only for packages that are vulnerable to such "vulnerability" (CVE) or "set of vulnerabilities" (USN), for CVEs we only list the ones that are affected or are still pending investigation. That means that the following statuses are always included:
- needed -> means vulnerable/affected
- needs-triage -> means that it needs investigation, therefore we consider it vulnerable until investigation is done
- deferred -> means that we tried to investigate but we still don't have enough information to do so, therefore it is vulnerable until confirmed
- pending -> means that the package is vulnerable and a fix is on its way
- released -> means that we patched it in such release
Some statuses are more complex and will depend on other factors:
- not-affected (<version>) -> if the <version> exists in that specific Ubuntu release, we consider it similar to released; if the <version> is an old one, e.g. the first version to contain the fix and not vulnerable anymore, then we don't add it since that Ubuntu release is not vulnerable at all.
- ignored -> it will only be included for releases that are still "alive", as ignored can be used for different reasons.
And some statuses we just don't include:
- DNE -> package doesn't exist in such Ubuntu release
- active -> even though it is in the docs we still don't use that status.
Does that help clarify?
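The rules above can be sketched as a single predicate. This is not Ubuntu's actual script; the function name and both keyword arguments are hypothetical, and a not-affected status without a version present in the release is assumed here to be left out.

```python
ALWAYS_LISTED = {"needed", "needs-triage", "deferred", "pending", "released"}
NEVER_LISTED = {"DNE", "active"}


def listed_in_affected(status: str, *, version_in_release: bool = False,
                       release_alive: bool = True) -> bool:
    """Decide whether a package/release pair goes into the OSV affected list.

    version_in_release: for not-affected, whether the noted <version>
        exists in that specific Ubuntu release (then treated like released).
    release_alive: for ignored, whether the Ubuntu release is still alive.
    """
    if status in ALWAYS_LISTED:
        return True
    if status in NEVER_LISTED:
        return False
    if status == "not-affected":
        return version_in_release
    if status == "ignored":
        return release_alive
    raise ValueError(f"unknown status: {status!r}")


if __name__ == "__main__":
    print(listed_in_affected("needs-triage"))
    print(listed_in_affected("not-affected", version_in_release=True))
    print(listed_in_affected("ignored", release_alive=False))
```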
Thanks Eduardo! It's very detailed and helpful!
Having all the ubuntu_priority info should be sufficient for us, especially considering you've already helped us filter out DNE and some not-affected ones.
I forgot to ask one thing, Holly.
For the USN data, should we change the related field to refer to the UBUNTU-CVE-... or should we keep it as CVE-...?
Are you referring to data from USN notes such as USN-6969-1? I think for this type of data, we can just keep its original USN- prefix, no change needed. Just for Ubuntu CVEs, we can add UBUNTU-CVE- as its prefix.
Misread the question (didn't see the related word :sweat_smile:), please refer to Oliver's answer.
Hmm, I think @dodys is asking about what to put inside related for USN- entries? I think under our schema definitions, it would make the most sense to:
- put UBUNTU-CVE- in aliases, if it exists, given that they're both Ubuntu-specific advisories talking about the same vulnerability.
- put CVE- in related.
Thanks both!
Holly and Oliver, I've just pushed the CVE changes to our GitHub repo. Let me know if it now looks as expected and whether the sync to osv.dev can be re-enabled.
Hey @dodys, thanks for the Ubuntu CVE updates! It looks good overall. One thing to mention is the aliases/related part.
I noticed that CVE- IDs are added as aliases to UBUNTU-CVE-, but Oliver previously mentioned we wanted to put UBUNTU-CVE- in USN- aliases. Doing both would make OSV treat the CVE- and USN- as aliases of each other because aliases are considered symmetric and transitive. This is not exactly what we want (see https://github.com/google/osv.dev/issues/2374#issuecomment-2216588085 for explanation). We're trying to avoid making Linux distro vulnerabilities aliases of CVEs.
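To see why doing both links the CVE- and USN- records, here is a small sketch that groups IDs under symmetric, transitive aliasing using a union-find structure. It is not OSV's implementation, and the record IDs in the example are placeholders.

```python
def alias_groups(records: list) -> list:
    """Group record IDs into alias sets, treating each alias link as
    symmetric and transitive (a plain union-find over IDs)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for record in records:
        find(record["id"])
        for alias in record.get("aliases", []):
            union(record["id"], alias)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


if __name__ == "__main__":
    # Placeholder IDs: the UBUNTU-CVE- record aliases the CVE, and the
    # USN- record aliases the UBUNTU-CVE- record.
    records = [
        {"id": "UBUNTU-CVE-2024-0001", "aliases": ["CVE-2024-0001"]},
        {"id": "USN-0000-1", "aliases": ["UBUNTU-CVE-2024-0001"]},
    ]
    print(alias_groups(records))
```

All three IDs land in one group, so the USN- record is effectively treated as an alias of the CVE- record even though neither lists the other directly.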
In the current OSV USN- data, the aliases field is still empty, so it will not cause the problem. But putting CVE- in UBUNTU-CVE- aliases is not ideal because we think CVE- and UBUNTU-CVE- are not always the same thing and shouldn't be aliases to each other ("Aliases should not be used to refer to vulnerabilities in packages upstream or downstream in a software supply chain from the given OSV record’s affected package(s).").
I talked with Oliver, and we think there is no perfect solution at this point. But to keep things consistent with other Linux distros, we think it's best to put all vulnerabilities in related instead of aliases. We might find a better way to show the relationship between Linux distro CVEs and CVEs later on.
With all that said, we don't need to change the current USN data as it doesn't contain any aliases. The only change we probably want is to move CVE- to the related field of UBUNTU-CVE-.
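A rough sketch of that change applied to one record, assuming aliases and related are plain lists of ID strings as in the OSV schema. The function name and the sample record are hypothetical.

```python
import copy


def move_cve_aliases_to_related(record: dict) -> dict:
    """Return a copy of the record with CVE-* IDs moved from aliases to
    related. Other aliases stay put; an emptied aliases list is dropped."""
    updated = copy.deepcopy(record)
    aliases = updated.get("aliases", [])
    cve_ids = [a for a in aliases if a.startswith("CVE-")]
    remaining = [a for a in aliases if not a.startswith("CVE-")]

    related = list(updated.get("related", []))
    for cve_id in cve_ids:
        if cve_id not in related:
            related.append(cve_id)

    if remaining:
        updated["aliases"] = remaining
    else:
        updated.pop("aliases", None)
    if related:
        updated["related"] = related
    return updated


if __name__ == "__main__":
    before = {"id": "UBUNTU-CVE-2024-0001", "aliases": ["CVE-2024-0001"]}
    print(move_cve_aliases_to_related(before))
```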
ok, CVE files fixed and PR open: https://github.com/ossf/osv-schema/pull/265
Let me know if anything else is needed.
CVE files look good to me. Thanks for fixing them!
Thanks for verifying, Holly! Let me know if you need anything else from my side. I believe re-enabling the sync is just a matter of addressing the ignore pattern: https://github.com/google/osv.dev/blob/master/source.yaml#L303C20-L303C36
But I will leave it to you if you want to do some testing before re-enabling it, so we don't create another issue for you.
Hey Eduardo, we have re-enabled UBUNTU-CVE- on OSV.dev, and the Linux query timeout issue has been fixed. But I am also wondering why some CVEs don't have a purl field while others do?
Hi Holly, sorry, but I am confused, both examples you shared have purl. Was it a wrong paste?
Oh yeah, sorry. That was a wrong paste. Example without purl: https://github.com/canonical/ubuntu-security-notices/blob/main/osv/cve/2023/UBUNTU-CVE-2023-0120.json, https://github.com/canonical/ubuntu-security-notices/blob/main/osv/cve/2023/UBUNTU-CVE-2023-1296.json, https://osv.dev/vulnerability/UBUNTU-CVE-2023-46726, https://osv.dev/vulnerability/UBUNTU-CVE-2024-8354 (it seems all come from Ubuntu:Pro?)
We are only adding purl for vulnerabilities that were fixed. We could change that, if that's what is preferred :)
Yes, please. We also prefer the purl to not contain affected range information, which means it shouldn't include a version. Our purpose in using the purl field is to find the package itself. For example, in other ecosystems, you might see a purl like pkg:deb/debian/chromium?arch=source.
So for Ubuntu, regardless of whether a fix is available, the package purl should only contain the package information (distribution information is fine to keep).
Original one: pkg:deb/ubuntu/py7zr@0.11.3+dfsg-4ubuntu0.1?arch=src?distro=jammy
Expected one: pkg:deb/ubuntu/py7zr?arch=src or pkg:deb/ubuntu/py7zr?arch=src?distro=jammy
@michaelkedar do you have anything to add?
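Dropping the version could be sketched with a small string rewrite like the one below. The function name is hypothetical, and this is a plain regex rather than a full purl parser: it removes the @version component and leaves everything else as written.

```python
import re

# The version component starts at "@" and runs up to the qualifiers ("?")
# or the subpath ("#"), whichever comes first.
_VERSION_COMPONENT = re.compile(r"@[^?#]*")


def strip_purl_version(purl: str) -> str:
    """Remove the @version part of a purl, keeping type, namespace,
    name, and any qualifiers unchanged."""
    return _VERSION_COMPONENT.sub("", purl, count=1)


if __name__ == "__main__":
    original = "pkg:deb/ubuntu/py7zr@0.11.3+dfsg-4ubuntu0.1?arch=src?distro=jammy"
    print(strip_purl_version(original))
```

The package URL specification separates qualifiers with &, while the examples in this thread use a second ?; the sketch does not touch the qualifier part either way.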
Thanks Holly, I've made the changes and refreshed the data.
Thanks Eduardo! It all looks good now!
This issue has not had any activity for 60 days and will be automatically closed in two weeks
See https://github.com/google/osv.dev/blob/master/CONTRIBUTING.md for how to contribute a PR if you're interested in helping out.
If I understand correctly, in order to call this issue "done", we'd have to render combine-to-osv obsolete, which would mean:
nvd-cve-osv
supports doing today)
The current OSV structure combines vulnerability data from different sources (e.g., NVD, Alpine, Debian) into a single CVE entry based on shared CVE IDs. This approach leads to overly large CVE entries that are difficult to maintain. With Ubuntu also publishing its security tracker data to OSV.dev using CVE IDs, this issue will likely worsen.
We probably need a better solution for managing CVE entries. One idea is to add a source-specific prefix or suffix to the CVE ID, creating separate entries for each data source. For example, Alpine-CVE-2024-0001 and Debian-CVE-2024-0001 would be displayed as two distinct records on OSV.dev.