rubysec / ruby-advisory-db

A database of vulnerable Ruby Gems
https://rubysec.com

Shared vulnerability format for vulnerabilities #471

Open oliverchang opened 3 years ago

oliverchang commented 3 years ago

Hi!

I work on the Google Open Source Vulnerabilities project, and we've been working with the Go security team and other vulnerability database maintainers to try to arrive at a common JSON-based format for describing basic metadata about vulnerabilities and links between them. The goal is to make it easier for language teams to publish vulnerabilities in a machine-readable format and to make it easier for security researchers and other cross-language projects to analyze and correlate that vulnerability information.

To that end, @rsc and I have prepared a doc describing a proposed format which can be found at https://tinyurl.com/vuln-json. Feedback is most welcome, preferably as comments on the doc.

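The specific questions we are trying to answer right now are:

Is this an effort you are interested in participating in?
Does this format contain what your database would want to know from other databases?
Would you be willing to make your database available in this format?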

Thanks very much for any and all feedback!

reedloden commented 3 years ago

Hi @oliverchang!

Thanks for reaching out.

Is this an effort you are interested in participating in?

Yup, we'd be interested, especially if it means a reduction in duplicate work that is occurring today with so many different databases with different formats.

Does this format contain what your database would want to know from other databases?

I've quickly added some comments. Will re-read it sometime soon and add more comments as needed.

Would you be willing to make your database available in this format?

We're not tied to our current format (it has plenty of things I would like to improve/change), so if there's a generally-accepted standard, we're happy to transition. Though, we do have tooling that uses our format, so would need to figure out how to keep both formats for a while during the transition stage.

@postmodern any thoughts here?

postmodern commented 3 years ago

First off, thank you for reaching out to our little project.

Does this format contain what your database would want to know from other databases?

This database is primarily concerned with advisories affecting RubyGems and Ruby implementations. I did not see any examples of a RubyGem or Ruby advisory in the document, so I cannot give specific feedback. Based on the examples that I did see, the metadata does contain affected semver-version ranges which we could convert to RubyGems style version ranges; {"type": "SEMVER", "introduced": "1.15.0", "fixed": "1.15.17"} would become affected: ">= 1.15.0, < 1.15.17".
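
A minimal Ruby sketch of the conversion described above, assuming a range hash with only `introduced` and `fixed` keys as in the quoted example. The helper name `osv_range_to_requirement` is hypothetical and is not part of bundler-audit or of any OSV tooling.

```ruby
require 'rubygems'

# Hypothetical helper: turns one OSV-style range hash into a
# RubyGems-style requirement string such as ">= 1.15.0, < 1.15.17".
def osv_range_to_requirement(range)
  parts = []
  # An "introduced" of "0" means "every version so far", so no lower bound.
  parts << ">= #{range['introduced']}" if range['introduced'] && range['introduced'] != '0'
  parts << "< #{range['fixed']}" if range['fixed']
  parts.join(', ')
end

range = { 'type' => 'SEMVER', 'introduced' => '1.15.0', 'fixed' => '1.15.17' }
requirement_string = osv_range_to_requirement(range)
puts requirement_string

# Gem::Requirement takes one constraint per argument, so split on the comma.
requirement = Gem::Requirement.new(*requirement_string.split(', '))
puts requirement.satisfied_by?(Gem::Version.new('1.15.3'))  # true: inside the affected range
puts requirement.satisfied_by?(Gem::Version.new('1.15.17')) # false: the fixed release
```

Note that this yields an affected range, while the YAML advisories here record patched and unaffected ranges, so a real importer would still have to invert the result.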

Would you be willing to make your database available in this format?

Considering that bundler-audit uses the current YAML schema, we must be careful about changing the schema. Also, while JSON is machine readable, it is not exactly human readable/writable, especially for data containing multi-paragraph text blocks. Plain YAML, by contrast, excels at representing multi-paragraph text blocks in a human readable/writable manner.

I am also hesitant about rushing to adopt or become dependent on a new format that might not become widely adopted or suddenly becomes unsupported. It also seems like a lot of work that we the maintainers would have to take on ourselves, especially for a new format which hasn't reached v1.0 yet and is still under active development. There would need to be a clear benefit to ruby-advisory-db and bundler-audit, in order to justify taking on the work of migrating to a new advisory format.

That said, the Google OSV team is more than welcome to help contribute back to ruby-advisory-db, by either submitting advisories or code contributions to help integrate the two advisory schemas.

oliverchang commented 3 years ago

Thank you very much for the feedback!

I added an example of a RubyGems advisory into the document.

Re changing the schema format, we're not necessarily asking everybody to use this format as the source of truth, only that databases export this JSON format so people can easily build ecosystem-agnostic tooling to consume vulnerabilities in all open source package ecosystems.

Regarding the format spec being widely adopted, the Go vulnerability database and the Rust vulnerability database are committed to exporting this format, and we're very close to helping the Python community launch its own vulnerability database. We also have a DB of vulns found by our OSS-Fuzz service. We're hoping to finalise this as a V1 spec very soon after gathering more feedback.

That said, using this schema as the source of truth (as an alternate YAML encoding for easy hand-editing as you mention) has its benefits wrt common tooling for database maintenance. For Python, we've built out a workflow to minimize human triage and auto-generate entries in this format from other feeds such as NVD. This could be generalized to Ruby if this is something that might help.

More generally, the OSV team is certainly very happy to help contribute changes here to help integrate this schema and with advisory generation as well.

Thanks!

oliverchang commented 3 years ago

Hi, just following up again here!

Would you be open to supporting this format as an export of the current ruby DB? We'd be happy to help with the implementation of this, as well as collaborating to find ways to make discovering new advisories easier (as we've done for Python). Perhaps there are other ways we could help fund the existing maintenance effort too -- happy to discuss more over email.

We also just released a blog post with some more background info: https://security.googleblog.com/2021/06/announcing-unified-vulnerability-schema.html

postmodern commented 3 years ago

I see one major hurdle based on the conversation in the JSON Schema Google doc: listing the impacted versions. The ruby-advisory-db uses RubyGems' version constraint syntax (ex: ~> 1.2.3, <= 1.4.0) to specify patched or unaffected version ranges. The JSON Schema prefers listing the impacted versions explicitly for RubyGems, due to RubyGems not fully supporting the SemVer version format. This would require querying the rubygems.org API to find all versions that do not fall within the patched or unaffected version ranges.
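
To make the enumeration step concrete, here is a rough Ruby sketch, assuming the full release list has already been fetched. The release numbers below are made up, and `affected_versions` is a hypothetical helper name. It treats a version as affected when it satisfies none of the patched or unaffected constraints.

```ruby
require 'rubygems'

# Hypothetical helper: returns the releases that match neither a patched
# nor an unaffected constraint, i.e. the explicitly impacted versions.
def affected_versions(all_versions, patched: [], unaffected: [])
  requirements = (patched + unaffected).map do |constraint|
    # One advisory entry may hold several comma-separated constraints.
    Gem::Requirement.new(*constraint.split(', '))
  end

  all_versions.reject do |number|
    version = Gem::Version.new(number)
    requirements.any? { |requirement| requirement.satisfied_by?(version) }
  end
end

# Made-up release list; in practice it would come from rubygems.org.
releases = %w[1.1.0 1.2.2 1.2.3 1.2.9 1.3.0 1.4.0 1.4.1]
p affected_versions(releases,
                    patched: ['~> 1.2.3', '>= 1.4.0'],
                    unaffected: ['< 1.2.0'])
# prints ["1.2.2", "1.3.0"]
```

The list would have to be regenerated whenever a new release of the gem is published, which is part of why explicit version lists are more work to maintain than ranges.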

While I am not 100% clear about the requirements of how to export the ruby-advisory-db, I suspect it could be done via GitHub Actions and a rake task. Perhaps a rake task(s) could be defined to convert every YAML advisory into the JSON Schema equivalent, commit the files, and push to an alternate branch which could be git pulled? This would also solve the problem of how to re-export the ruby-advisory-db anytime an existing advisory is amended or corrected. If this is something you or someone on your team would like to take on, definitely brainstorm here in the comments and maybe start working on a PR.
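
As a starting point for that brainstorming, the core of such a rake task might look like the Ruby sketch below. The field mapping is an assumption rather than an agreed design, `advisory_yaml_to_osv` is a hypothetical name, and the sample advisory uses illustrative values (only the gem name and CVE number come from this thread). Version ranges are deliberately left out, since they need separate handling.

```ruby
require 'date'
require 'json'
require 'yaml'

# Hypothetical converter: maps a few ruby-advisory-db YAML fields onto
# OSV-style keys. Version ranges are not converted here.
def advisory_yaml_to_osv(yaml_text)
  # Unquoted dates in the YAML load as Date objects, so Date must be permitted.
  advisory = YAML.safe_load(yaml_text, permitted_classes: [Date])

  {
    'id'         => "CVE-#{advisory.fetch('cve')}",
    'summary'    => advisory['title'],
    'details'    => advisory['description'],
    'published'  => advisory.fetch('date').strftime('%Y-%m-%dT00:00:00Z'),
    'references' => [{ 'type' => 'WEB', 'url' => advisory['url'] }],
    'affected'   => [
      { 'package' => { 'ecosystem' => 'RubyGems', 'name' => advisory.fetch('gem') } }
    ]
  }
end

# Illustrative advisory; date, URL, title and description are placeholders.
sample = <<~ADVISORY
  gem: rails-html-sanitizer
  cve: 2022-32209
  date: 2022-06-09
  url: https://example.com/advisory
  title: Example title
  description: |
    First paragraph of the description.

    Second paragraph of the description.
ADVISORY

puts JSON.pretty_generate(advisory_yaml_to_osv(sample))
```

A rake task could call this for every advisory file and write the results to a separate branch, leaving the YAML files as the source of truth for bundler-audit.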

As far as funding, I will let the other @rubysec/core members chime in on how that could be handled. I think funding the work of advisory curation is the most sustainable option, as it is thankless and tiresome work that never stops. While we have set up automated tasks to sync the DB against GHSA, as well as linting checks and schema validation tests, some degree of human verification is still necessary as the vulnerability information is sometimes incomplete or inaccurate.

G-Rath commented 2 years ago

@oliverchang @postmodern just wanted to follow up on this because we're looking into adopting osv-detector at my work. While the GitHub Advisory Database seems to keep pretty up to date with the Ruby advisories, there are occasions where this database has an advisory that doesn't make it into their database for a couple of weeks, so having this database in OSV format would be great as it could be consumed directly.

@postmodern your comment about the one major hurdle you saw being the versions seems to predate when the specification got support for "events" which I think means converting the advisories here into an OSV format might be very doable now since e.g. ~> 1.2.3, <= 1.4.0 can be represented as (assuming that represents the vulnerable versions):

{
  "id": "...",
  "affected": [
    {
      "package": { "ecosystem": "RubyGems", "name": "..." },
      "ranges": [
        {
          "type": "ECOSYSTEM",
          "events": [
            { "introduced": "1.2.3" },
            { "last_affected": "1.4.0" }
          ]
        }
      ]
    }
  ]
}

and for the latest rails-html-sanitizer vulnerability:

{
  "id": "CVE-2022-32209",
  "affected": [
    {
      "package": { "ecosystem": "RubyGems", "name": "rails-html-sanitizer" },
      "ranges": [
        {
          "type": "ECOSYSTEM",
          "events": [
            { "introduced": "0" },
            { "fixed": "1.4.3" }
          ]
        }
      ]
    }
  ]
}
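
Going the other way, from the patched_versions entries stored in this database to OSV events, could be sketched in Ruby as follows. This is only a rough illustration under several assumptions: entries use just the `>=` and `~>` operators, their ranges do not overlap, any open-ended `>=` entry is the highest one, and unaffected_versions are ignored. `patched_to_events` is a hypothetical name.

```ruby
require 'rubygems'

# Hypothetical helper: converts patched_versions made of ">= X" and "~> X"
# entries into OSV-style events. Any other constraint raises ArgumentError.
def patched_to_events(patched_versions)
  # Each entry becomes [first_patched, first_version_past_the_patched_range];
  # nil marks a patched range with no upper end.
  intervals = patched_versions.map do |constraint|
    operator, number = constraint.split(' ', 2)
    version = Gem::Version.new(number)
    case operator
    when '>=' then [version, nil]
    when '~>' then [version, version.bump] # "~> 1.2.3" stops before 1.3
    else raise ArgumentError, "unsupported constraint: #{constraint}"
    end
  end

  # Everything is affected from "0" until the first patched range starts.
  events = [{ 'introduced' => '0' }]
  intervals.sort_by(&:first).each do |fixed, reintroduced|
    events << { 'fixed' => fixed.to_s }
    events << { 'introduced' => reintroduced.to_s } if reintroduced
  end
  events
end

p patched_to_events(['>= 1.4.3'])
# prints [{"introduced"=>"0"}, {"fixed"=>"1.4.3"}] (hash formatting varies by Ruby version)
p patched_to_events(['~> 1.2.3', '>= 1.4.0'])
```

The first call reproduces the events in the rails-html-sanitizer example above. The second yields two affected ranges, because a `~>` entry only patches one release series and later series remain affected until the next patched entry.
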

reedloden commented 2 years ago

@oliverchang This was brought up to me in Vegas last week during a discussion about dependency management. We'd love to accept Google's assistance in modernizing this database and making it so there's not so much manual work. What would be good next steps for this?

oliverchang commented 2 years ago

Thanks for getting back! What form of assistance are you looking for?

Some potential ways we could help here:

oliverchang commented 12 months ago

Hi folks! Resurrecting an old thread here to see if there's still interest?

The OSV schema is currently being exported by many different ecosystems and DBs: https://github.com/ossf/osv-schema/blob/main/README.md. It would be amazing if we could add Ruby to this list.

reedloden commented 12 months ago

@oliverchang Either of the proposed options is fine. We're not tied to the current schema, and it actually causes headaches when we try to import things from other databases. The current team is just too busy to do this transition ourselves, but we're happy to help others where we can.

postmodern commented 12 months ago

@reedloden bundler-audit, which is used by many in CI, relies on the current schema. Changing it in the master branch would break bundler-audit. There's also the issue that OSV does not allow RubyGems-style version constraints, which bundler-audit uses to match versions.

@oliverchang if you just want a mechanism to export the ruby-advisory-db to OSV format, feel free to submit a PR. I am currently busy with other OSS projects and feel like OSV support should be handled by someone on the OSV team.