aboutcode-org / vulnerablecode

A free and open vulnerabilities database and the packages they impact. And the tools to aggregate and correlate these vulnerabilities. Sponsored by NLnet https://nlnet.nl/project/vulnerabilitydatabase/ for https://www.aboutcode.org/ Chat at https://gitter.im/aboutcode-org/vulnerablecode Docs at https://vulnerablecode.readthedocs.org/
https://public.vulnerablecode.io
Apache License 2.0

API design and use case scenarios for VulnerableCode #217

Closed sbs2001 closed 4 years ago

sbs2001 commented 4 years ago

I want to collect all the ways VulnerableCode will be used, so we can start coming up with a design for the REST API.

haikoschol commented 4 years ago

For a list (or tree) of dependencies, expressed as Package URLs, return a list of vulnerabilities (and pointers to further information) that affect those dependencies.

sbs2001 commented 4 years ago

For a list (or tree) of dependencies, expressed as Package URLs, return a list of vulnerabilities (and pointers to further information) that affect those dependencies.

That makes sense; an SBOM utility like CycloneDX could query for vulnerability status directly.

My question is: should a single request contain a list of purls, or should each purl require its own request? Either approach gets the job done in this scenario.

sbs2001 commented 4 years ago

Add something like cve-search, except it won't be limited to CVEs: other advisory IDs such as RHSA should be searchable too. See https://github.com/nexB/vulnerablecode/issues/9.

haikoschol commented 4 years ago

For a list (or tree) of dependencies, expressed as Package URLs, return a list of vulnerabilities (and pointers to further information) that affect those dependencies.

My question is should a single request contain a list of purls or require multiple requests each containing a distinct purl ? Both of these approaches will get the job done in this scenario.

I don't have a strong opinion on this and would go with whatever is easier to implement.
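To make the two options concrete, here is a minimal sketch of what each request shape could look like on the client side. The payload keys `purls` and `purl` are assumptions for illustration only; nothing in this thread fixes them.

```python
def batch_request(purls):
    """One request carrying the whole dependency list (hypothetical payload shape)."""
    return {"purls": list(purls)}


def single_requests(purls):
    """One request per purl (hypothetical payload shape)."""
    return [{"purl": purl} for purl in purls]


dependencies = ["pkg:generic/openssl@1.1.0j", "pkg:cargo/http@0.1.11"]

print(batch_request(dependencies))         # a single payload
print(len(single_requests(dependencies)))  # one payload per dependency
```

The batch form saves round trips for large dependency trees, while the per-purl form keeps each response small and simple to cache.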

sbs2001 commented 4 years ago

@haikoschol @pombredanne how does this look ?

The endpoint would be api/packages/.

The parameters would be the regular purl fields.

For example, the response for api/packages/?name=openssl&version=1.1.0j&type=generic would be:


    [
        {
            "package_url": "pkg:generic/openssl@1.1.0j",
            "vulnerable_to": [
                {
                    "vulnerability": "CVE-2019-1552"
                },
                {
                    "vulnerability": "CVE-2019-1563"
                },
                {
                    "vulnerability": "CVE-2019-1547"
                },
                {
                    "vulnerability": "CVE-2019-1543"
                }
            ],
            "resolved_to": [
                {
                    "vulnerability": "CVE-2018-0734"
                },
                {
                    "vulnerability": "CVE-2018-0735"
                }
            ]
        }
    ]

We could add a hyperlink to the view which has details of the vulnerability
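As a rough illustration of the query form above, a client could assemble the URL from whichever purl fields it has. This is only a sketch: `package_query` is a hypothetical helper, and the relative `api/packages/` path is taken from the proposal as-is.

```python
from urllib.parse import urlencode


def package_query(base, **fields):
    """Build a query URL from purl fields, skipping any that are unset."""
    params = {key: value for key, value in fields.items() if value is not None}
    return f"{base}?{urlencode(params)}"


url = package_query(
    "api/packages/", name="openssl", version="1.1.0j", type="generic", namespace=None
)
print(url)  # api/packages/?name=openssl&version=1.1.0j&type=generic
```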

pombredanne commented 4 years ago

I am sure we can find better names than "resolved_to" and "vulnerable_to"; somehow they do not feel self-evident... Also, why not expose first and foremost a purl rather than the purl fields?

sbs2001 commented 4 years ago

@pombredanne

I am sure we can find a better name than "resolved_to" and "vulnerable_to"

Consider them as placeholders for now.

Also why not exposing first and mostly a purl rather than the purl fields?

That works too, are there any pros and cons of using one over another?

pombredanne commented 4 years ago

That works too, are there any pros and cons of using one over another?

a purl has been designed as a single simple string... exposing its parts is fine for query but that would not be the primary entry point IMHO
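A simplified sketch of how the single string maps onto the individual fields is shown below. It is not a spec-complete parser (it skips the normalization rules of the purl specification), and `parse_purl` is a hypothetical name; a real client would more likely rely on an existing purl library such as packageurl-python.

```python
from urllib.parse import parse_qsl, unquote


def parse_purl(purl):
    """Split a purl string into its parts (simplified, no spec normalization)."""
    scheme, sep, rest = purl.partition(":")
    if scheme != "pkg" or not sep:
        raise ValueError(f"not a purl: {purl!r}")
    rest, _, subpath = rest.partition("#")
    rest, _, query = rest.partition("?")
    version = ""
    if "@" in rest:
        rest, _, version = rest.rpartition("@")
    segments = rest.strip("/").split("/")
    if len(segments) < 2:
        raise ValueError("a purl needs at least a type and a name")
    type_, *namespace, name = segments
    return {
        "type": type_.lower(),
        "namespace": unquote("/".join(namespace)) or None,
        "name": unquote(name),
        "version": unquote(version) or None,
        "qualifiers": dict(parse_qsl(query)) or None,
        "subpath": subpath or None,
    }


print(parse_purl("pkg:generic/openssl@1.1.0j"))
```

With something like this on the server side, the API could accept the purl string as the primary entry point and still filter on the individual fields internally.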

sbs2001 commented 4 years ago

@pombredanne

I'm torn between the 2 options of using string vs fields for a purl.

exposing its parts is fine for query but that would not be the primary entry point IMHO

How about wrapping the fields like this

    [
        {
            "package": {
                "name": "openssl",
                "type": "generic",
                "namespace": null,
                "version": "1.1.0j",
                "package_url": "pkg:generic/openssl@1.1.0j",
                ...
            },
            "vulnerable_to": [
                {
                    "vulnerability": "CVE-2019-1552"
                },
                {
                    "vulnerability": "CVE-2019-1563"
                },
                {
                    "vulnerability": "CVE-2019-1547"
                },
                {
                    "vulnerability": "CVE-2019-1543"
                }
            ],
            "resolved_to": [
                {
                    "vulnerability": "CVE-2018-0734"
                },
                {
                    "vulnerability": "CVE-2018-0735"
                }
            ]
        }
    ]
sbs2001 commented 4 years ago

I'm pretty sure the endpoint is wrong. We are getting packages + vulnerabilities , the url should denote that

sbs2001 commented 4 years ago

The vulnerability-package search would give response like

[
    {
        "package": {
            "name": "foo",
            "type": "deb",
            "version": "bar",
            "namespace": "ubuntu",
            "packageurl": "pkg:deb/ubuntu/foo@bar"
        },
        "vulnerabilities": [
            {
                "cve_id": "CVE-2002-1568",
                "href": "/vulnerabilities/1"
            },
            {
                "cve_id": "CVE-2002-1569",
                "href": "/vulnerabilities/2"
            }
        ]
    },
    {
        "package": {
            "name": "tar",
            "type": "deb",
            "version": "ball",
            "namespace": null,
            "packageurl": "pkg:deb/foo@bar"
        },
        "vulnerabilities": [
            {
                "cve_id": "CVE-2002-1578",
                "href": "/vulnerabilities/17"
            },
            {
                "cve_id": "CVE-2002-1579",
                "href": "/vulnerabilities/3"
            }
        ]
    }
]
pombredanne commented 4 years ago

@sbs2001 you wrote

Shivam Sandbhor: you don't like the name "href" or the concept of providing link to other api call to gather more details of the resource ?

I prefer full URLs over providing an href.

And as for providing a reference vs. embedding the results: you typically need both, but it is simpler (for the user) to have things embedded for a start (unless you have too much data, which would only make sense through another call).
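A small sketch of the "full URLs" idea: turning the relative `href` values from the earlier sample into absolute URLs. The base URL is an assumption (a local development address), and `with_full_urls` is a hypothetical helper.

```python
from urllib.parse import urljoin

# Assumed base; a real deployment would use its own host.
BASE_URL = "http://127.0.0.1:8000/api/"


def with_full_urls(vulnerabilities, base_url=BASE_URL):
    """Replace each relative 'href' with an absolute 'url'."""
    return [
        {"cve_id": vuln["cve_id"], "url": urljoin(base_url, vuln["href"].lstrip("/"))}
        for vuln in vulnerabilities
    ]


sample = [{"cve_id": "CVE-2002-1568", "href": "/vulnerabilities/1"}]
print(with_full_urls(sample))
```

A full URL can be followed by the client as-is, without it having to know how the API is mounted.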

tdruez commented 4 years ago

For the Package endpoint api/packages/, instead of (from https://github.com/nexB/vulnerablecode/issues/217#issuecomment-672781490):

[
    {
        "package": {...},
        "vulnerabilities": [...],
    }
]

A better data structure would be:

[
    {
          "name": "tar",
          "type": "deb",
          "version": "ball",
          "namespace": "",
          "packageurl": "pkg:deb/foo@bar",
          "vulnerabilities": [
              {
                  "cve_id": "CVE-2002-1578",
                  "url": "/vulnerabilities/17"
              },
              {
                  "cve_id": "CVE-2002-1579",
                  "url": "/vulnerabilities/3"
              }
          ]
    }
]

And for the api/vulnerabilities/ endpoint:

[
    {
        "cve_id": "CVE-2002-1579",
        "url": "/vulnerabilities/3",
        "packages": [
            {
                "name": "tar",
                "type": "deb",
                "version": "ball",
                "namespace": "",
                "packageurl": "pkg:deb/foo@bar"
            }
        ]
    }
]
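The two structures above are inverses of each other. As a sketch (using plain dicts rather than the project's actual models, and a hypothetical helper name), the vulnerability-centric view can be derived from the package-centric one like this:

```python
from collections import defaultdict


def packages_by_vulnerability(packages):
    """Invert package -> vulnerabilities into vulnerability -> packages."""
    index = defaultdict(list)
    for package in packages:
        # Copy every package field except the nested vulnerability list.
        summary = {k: v for k, v in package.items() if k != "vulnerabilities"}
        for vuln in package.get("vulnerabilities", []):
            index[(vuln["cve_id"], vuln["url"])].append(summary)
    return [
        {"cve_id": cve_id, "url": url, "packages": pkgs}
        for (cve_id, url), pkgs in index.items()
    ]


packages = [
    {
        "name": "tar",
        "type": "deb",
        "version": "ball",
        "namespace": "",
        "packageurl": "pkg:deb/foo@bar",
        "vulnerabilities": [
            {"cve_id": "CVE-2002-1578", "url": "/vulnerabilities/17"},
            {"cve_id": "CVE-2002-1579", "url": "/vulnerabilities/3"},
        ],
    }
]
print(packages_by_vulnerability(packages))
```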
sbs2001 commented 4 years ago

@tdruez thanks, that works too and is probably easier to implement. The issue is that, for a (package, vulnerability) relation, either the package is vulnerable to that vulnerability or some previous version of the same package was vulnerable to it.

So we want to group all vulnerabilities for a package accordingly. I didn't consider adding the resolved vulnerabilities in https://github.com/nexB/vulnerablecode/issues/217#issuecomment-672781490 but we need to consider them. Something like https://github.com/nexB/vulnerablecode/issues/217#issuecomment-667113604 and your design would do the trick.

We need to group the packages accordingly in the api/vulnerabilities endpoint too.

sbs2001 commented 4 years ago

We definitely need to distinguish between the vulnerabilities that impact a package and those that do not.

Having said that, using @tdruez's data structure from https://github.com/nexB/vulnerablecode/issues/217#issuecomment-672863607 in a slightly different manner works.

[
    {
          "name": "tar",
          "type": "deb",
          "version": "ball",
          "namespace": "",
          "packageurl": "pkg:deb/foo@bar",
          "resolved_vulnerabilities": [
              {
                  "cve_id": "CVE-2002-1578",
                  "url": "/vulnerabilities/17"
              },
              {
                  "cve_id": "CVE-2002-1579",
                  "url": "/vulnerabilities/3"
              }
          ],
         "impactful_vulnerabilities": [
              {
                  "cve_id": "CVE-2005-1578",
                  "url": "/vulnerabilities/47"
              },
              {
                  "cve_id": "CVE-2005-1579",
                  "url": "/vulnerabilities/36"
              }
          ]
    }
]

@pombredanne please suggest a good name instead of impactful_vulnerabilities and resolved_vulnerabilities
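A minimal sketch of the grouping described here, assuming each (package, vulnerability) relation carries a boolean flag saying whether the package is still vulnerable. The flag name `is_vulnerable` and the helper are hypothetical; the output keys are the placeholder names from the sample above.

```python
def group_vulnerabilities(relations):
    """Group one package's relations into resolved and impactful lists.

    `relations` is an iterable of (vulnerability_id, is_vulnerable) pairs.
    """
    grouped = {"resolved_vulnerabilities": [], "impactful_vulnerabilities": []}
    for vulnerability_id, is_vulnerable in relations:
        key = "impactful_vulnerabilities" if is_vulnerable else "resolved_vulnerabilities"
        grouped[key].append({"cve_id": vulnerability_id})
    return grouped


relations = [
    ("CVE-2002-1578", False),
    ("CVE-2002-1579", False),
    ("CVE-2005-1578", True),
]
print(group_vulnerabilities(relations))
```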

pombredanne commented 4 years ago

I would use purl rather than packageurl to be consistent with ScanCode. The order of the purl fields matters too: always go from left to right and include them all: type, namespace, name, version, qualifiers, subpath. As for the vulnerabilities, here are some suggestions (think also about all the qualifiers you give to bugs and issues).

Also, I thought we had agreed that we would use a CVE id when available and use a single field for these IDs... but we cannot call that field cve_id. Use either id or vulnerability_id, or either of these two with "identifier" (and we should be consistent throughout the code, of course).
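One way to apply the ordering advice is to serialize every package through a single helper that always emits all purl fields in the same left-to-right order. This is only a sketch: the helper name is hypothetical, and placing `purl` first is an assumption, since the thread does not say where it should go.

```python
# Field order as listed above: left to right, always all of them.
PURL_FIELD_ORDER = ("type", "namespace", "name", "version", "qualifiers", "subpath")


def ordered_package(package):
    """Return the package with 'purl' first, then every purl field in order."""
    ordered = {"purl": package.get("purl")}
    for field in PURL_FIELD_ORDER:
        ordered[field] = package.get(field)  # missing fields become None
    return ordered


print(ordered_package({
    "name": "foo",
    "version": "bar",
    "type": "deb",
    "namespace": "ubuntu",
    "purl": "pkg:deb/ubuntu/foo@bar",
}))
```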

sbs2001 commented 4 years ago

As for naming I would use resolved/unresolved_vulnerabilities.

About cve_id :

We'll use vulnerability_id; the value will be either a VULCODE or a CVE, as discussed in https://github.com/nexB/vulnerablecode/issues/232, depending upon whether a CVE is available.
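The rule is simple enough to state in a few lines. The sketch below is illustrative only: the function name is hypothetical, and the VULCODE value is a made-up placeholder, since the actual identifier format is discussed in the linked issue and not here.

```python
def pick_vulnerability_id(cve_id, vulcode_id):
    """Prefer the CVE id when one is available, else fall back to the VULCODE id."""
    return cve_id if cve_id else vulcode_id


# "VULCODE-EXAMPLE" is a placeholder, not a real identifier format.
print(pick_vulnerability_id("CVE-2016-10933", "VULCODE-EXAMPLE"))
print(pick_vulnerability_id(None, "VULCODE-EXAMPLE"))
```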

sbs2001 commented 4 years ago

Here's how it's gonna be.

/api/vulnerabilities/{pk} -> Gives details of vulnerability object with pk = pk . Response looks like https://github.com/nexB/vulnerablecode/issues/217#issuecomment-674384355 with minor name differences as suggested by https://github.com/nexB/vulnerablecode/issues/217#issuecomment-674439115

/api/packages/{pk} -> Gives details of package object with pk = pk . Response looks like

    {
        "id": 1,
        "references": [
            {
                "source": "",
                "reference_id": "",
                "url": "https://github.com/RustAudio/rust-portaudio/issues/144"
            },
            {
                "source": "",
                "reference_id": "RUSTSEC-2016-0003",
                "url": "https://rustsec.org/advisories/RUSTSEC-2016-0003.html"
            }
        ],
        "resolved_packages": [
            {
                "purl": "pkg:cargo/sodiumoxide@0.0.15",
                "url": "http://127.0.0.1:8000/api/packages/131"
            }
        ],
        "unresolved_packages": [
            {
                "purl": "pkg:cargo/http@0.1.11",
                "url": "http://127.0.0.1:8000/api/packages/170"
            }
        ],
        "cve_id": "CVE-2016-10933",
        "summary": "The build script in the portaudio crate will attempt to download via HTTP\nthe portaudio source and build it.\n\nA Mallory in the middle can intercept the download with their own archive\nand get RCE.\n"
    }

/api/vulnerabilities{vulnerability_id} -> Return objects which contain vulnerability_id . This is vulnerability search. Response is a LIST of responses like the one above.

/api/packages{purl} -> Returns package object with purl = purl .

/api/packages{name}{type}{ns}{qualifiers}{subpath}{versions} -> Returns package object with corresponding fields, each field is optional. List of responses as above.
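The last endpoint, where every field is optional, amounts to filtering on whichever fields the caller supplies. A rough in-memory sketch of that behaviour follows; it uses plain dicts instead of the project's models, and `filter_packages` is a hypothetical name.

```python
def filter_packages(packages, **criteria):
    """Return packages matching every supplied field; unset fields are ignored."""
    wanted = {key: value for key, value in criteria.items() if value is not None}
    return [
        package
        for package in packages
        if all(package.get(key) == value for key, value in wanted.items())
    ]


catalog = [
    {"type": "cargo", "name": "sodiumoxide", "version": "0.0.15"},
    {"type": "cargo", "name": "http", "version": "0.1.11"},
    {"type": "generic", "name": "openssl", "version": "1.1.0j"},
]

print(filter_packages(catalog, type="cargo"))
print(filter_packages(catalog, name="openssl", version=None))
```

Passing no criteria returns the whole list, which matches the "each field is optional" description.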

sbs2001 commented 4 years ago

I have got the swagger docs working. @tdruez what should the endpoint be for the swagger docs ? ClearlyDefined has them at https://api.clearlydefined.io/api-docs/, not sure how nexB projects do it.

pombredanne commented 4 years ago

I think we are all set there. Closing!

sbs2001 commented 4 years ago

https://github.com/nexB/vulnerablecode/pull/247