google / deps.dev

Resources for the deps.dev API
https://deps.dev
Apache License 2.0

Handle requests for different spellings of a given version #7

Closed jamietanna closed 1 year ago

jamietanna commented 1 year ago

I've recently been working on tooling to improve visibility of internal and open source projects at https://dmd.tanna.dev/. I've just integrated deps.dev into that tooling and noticed a bug.

For instance, resolving https://pypi.org/project/cryptography/2.7/ via the API fails unless we add the .0 suffix:

% curl https://api.deps.dev/v3alpha/systems/PYPI/packages/cryptography/versions/2.7 -i
HTTP/2 404
content-type: application/grpc
grpc-status: 5
grpc-message: version not found
x-envoy-upstream-service-time: 9
strict-transport-security: max-age=2592000; includeSubDomains
content-length: 0
date: Fri, 14 Apr 2023 18:05:56 GMT
server: envoy
via: 1.1 google
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000

% curl https://api.deps.dev/v3alpha/systems/PYPI/packages/cryptography/versions/2.7.0 -i
HTTP/2 200
content-type: application/json
x-envoy-upstream-service-time: 13
strict-transport-security: max-age=2592000; includeSubDomains
grpc-status: 0
grpc-message:
content-length: 324
vary: Accept-Encoding
date: Fri, 14 Apr 2023 18:06:00 GMT
server: envoy
via: 1.1 google
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000

{"versionKey":{"system":"PYPI","name":"cryptography","version":"2.7.0"},"isDefault":false,"licenses":["non-standard"],"advisoryKeys":[{"id":"GHSA-hggm-jpg3-v476"},{"id":"GHSA-w7pp-m8wf-vj6r"},{"id":"GHSA-x4qr-2fvf-3mr5"},{"id":"PYSEC-2021-62"}],"links":[{"label":"SOURCE_REPO","url":"https://github.com/pyca/cryptography"}]}

However, the .0 suffix shouldn't be required, as the version published on PyPI is 2.7.
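Until the API accepts both spellings, a client could work around the 404 by retrying with zero-padded spellings. The following is only a rough sketch of that idea: `candidate_versions` and `fetch_version` are hypothetical helper names, the padding width of three segments is an assumption based on the single 2.7 / 2.7.0 example above, and only plain dotted-numeric versions are padded.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import List, Optional

# Base URL taken from the curl calls above.
API = "https://api.deps.dev/v3alpha"


def candidate_versions(version: str, width: int = 3) -> List[str]:
    """Return the version as given, followed by zero-padded spellings.

    Hypothetical helper: padding to `width` segments is an assumption,
    not documented deps.dev behaviour.
    """
    candidates = [version]
    parts = version.split(".")
    # Only pad plain dotted-numeric versions such as "2.7".
    if all(p.isdigit() for p in parts):
        while len(parts) < width:
            parts.append("0")
            candidates.append(".".join(parts))
    return candidates


def fetch_version(system: str, name: str, version: str) -> Optional[bytes]:
    """Try each candidate spelling; return the first successful response body."""
    for candidate in candidate_versions(version):
        url = "{}/systems/{}/packages/{}/versions/{}".format(
            API,
            system,
            urllib.parse.quote(name, safe=""),
            urllib.parse.quote(candidate, safe=""),
        )
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # A 404 means this spelling is unknown; anything else is a real error.
            if err.code != 404:
                raise
    return None
```

With this, `fetch_version("PYPI", "cryptography", "2.7")` would first request `2.7` and, on a 404, fall back to `2.7.0`.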

I've been tracking this issue in my project in https://gitlab.com/tanna.dev/dependency-management-data/-/issues/74, if that's of use.

adg commented 1 year ago

Thanks for the report.

This is a general issue that we intend to address: some package managers permit multiple spellings for a given version string (depending on many contextual factors), but we only match requests against a specific version string that we store in the database. We could instead convert the requested version into a canonical form, and use that to match versions in the database.
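As an illustration of matching on a canonical form, here is a minimal sketch. It is not the deps.dev implementation: `release_key`, `build_index`, and `resolve` are hypothetical names, and the sketch only handles plain dotted-numeric versions. It relies on the PEP 440 rule that release segments of different lengths are compared after padding the shorter one with zeros, so 2.7 and 2.7.0 denote the same release; epochs, pre-releases, post-releases, and local versions are deliberately ignored.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[int, ...]


def release_key(version: str) -> Optional[Key]:
    """Map a dotted-numeric version to a key that ignores trailing zeros.

    Returns None for anything this sketch does not understand.
    """
    parts = version.split(".")
    if not all(p.isdigit() for p in parts):
        return None
    nums = [int(p) for p in parts]
    # Drop trailing zero segments so "2.7" and "2.7.0" share one key.
    while len(nums) > 1 and nums[-1] == 0:
        nums.pop()
    return tuple(nums)


def build_index(stored: List[str]) -> Dict[Key, str]:
    """Index the stored version strings by their canonical key."""
    index: Dict[Key, str] = {}
    for version in stored:
        key = release_key(version)
        if key is not None:
            # Keep the first stored spelling seen for each key.
            index.setdefault(key, version)
    return index


def resolve(requested: str, index: Dict[Key, str]) -> Optional[str]:
    """Return the stored spelling matching the requested version, if any."""
    key = release_key(requested)
    if key is None:
        return None
    return index.get(key)
```

For example, with `index = build_index(["2.7.0", "3.0"])`, both `resolve("2.7", index)` and `resolve("2.7.0", index)` return the stored spelling `"2.7.0"`.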

(We already do this for package names. For example, requests for either Cargo/clap-builder or Cargo/clap_builder return data about the same package, because deps.dev observes Cargo's rules on names.)
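The name case can be sketched the same way. This is an assumption-laden illustration, not the deps.dev code: `canonical_cargo_name` and `lookup_package` are hypothetical names, and the only rule modelled is the hyphen/underscore equivalence shown in the example above. Any other naming rules Cargo applies are left out.

```python
from typing import Dict, List, Optional


def canonical_cargo_name(name: str) -> str:
    """Fold underscores to hyphens so both spellings share one lookup key.

    Assumption: only the "-" / "_" equivalence from the example is modelled.
    """
    return name.replace("_", "-")


def lookup_package(requested: str, stored_names: List[str]) -> Optional[str]:
    """Return the stored package name that matches the requested spelling."""
    by_key: Dict[str, str] = {canonical_cargo_name(n): n for n in stored_names}
    return by_key.get(canonical_cargo_name(requested))
```

Here `lookup_package("clap-builder", ["clap_builder"])` and `lookup_package("clap_builder", ["clap_builder"])` both resolve to the stored name `"clap_builder"`.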

adg commented 1 year ago

We've rolled out some changes that canonicalize the requested versions, so the example you give now works:

$ curl -s https://api.deps.dev/v3alpha/systems/PYPI/packages/cryptography/versions/2.7 | jq
{
  "versionKey": {
    "system": "PYPI",
    "name": "cryptography",
    "version": "2.7.0"
  },
  "isDefault": false,
  "licenses": [
    "non-standard"
  ],
  "advisoryKeys": [
    {
      "id": "GHSA-hggm-jpg3-v476"
    },
    {
      "id": "GHSA-w7pp-m8wf-vj6r"
    },
    {
      "id": "GHSA-x4qr-2fvf-3mr5"
    },
    {
      "id": "PYSEC-2021-62"
    }
  ],
  "links": [
    {
      "label": "SOURCE_REPO",
      "url": "https://github.com/pyca/cryptography"
    }
  ]
}

@jamietanna Thanks again for the report. Please let us know if you encounter other issues. :)