rust-lang / crates.io

The Rust package registry
https://crates.io
Apache License 2.0

Expose dependency renames so that enabled features make sense #1539

Open dtolnay opened 6 years ago

dtolnay commented 6 years ago

As of https://github.com/rust-lang/cargo/pull/4953, Cargo supports renaming dependencies in Cargo.toml. For example, the following dependency specification introduces a dependency on a crate whose real name is tarpc-lib, but refers to it as rpc within the current crate.

[dependencies]
rpc = { package = "tarpc-lib", version = "0.1" }

This introduces the potential for feature names that cannot be meaningfully interpreted through the API of crates.io. Consider tarpc 0.13.0:

$ curl https://crates.io/api/v1/crates/tarpc/0.13.0
{
  "version": {
    "id": 113315,
    "crate": "tarpc",
    "num": "0.13.0",
    ...
    "features": {
      "serde1": [
        "rpc/serde1",
        "serde",
        "serde/derive"
      ]
    },
    ...
  }
}
$ curl https://crates.io/api/v1/crates/tarpc/0.13.0/dependencies
{
  "dependencies": [
    ...
    {
      "id": 529683,
      "version_id": 113315,
      "crate_id": "tarpc-lib",
      "req": "^0.1",
      "optional": false,
      "default_features": true,
      "features": [],
      "target": null,
      "kind": "normal",
      "downloads": 0
    },
    ...
  ]
}

Nothing in these responses indicates that the "rpc/serde1" entry enabled by tarpc's "serde1" feature refers to the dependency on tarpc-lib. That mapping lives only in Cargo.toml.
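To make the gap concrete, here is a minimal sketch that looks for "dep/feature" entries whose dependency prefix matches no crate_id in the dependencies response. The function name is hypothetical, and the "serde" crate_id is an assumption: it is presumed to sit among the elided entries of the response above.

```python
def unresolved_feature_refs(features, crate_ids):
    """Return "dep/feature" entries whose dep prefix matches no known crate_id."""
    known = set(crate_ids)
    missing = []
    for enabled in features.values():
        for entry in enabled:
            dep, sep, _ = entry.partition("/")
            # Only entries of the form "dep/feature" name a dependency explicitly.
            if sep and dep not in known:
                missing.append(entry)
    return missing


# Feature map copied from the /crates/tarpc/0.13.0 response.
features = {"serde1": ["rpc/serde1", "serde", "serde/derive"]}

# "tarpc-lib" comes from the dependencies response;
# "serde" is ASSUMED to be one of the elided dependency entries.
crate_ids = ["tarpc-lib", "serde"]

print(unresolved_feature_refs(features, crate_ids))
```

Under those assumptions, the only entry left dangling is "rpc/serde1", which is the one that goes through the renamed dependency.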

I would like to see renames listed in the response of /crates/:crate_id/:version/dependencies so that the complete dependency graph can be understood without downloading the crate's source.

jtgeibel commented 6 years ago

We obtain this data from cargo in JSON format when the crate is published. This data is also passed to the index, so there are constraints on possible changes. Here is how this looks in the index:

{
  "name":"tarpc",
  "vers":"0.13.0",
  "deps":[
    {
      "name":"log",
      "req":"^0.4",
      "features":[],
      "optional":false,
      "default_features":true,
      "target":null,
      "kind":"normal"
    },
    {
      "name":"rpc",
      "req":"^0.1",
      "features":[],
      "optional":false,
      "default_features":true,
      "target":null,
      "kind":"normal",
      "package":"tarpc-lib"
    },
    ...
  ],
  "cksum":"bbeb0a79553718585c855186a77d257deed78d5f06c47d7f840d6a1c46864bac",
  "features":{
    "serde1":[
      "rpc/serde1",
      "serde",
      "serde/derive"
    ]
  },
  "yanked":false,
  "links":null
}

So cargo and the index track the name of each dependency separately from the package. The API tracks everything via crate_id. If only name is present (as with log), then it matches the crate_id. If the dependency is renamed (as with rpc), then cargo puts the new name in name and the crate_id in package.
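The name/package rule described above can be sketched as follows. This is an illustration only, not how crates.io or cargo implements it. The helper names are hypothetical, and the sample entry is trimmed to the two dependencies shown in the index excerpt.

```python
def real_crate_name(dep):
    """A renamed dependency carries the real crate in "package"; otherwise "name" is the crate."""
    return dep.get("package") or dep["name"]


def resolve_features(index_entry):
    """Rewrite "dep/feature" entries so the dep part is the real crate name."""
    by_local_name = {dep["name"]: real_crate_name(dep) for dep in index_entry["deps"]}
    resolved = {}
    for feature, enabled in index_entry["features"].items():
        rewritten = []
        for entry in enabled:
            dep, sep, rest = entry.partition("/")
            if sep and dep in by_local_name:
                rewritten.append(by_local_name[dep] + "/" + rest)
            else:
                # Plain feature names, or deps not present in this trimmed sample, stay as-is.
                rewritten.append(entry)
        resolved[feature] = rewritten
    return resolved


# Trimmed copy of the index entry: only the two dependencies shown in the excerpt.
index_entry = {
    "name": "tarpc",
    "vers": "0.13.0",
    "deps": [
        {"name": "log", "req": "^0.4"},
        {"name": "rpc", "req": "^0.1", "package": "tarpc-lib"},
    ],
    "features": {"serde1": ["rpc/serde1", "serde", "serde/derive"]},
}

print(resolve_features(index_entry))
```

With the index data, "rpc/serde1" can be mapped to tarpc-lib. The API response alone does not allow this, because it carries no rpc name to match against.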

My preferred solution would be for cargo to send us two feature lists. The existing one is for when cargo is talking to itself via the index. If the new key is also provided, then it is recorded in the database; otherwise we fall back to the existing key.
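A rough sketch of that fallback on the receiving side is below. The key name "api_features" is purely hypothetical, since no name for the new key is proposed here. It is also an assumption that the second list would express entries in terms of real crate names.

```python
def features_to_record(publish_metadata, new_key="api_features"):
    """Prefer the new feature list when the client sent one, else use the existing key.

    `new_key` is a HYPOTHETICAL name; the proposal does not specify one.
    """
    if new_key in publish_metadata:
        return publish_metadata[new_key]
    return publish_metadata["features"]


# An older client sends only the existing key.
old_client = {"features": {"serde1": ["rpc/serde1"]}}

# A newer client sends both (contents of the new list are assumed for illustration).
new_client = {
    "features": {"serde1": ["rpc/serde1"]},
    "api_features": {"serde1": ["tarpc-lib/serde1"]},
}

print(features_to_record(old_client))
print(features_to_record(new_client))
```

Older clients keep working unchanged under this scheme, at the cost of continuing to record the less useful list for their uploads.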

This does have the disadvantage that it will take time to reach stable, and older clients will continue to publish suboptimal metadata, but I think that would be better than coupling to the index format and rewriting the features list on this end.

saecki commented 2 years ago

Is there anything blocking this right now?

dtolnay commented 2 years ago

Fixed by #5091, I believe.

Turbo87 commented 2 years ago

Not entirely yet. We still need to expose the data on the API level and backfill the database, but it should be done soon :)

dtolnay commented 2 years ago

Oh good call, I forgot this issue is about the JSON API and not the DB dumps, since I'd switched my use case over to use DB dumps a long time ago. Reopening.