Open jonathanstrong opened 2 years ago
Providing a --token
command line argument is problematic since it may be necessary to specify tokens for multiple registries. For example, cargo build
could need to pull down crates from multiple registries.
Setting the token via --config
has similar security implications as the existing --token
option does, so I'm not sure why one is allowed but the other is not. Unless we're planning on deprecating --token
.
Do you have a suggestion for how the docs for the environment variable based approach could be improved?
Unless we're planning on deprecating
--token
.
We may consider deprecating --token
in the far future. We are trying to move away from passing secrets on the command-line, due to the relatively high risk of leaking a secret.
regarding --token
and potential future deprecation: without diving headlong into a larger discussion, there are a number of aspects to the cargo authentication story that make it pretty difficult to manage. I think the experience for the user is better than for the registry server operator, but not by a lot. one of the biggest issues is, there's two authentication "venues" -- auth for cloning/pulling the git crate index, and then auth for cargo http requests. git/ssh authentication is pretty fragile, and there are magic tricks like git-fetch-with-cli
you have to discover. if you make it that far you have to (currently) contend with no auth token sent along with download requests, which happily is getting fixed. if I could play God and re-design this from scratch, I'd cut out the git repo (instead ask the server provide updated crate versions, like the "sparse" rfc), condense the "api" and "dl" endpoints to one, ditch the dichotomy of "registry" name vs "index" url so there's only one true way to refer to a registry, use private key + signed requests for auth, and always sign every request, no matter what.
@arlosi when I just did a search to review what you will find on the subject and came across this example, which seems like it is pretty well-placed:
CARGO_REGISTRIES_MY_REGISTRY_INDEX=https://my-intranet:8080/git/index
Not sure how I missed that before.
I do think it's a tad complicated that you have to transform the registry name into screaming snake case, and I wonder how robust that is for all the edge cases.
It also highlights a point of uncertainty between the "registry" name and "index" url. This generally works based on people using the same name as the "registry" in Cargo.toml, but what happens if people use different names, which they are using to point to the same index?
I've updated the error message in #10592 to include a line suggesting which environment variable to set. Hopefully that helps with discoverability.
For example:
[UPDATING] `alternative` index
[ERROR] failed to download `bar v0.0.1 (registry `alternative`)`
Caused by:
no token found for `alternative`, please run `cargo login --registry alternative`
or use environment variable CARGO_REGISTRIES_ALTERNATIVE_TOKEN
I just had a chance to look at this again, now with the benefit of the -Z registry-auth
feature. specifically, it was in the context of changing our rustdoc builder to use -Z registry-auth
instead of our previous, hacky non-standard solution of passing the auth token via the user-agent header.
my primary purpose of writing this comment is to thank @arlosi for having the foresight to include this code, which warns the user if an alternative registry's name has been changed via env vars:
// If the name doesn't match, leave a note to help the user understand
// the potentially confusing situation.
that message saved me hours, I expect!
the context was building rustdocs for crate versions published by users for a private registry. like docs.rs this is done in a sandboxed environment (via rustwide crate) where network is not permitted during cargo doc
execution (i.e. by the build.rs
scripts).
it may sound somewhat absurd, but in this situation you don't necessarily know what the "name" of the alternative registry is, because the user could have used any name when configuring their registry in ~/.cargo/config.toml
. Also, relying on name alone is somewhat problematic (in my mind, and this is more based on instinct) in creating surface area for name ambiguity errors/security bugs/etc. when compared to the url of a specific git repo/crate index - which 1) requires a ssh key to read, and 2) is fixed and under our control as server administrator.
due to the constraints, it's a bit challenging, and I did not have a good sense of how cargo maps registry name to the crate index url internally - whether it is many-to-one or one-to-one. I had read a while back about "canonical url" in a RFC so I knew there had been some thought about the issue.
one source of confusion/difficulty was that the mapping between a registry name and the screaming snake case version of it is lossy, so my initial attempts did not line up as I expected. for example, I think I expected a env var like CARGO_REGISTRIES_SHIPYARD_RS_INDEX
to refer to a registry named "shipyard-rs" (with a hyphen), but it was converted to "shipyard_rs" (with an underscore instead).
the other thing that tripped me up (I was saved by another excellently-placed error message) was accidentally including lower case characters in the env var name.
I was using the account uuid for a temporary registry "name", and the lower-hex encoding resulted in an env var name that was not all uppercase, and thus rejected by cargo. i.e. an env var like CARGO_REGISTRIES_SHIPYARD_8ad420b85dbd4631be4da6b04da87250_INDEX
is rejected by cargo due to lowercase characters.
once I figured out both the underscore/hyphen and casing issues I was able to override the registry name <=> index mapping (via CARGO_REGISTRIES_<name>_INDEX
env var) and then provide an auth token for the (renamed) registry using the CARGO_REGISTRIES_<name>_TOKEN
env var) and it worked as expected. by using those env vars, I was able to specify a temporary name to refer to a specific registry with a given index url (e.g. ssh://git@shipyard.rs/<org-name>/crate-index.git
) and then provide an auth token for that registry specifically.
I will say, however, that the current method of using the name of the env var to specify the registry is less than ideal and prone to confusion and unexpected outcomes. also it's a bit messy that you use two separate env vars, but they are somewhat tied together -- e.g. you can't set the _TOKEN
env var before the _INDEX
env var or cargo won't know about a registry with that name yet.
the thing that I was really pining for as I was in the midst of this was something like
CARGO_SET_REGISTRY='{ name = "my-registry", index = "ssh://...", token = "abc..", ssh-key = "3Bln.." }'
where you could specify these things using a less error-prone "protocol", if you will. I'm not sure if there are any places that cargo does something along those lines, but if so, perhaps they could be used as a pattern of passing several fields of data in one env var without needing to wonder about how the env var name will be parsed.
I'm sure that the example above wouldn't work as is (might need to permit a list, for example), but hopefully it might help start a discussion about a better way to pass this data to cargo via env vars.
Thanks for the feedback @jonathanstrong. It's exciting to hear you're using the feature!
Dealing with the situation of not knowing the registry name was one of the most difficult aspects of the -Z registry-auth
You did discover the solution I intended (using the _INDEX
and _TOKEN
pair of variables). However, I recognize that it's not ideal.
Another direction I considered was using a different mechanism for injecting the registry information into Cargo's config. The environment variable system is problematic for a couple of reasons:
.
vs _
vs -
, etc)Other existing options include:
--config
command line option set cargo --config registries.my-registry.index=...
--config
command line option combined with -Zconfig-include
to specify a file --config config.toml
We could also consider a system for injecting config via TOML in an environment variable (which is closer to what you suggested). This would probably need an RFC.
CARGO_CONFIG="[registries]
my-registry = { index = '...', token = '...' }"
The
--config
command line option combined with-Zconfig-include
to specify a file--config config.toml
that would be really powerful! the only thing I wonder about is whether it's possible to use that approach to specify the token as well (vs. needing to use the env var for that). not sure how that works since tokens are normally in a separate credentials.toml file, also I previously ran into case where using config passed via command line flag does not allow credentials.
I feel like this discussion is even more pertinent with asymmetric tokens on the horizon -- which I expect will require specifying additional fields for the registry config.
Problem
While testing out the RFC 3139 implementation for authenticated crate downloads (PR), I came across a small sticking point: it is currently (relatively) difficult to pass a registry auth token to cargo at runtime for commands other than
publish
.As you may know,
cargo publish
provides a--token
option for passing an auth token at runtime. Previously, this was the only subcommand where that made sense. Now that we are introducing authenticated crate downloads, the existing methods of providing an auth token at runtime for other commands could use improvement.cargo login
and credentials stored on local file system works great for terminal usage by a person, including for non-publish
subcommands. But for containers, CI, sandboxed build environments, etc. there are often situations where you don't want to leave an auth token stored in a local file, which presents the need to pass the token to cargo at runtime.My interest in this is, for example, for the purpose of auto-generating cargo doc output for crate versions published at a private registry that requires authentication, and wanting to avoid storing the registry auth token in a file (i.e. ~/.cargo/credentials) in the build environment.
The only option I have found for passing the token for a private registry for
cargo fetch
, etc. is theCARGO_REGISTRIES_<name>_TOKEN
env var, which I was able to get to work.However, it took some effort to discover how to format the in screaming snake case would be converted and matched (i.e. CARGO_REGISTRIES_SHIPYARD_RS_TOKEN=xxx works).
<name>
portion of the env var. In my case, I was testing using a private registry configured under the name "shipyard-rs". The hyphen/dash in the name made things difficult. My first attempt was CARGO_REGISTRIES_shipyard-rs_TOKEN=xxx and that did not work, as variable names with dashes don't work in shell. I tried using the env program (env "CARGO_REGISTRIES_shipyard-rs_TOKEN=xxx" cargo ... before realizing that formatting the registryI may have missed where it is, but when I was trying this out I didn't find any documentation about the formatting of the
<name>
portion of the env var.For
cargo fetch
et al., there is no--token
command line option, so that can't be used.I was also unable to use --config to work around this, i.e.:
failed with
error: registries.shipyard-rs.token cannot be set through --config for security reasons
(is there some other way to pass this which would not trigger this error?)To be clear, this situation was never a problem when publish was the only subcommand that needed auth tokens. But now that we are moving forward with fully authenticated cargo, I think the status quo on passing a token at runtime is not at the high level I would expect from Cargo.
Proposed Solution
--token
flag/option for subcommands other thanpublish
--config
for settingregistries.<name>.token
CARGO_REGISTRIES_<name>_TOKEN
to avoid possible confusionNotes
This issue was initially raised in the issue for the RFC implementation PR, but @Eh2406 suggested creating a separate issue on the matter.