rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.79k stars 2.42k forks source link

Cargo doesn't uniformly read configuration from env vars #5416

Open alexcrichton opened 6 years ago

alexcrichton commented 6 years ago

Most configuration read through cargo uses helpers like Config::get_string which automatically read env vars, but configuration reading portions of Cargo that go through Config::get_table do not read environment variables. The return value here is a HashMap which doesn't proxy reads to an environment variable.

This affects, for example, source replacement. Source replacement is not configurable through environment variables (a bug) because of its use of get_table.

We should tackle this via one of two routes:

In any case, some questions to still work through here!

rtsuk commented 6 years ago

I'm going to take a whack at the "remove get_table" route.

rtsuk commented 6 years ago

After further discussion I'm not going to try this approach right now, instead musing about an alternate escaping of period and hyphen so that iteration over environmental variables could be unambiguous.

rtsuk commented 6 years ago

After pondering it for a few days I don't think there is such an alternate escaping. I think that it is not possible to support specifying an arbitrary TOML table via environmental variables. The space of key names in TOML is too rich to be encoded in the very limited character set allotted to environmental variables.

If one wanted to support overriding source replacement from environmental variables, one approach would be to use an array of tables where the property names can be defined to be unambiguous when converted to environmental variables. An individual table could have an index number in its name.

So instead of:

[source.crates-io]
replace-with = "vendored-sources"

[source.vendored-sources]
directory = "../third_party/rust-crates/vendor"

one would write:

[[sources]]
name = "crates-io"
replacement = "vendored-sources"

[[sources]]
name = "vendored-sources"
directory = "../third_party/rust-crates/vendor"

The environmental variable version of this would be:

CARGO_SOURCES_001_NAME = "crates-io"
CARGO_SOURCES_001_REPLACEMENT = "vendor-sources"
CARGO_SOURCES_002_NAME = "vendored-sources"
CARGO_SOURCES_002_DIRECTORY = "../third_party/rust-crates/vendor"

One would then implement get_table_array() and have it create an array of tables from environmental variables as described above, falling back to the config TOML in the same manner as other environmental variables.

alexcrichton commented 6 years ago

@rtsuk sounds plausible to me! In general I'd be totally on board with basically reworking how [source] configuration is specified so it's more amenable to reading it through environment variables.

Another possibility would be something like CARGO_SOURCES_ALL=comma,separated,list,of,names and that could inform Cargo which ones to read from env vars perhaps?

ehuss commented 6 years ago

Would it be possible for Config to iterate over the environment so that get_table can stay the same? For example, get_table("source") would fetch all CARGO_SOURCE_* environment variables, split the key by _, and inject all those values into the result.

ehuss commented 6 years ago

Nevermind - obviously it wouldn't know the correct type for each variable.

I'm liking the idea of making a serde Deserializer that can handle environment variables. I made one a few days ago, but it was more basic and ignored environment variables.

alexcrichton commented 6 years ago

Yeah I do think that we can transparently read env vars (given enough support) for specific keys, the problem for source replacement though is that we can't iterate over tables which are defined entirely in env vars because we can reconstruct a query for a specific key into an env var but we can't take an env var and say what key/table it corresponded to

matklad commented 6 years ago

I wonder if we need to provide a giant escape hatch like CARGO_CFG_TOML or CARGO_CFG_JSON, which just accepts a toml/json string with the contest of .cargo/config?

The benefit of the current setup is that a human can easily override config on the command line. However looks like to specify tables the syntax would need to be ugly and not human friendly anyway, so perhaps just providing a giant TOML escape hatch would work better for those who need it?

alexcrichton commented 6 years ago

Not a bad idea! TOML generally is dependent on newlines which doesn't jive well with env vars, but I'd be fine accepting either

ehuss commented 6 years ago

I have a prototype that auto-converts with serde here: https://github.com/ehuss/cargo/commit/63f76b0d24bc045809a7a0d4fceea6c9bd176390

Essentially you can do something like let p: TomlProfile = config.load_type("profile.dev") and it will auto-magically convert the config (including environment variables). It supports structs, arbitrary maps, optional values, etc.

If you'd like, I can continue working on this and submit a PR (just the serde part, no actual changes to Cargo, yet). I realize it is a lot of code, so maybe it's not worth it (I'm no serde expert, so some parts can probably be simpler). However, I think it would be helpful for the Profile work I'm doing, and could be used for the source map, and anything else in the future. Overall I'm not sure how useful environment variables are.

Things left to do:

Possible enhancements:

alexcrichton commented 6 years ago

Nice!

I think I'd personaly be in favor of going down such a route, it'd probably remove almost all of our methods elsewhere I think as well if we tried hard enough :)

For TOML-in-env I think we'll want to hold off on that for now, but I could imagine perhaps not the bare key but something like CARGO_PROFILE_DEV_TOML='...' and I actually really like the idea of using toml to specify arrays!

ehuss commented 6 years ago

PR for just the config part posted at #5552.

ehuss commented 6 years ago

@rtsuk Are you still working on this?

At least for source replacement, another option is to use the -Z advanced-env experimental option. It would allow you to specify source replacement like this:

CARGO_SOURCE_crates-io_REPLACE_WITH="another-source"
CARGO_SOURCE_another-source_DIRECTORY="vendor"

It only supports letters/numbers/dashes for the source name (no underscores). Also, the use of mixed upper/lowercase is a little unconventional and weird looking, but works in all environments I'm aware of.

Another option is to extend it to support inline tables like this:

CARGO_SOURCE="{crates-io={replace-with='another-source'}, another-source={directory='vendor'}}"

Just some alternate ideas.

rtsuk commented 6 years ago

@ehuss I am not. My personal need for it has passed and it looked very hairy. Should the need arise again, though, I'll check out the advanced-env option.

gilescope commented 4 years ago

Flattening all of toml to environment variables seems a big ask. We'd get a lot of utility if env vars in the config were expanded:

[source.vendored-sources]
directory = "$A_DIR/vendor"

(and %A_DIR% on windows)

Is that an option that could be considered?