Security concern around literal_eval() of all discovered env vars

In donfig.config_obj.collect_yaml(), currently ast.literal_eval() is called on every discovered environment variable's value to provide users with the ability to dynamically set values through code. While this feature can be quite valuable in certain use cases, it does raise some security concerns.

One hypothetical but specific example of how this can become a security issue: because by design, environment variables are discovered by Donfig at runtime, a malicious actor could set an additional environment variable (using Dask as an example, a novel env var named DASK_NOBODY_EXPECTS) in an unsuspecting user's shell and if the value of that env var is a string containing a valid Python expression, it would be run by the unsuspecting user with that user's privileges.

It would be preferable to have the ability to disable or enable the attempted use of ast.literal_eval() on each and every discovered env var. It would also be preferable to have this dynamic-eval feature disabled by default and only enabled explicitly when its value outweighs any associated security concerns. Enabling the feature should only be possible programmatically and not via an environment variable.

If others agree that this is a change worth implementing, it could be as simple as exposing a new input argument on donfig.config_obj.Config.__init__() and donfig.config_obj.collect_yaml() which enables/disables the use of ast.literal_eval() and defaults to False.

Thank you for bringing this up. It's a relief to get a security issue that isn't auto/AI generated and that includes examples/suggestions.

I'm torn about this being a donfig issue directly or an issue at all. I will admit that I am under reduced sleep due to an 8 month old baby so feel free to tell me I'm missing something obvious. Some things that come to mind:

Have you brought this up with the dask project which this project was originally based on? If so, how do they handle this?
literal_eval is by design not meant to run arbitrary python code, but only a limited set. See the documentation here https://docs.python.org/3/library/ast.html#ast.literal_eval. This documentation mentions memory or resource exhaustion but not arbitrary code execution.
You mentioned a malicious actor adding this bad environment variable, but if the hacker has the ability to do this then they likely have the ability to do much more. Additionally, the new environment variable would not be able to "infect" an already running process unless I'm mistaken since that environment is also established.
If something outside of donfig is allowing users to specify environment variables (versus safe YAML) in a system not running in a sandbox and user-restricted environment (non-root with very limited permissions) then I'd say it is the fault of that system-outside-donfig, not donfig.

Possible changes to address this:

Nothing.
There have been talks in the past about schema validation and other such things. It would not be a stretch to limit the config object to a specific set of keys and only those get checked instead of every env var and they then get validated to the appropriate type of object. Still literal_eval should already be limiting this.
Adopt/mimic pyyaml's approach of an Unsafe and a Safe loader/Config object.
Your suggestion for an input argument to __init__ or my Unsafe/Safe loader does not change that it isn't up to the user of a library to control this behavior, but it is controlled by the library developer who is using donfig. So this does not protect users.
Your suggestion actually is possible I think by passing env={} to the Config object.

pytroll / donfig

Security concern around literal_eval() of all discovered env vars #113