pytroll / donfig

Python library for configuring a package including defaults, env variable loading, and yaml loading.
https://donfig.readthedocs.io/en/latest/
MIT License
37 stars 10 forks source link

Security concern around literal_eval() of all discovered env vars #113

Open applio opened 4 days ago

applio commented 4 days ago

In donfig.config_obj.collect_yaml(), currently ast.literal_eval() is called on every discovered environment variable's value to provide users with the ability to dynamically set values through code. While this feature can be quite valuable in certain use cases, it does raise some security concerns.

One hypothetical but specific example of how this can become a security issue: because by design, environment variables are discovered by Donfig at runtime, a malicious actor could set an additional environment variable (using Dask as an example, a novel env var named DASK_NOBODY_EXPECTS) in an unsuspecting user's shell and if the value of that env var is a string containing a valid Python expression, it would be run by the unsuspecting user with that user's privileges.

It would be preferable to have the ability to disable or enable the attempted use of ast.literal_eval() on each and every discovered env var. It would also be preferable to have this dynamic-eval feature disabled by default and only enabled explicitly when its value outweighs any associated security concerns. Enabling the feature should only be possible programmatically and not via an environment variable.

If others agree that this is a change worth implementing, it could be as simple as exposing a new input argument on donfig.config_obj.Config.__init__() and donfig.config_obj.collect_yaml() which enables/disables the use of ast.literal_eval() and defaults to False.

djhoese commented 4 days ago

Thank you for bringing this up. It's a relief to get a security issue that isn't auto/AI generated and that includes examples/suggestions.

I'm torn about this being a donfig issue directly or an issue at all. I will admit that I am under reduced sleep due to an 8 month old baby so feel free to tell me I'm missing something obvious. Some things that come to mind:

  1. Have you brought this up with the dask project which this project was originally based on? If so, how do they handle this?
  2. literal_eval is by design not meant to run arbitrary python code, but only a limited set. See the documentation here https://docs.python.org/3/library/ast.html#ast.literal_eval. This documentation mentions memory or resource exhaustion but not arbitrary code execution.
  3. You mentioned a malicious actor adding this bad environment variable, but if the hacker has the ability to do this then they likely have the ability to do much more. Additionally, the new environment variable would not be able to "infect" an already running process unless I'm mistaken since that environment is also established.
  4. If something outside of donfig is allowing users to specify environment variables (versus safe YAML) in a system not running in a sandbox and user-restricted environment (non-root with very limited permissions) then I'd say it is the fault of that system-outside-donfig, not donfig.

Possible changes to address this: