pydantic / pydantic-settings

Settings management using pydantic
https://docs.pydantic.dev/latest/usage/pydantic_settings/
MIT License
502 stars 47 forks source link

Can't read Pydantic Settings from stdin #296

Closed WarpedPixel closed 6 days ago

WarpedPixel commented 1 month ago

Requirement

I need to read Settings from stdin. This is not our primary scenario but something we need for tooling, for security reasons (secrets we don't want on disk). Our tooling reads settings to process them, and ideally they should be piped to our app to leave no traces behind, even temporarily.

What we tried

Settings pass filenames only into the public API. And using the stdin filename '/dev/stdin' doesn't work (on Mac) as the reads don't block until data is available (it simply returns an empty file).

Because we couldn't find a clean way to do this, I think it is an issue that should be addressed. Perhaps there is a different way to do it that we missed, please let me know.

Possible Fixes

The best fix would be to simply support streams everywhere in the API, instead of just filenames.

It would also be fine to support special filenames '-' or '/dev/stdin' as special cases that are internally replaced by stream=sys.stdin. Or to allow a SettingsSource to provide a stream instead of file names.

In our case we hacked a third solution deriving a class from DotEnvSettingsSource because we need most of that functionality (just need to replace the source in some cases). But because read_env_file is a global function instead of a method, we cannot simply override that and had to copy a few other methods from the implementation. Moving that into the DotEnvSettingsSource class would allow a clean override of just read_env_file that could handle stdin reading. This is probably the minimal solution that cleanly resolves this requirement, without changing anything for the mainstream usage of Pydantic settings.

hramezani commented 1 month ago

Thanks @WarpedPixel for reporting this.

RIght now, pydanantic-settings doesn't have a settings source that can read values from a stream.

In our case we hacked a third solution deriving a class from DotEnvSettingsSource because we need most of that functionality (just need to replace the source in some cases). But because read_env_file is a global function instead of a method, we cannot simply override that and had to copy a few other methods from the implementation. Moving that into the DotEnvSettingsSource class would allow a clean override of just read_env_file that could handle stdin reading. This is probably the minimal solution that cleanly resolves this requirement, without changing anything for the mainstream usage of Pydantic settings.

Yeah, it is a function and we can't move it to DotEnvSettingsSource because it will introduce a breaking change and we don't want to break people's code. You can override DotEnvSettingsSource._read_env_files, this method uses read_env_file to load the file and updates dotenv_vars dict.

WarpedPixel commented 3 weeks ago

Perhaps there is a better way to do this, but it turned into 75 lines of mostly duplicated code that will certainly be fragile and difficult to maintain. The below is hacky because a class could be written to handle both filenames and streams (or named streams at least) without changing the signatures of anything, but it passes my tests.

Code to override read_env_file

```python # this runs against the current version of pydantic_settings import os import sys from pathlib import Path from typing import Mapping from dotenv import dotenv_values from pydantic_settings import BaseSettings, DotEnvSettingsSource from pydantic_settings.sources import ENV_FILE_SENTINEL, DotenvType, parse_env_vars from pydantic_settings.sources import read_env_file as original_read_env_file class StdinSettingsSource(DotEnvSettingsSource): def __init__( self, settings_cls: type[BaseSettings], env_file: DotenvType | None = ENV_FILE_SENTINEL, env_file_encoding: str | None = None, case_sensitive: bool | None = None, env_prefix: str | None = None, env_nested_delimiter: str | None = None, env_ignore_empty: bool | None = None, env_parse_none_str: str | None = None, ) -> None: super().__init__( settings_cls, env_file, env_file_encoding, case_sensitive, env_prefix, env_nested_delimiter, env_ignore_empty, env_parse_none_str ) @staticmethod def read_env_file( file_path: Path, *, encoding: str | None = None, case_sensitive: bool = False, ignore_empty: bool = False, parse_none_str: str | None = None, ) -> Mapping[str, str | None]: file_vars: dict[str, str | None] = dotenv_values(stream=sys.stdin, encoding=encoding or 'utf8') return parse_env_vars(file_vars, case_sensitive, ignore_empty, parse_none_str) # copy from pydantic_settings.sources.DotEnvSettingsSource. NEEDED because we can't override read_env_file def _read_env_files(self) -> Mapping[str, str | None]: env_files = ['/dev/stdin'] # env_files = self.env_file # if env_files is None: # return {} if isinstance(env_files, (str, os.PathLike)): env_files = [env_files] dotenv_vars: dict[str, str | None] = {} for env_file in env_files: env_path = Path(env_file).expanduser() if env_path.is_file(): dotenv_vars.update( original_read_env_file( env_path, encoding=self.env_file_encoding, case_sensitive=self.case_sensitive, ignore_empty=self.env_ignore_empty, parse_none_str=self.env_parse_none_str, ) ) elif env_path == Path('/dev/stdin'): dotenv_vars.update( self.read_env_file( env_path, encoding=self.env_file_encoding, case_sensitive=self.case_sensitive, ignore_empty=self.env_ignore_empty, parse_none_str=self.env_parse_none_str, ) ) return dotenv_vars ```

It seems like moving read_env_file to a method shouldn't break anything. It is not referenced anywhere else in pydantic_settings. And if we really want to handle the case where people import that function you could continue to expose it with a deprecation warning. Something like

import warnings

def read_env_file(
    file_path: Path,
    *,
    encoding: str | None = None,
    case_sensitive: bool = False,
    ignore_empty: bool = False,
    parse_none_str: str | None = None,
) -> Mapping[str, str | None]:
    warnings.warn("read_env_file is deprecated and will be removed in the next version, use DotEnvSettingsSource.read_env_file instead", DeprecationWarning)
    file_vars: dict[str, str | None] = dotenv_values(file_path, encoding=encoding or 'utf8')
    return parse_env_vars(file_vars, case_sensitive, ignore_empty, parse_none_str)

This would allow the subclassing of DotEnvSettingsSource to become something much simpler:

# this version would work if we made the change to bring read_env_file as a method of DotEnvSettingsSource
class SimpleStdinSettingsSource(DotEnvSettingsSource):
    # override read_env_file to also read from stdin
    def read_env_file(
        self,
        file_path: Path,
    ) -> Mapping[str, str | None]:
        if file_path == Path('/dev/stdin'):
            file_vars: dict[str, str | None] = dotenv_values(stream=sys.stdin, encoding=self.env_file_encoding or 'utf8')
        else:
            file_vars: dict[str, str | None] = dotenv_values(file_path, encoding=self.env_file_encoding or 'utf8')
        return parse_env_vars(file_vars, self.case_sensitive, self.env_ignore_empty, self.env_parse_none_str)

If I can be of help I would work on a clean PR for this.

hramezani commented 2 weeks ago

@WarpedPixel let's move it into the DotEnvSettingsSource and add a deprecation warning