hackforla / HomeUniteUs

We're working with community non-profits who have a Host Home or empty bedrooms initiative to develop a workflow management tool to make the process scalable (across all providers), reduce institutional bias, and effectively capture data.
https://homeunite.us/
GNU General Public License v2.0

Create Config System For Configs at Runtime #530

Closed ju1es closed 10 months ago

ju1es commented 1 year ago

Overview

Create a config system so that configurations are applied dynamically in the Flask app at runtime, depending on the user and environment.

Action Items

paulespinosa commented 1 year ago

@ju1es Can you add background information and what the implementation would help to achieve, please?

Joshua-Douglas commented 11 months ago

I started working on #576, but I realized that the proper fix for #576 is to add a runtime configuration system. I'm going to move the research for that issue here since it is more relevant.

Joshua-Douglas commented 11 months ago

Overview

Introduce a runtime configuration system, to allow us to choose the configuration from the terminal.

Requirements

Research

What is the exact error we are encountering?

If you remove the AWS Cognito secrets from your .env file and run the tests locally, an error is thrown during the import of auth_controller.py, within the call to boto3.client.

openapi_server\controllers\users_controller.py:5: in <module>
    from openapi_server.controllers.auth_controller import get_token_auth_header
openapi_server\controllers\auth_controller.py:37: in <module>
    userClient = boto3.client('cognito-idp', region_name=COGNITO_REGION, aws_access_key_id = COGNITO_ACCESS_ID, aws_secret_access_key = COGNITO_ACCESS_KEY)

Our test cases do not currently use authentication, but the authentication py files do get imported. Importing auth_controller.py currently requires valid credentials to work - even if you do not use any of the imported functionality.
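One way to remove this import-time dependency is to defer building the client until it is first requested. The following is a minimal sketch, not the project's implementation: `get_user_client` and `_build_client` are hypothetical names, and `_build_client` is a stand-in for `boto3.client` so the snippet runs without boto3 or real credentials.

```python
import os
from functools import lru_cache


def _build_client(region, access_id, access_key):
    # Hypothetical stand-in for boto3.client('cognito-idp', ...), used only
    # so this sketch runs without boto3 or real AWS credentials.
    if not (region and access_id and access_key):
        raise RuntimeError("Cognito credentials are not configured")
    return {"service": "cognito-idp", "region": region}


@lru_cache(maxsize=1)
def get_user_client():
    # Credentials are read when the client is first requested,
    # not when this module is imported.
    return _build_client(
        os.environ.get("COGNITO_REGION"),
        os.environ.get("COGNITO_ACCESS_ID"),
        os.environ.get("COGNITO_ACCESS_KEY"),
    )
```

With this shape, importing the module never touches the credentials; only code paths that actually call `get_user_client()` need them.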

How do we currently configure the API app?

Since we are using connexion we have two application classes - a connexion.apps.flask_app.FlaskApp and a flask.Flask app.

The connexion FlaskApp is a wrapper around the Flask application class. Each class has its own specific configuration options.

We currently configure both application classes using this factory method within openapi_server.__main__.py.

def create_app():
    '''
    Creates a configured application that is ready to be run.

    The application is configured from external environment variables stored
    in either a local `.env` in a development environment or from environment
    variables that are configured prior to running the application.

    This function is made available at the module level so that it can
    be loaded by a WSGI server in a production-like environment with
    `openapi_server.__main__:create_app()`.
    '''

    # `connexion.App` is aliased to `connexion.FlaskApp` (as of connexion v2.13.1)
    # which is the connexion layer built on top of Flask. So we think of it as
    # the connexion application.
    connexion_app = connexion.App(__name__)
    api_spec_path = connexion_app.get_root_path() / Path('./openapi/openapi.yaml')
    parsed_specs = get_bundled_specs(api_spec_path)

    parsed_specs['info']['version'] = get_version()

    connexion_app.add_api(parsed_specs, pythonic_params=True)
    connexion_app.add_error_handler(AuthError, handle_auth_error)

    ENV_FILE = find_dotenv()
    if ENV_FILE:
        load_dotenv(ENV_FILE)
    SECRET_KEY = env.get("SECRET_KEY")
    env_config_profile = get_key(find_dotenv(), "CONFIG_PROFILE")

    # The underlying instance of Flask is stored in `connexion_app.app`.
    # This is an instance of `flask.Flask`.
    flask_app = connexion_app.app
    flask_app.json_encoder = encoder.JSONEncoder
    flask_app.secret_key = SECRET_KEY

    # Below, the Flask configuration handler is loaded with the
    # application configuration settings
    flask_app.config.from_object(compile_config(env_config_profile))

    return connexion_app

The FlaskApp and Flask applications are configured separately. Much of the configuration is hard coded using these statements, but we do have some infrastructure to allow additional configuration parameters to be specified using a .env file.

The load_dotenv method loads all of the variables available within the .env file as environment variables. After this call, all of the key-value pairs available within the .env file are available using os.environ["Key"] (Note: by default load_dotenv will not override existing environment variable values).

We use some of these env variables to explicitly set some configuration properties of our flask app, and we use a custom configuration system to set the rest - with compile_config.

How does our custom configuration system work?

Our custom configuration system's entrypoint is api.configs.configs.py::compile_config. This method returns a configuration class that is passed to flask_app.config.from_object. The from_object method will iterate over the uppercase attributes on the object and add them to the configuration as key-value pairs.
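To make the uppercase rule concrete, here is a small sketch that mimics the selection step in plain Python. `ExampleConfig` and `uppercase_settings` are hypothetical names for illustration only; this is not Flask's code.

```python
class ExampleConfig:
    HOST = "0.0.0.0"
    PORT = 8080
    debug_hint = "ignored"  # lowercase name, so it is skipped


def uppercase_settings(obj):
    # Keep only attributes whose names are entirely uppercase,
    # which is the rule from_object applies when copying values.
    return {key: getattr(obj, key) for key in dir(obj) if key.isupper()}


settings = uppercase_settings(ExampleConfig)
```

Here `settings` contains only `HOST` and `PORT`; the lowercase `debug_hint` attribute never reaches the configuration.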

The flask docs provide a warning with the from_object method:

You should not use this function to load the actual configuration but rather configuration defaults. The actual config should be loaded with from_pyfile() and ideally from a location not within the package because the package might be installed system wide.

class Config():
    HOST = '0.0.0.0'
    PORT = 8080
    DEBUG = True
    USE_RELOADER = True

def compile_config(profile: str, mod: str = 'configs.personal', clazz: str = 'PersonalConfig') -> Config:
    config = Config()

    if profile == 'personal':
        try:
            personal_mod = importlib.import_module(mod)
            personal_config = getattr(personal_mod, clazz)
            config = personal_config
        except ModuleNotFoundError:
            raise

    # handle other profiles here

    return config

This method returns the default Config if an unrecognized profile is provided.

To use the personal profile you need to do the following:

1) Specify the profile within your .env file (CONFIG_PROFILE=personal)
2) Add a python file to the configs package, with a name matching the mod param
3) Within your new python file, add a class with a name matching the clazz param
4) If mod != configs.personal or clazz != PersonalConfig then you will need to update the implementation of the existing create_app method to use the custom configuration (or add a new factory method)
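Steps 2 and 3 could look roughly like the sketch below, assuming the default `mod` and `clazz` values. The override values are made up for illustration, and `Config` is repeated here only so the snippet is self-contained; a real personal module would presumably import it from the configs package instead.

```python
# configs/personal.py (hypothetical personal profile module)
class Config:
    # Repeated from the default Config so this sketch runs on its own.
    HOST = '0.0.0.0'
    PORT = 8080
    DEBUG = True
    USE_RELOADER = True


class PersonalConfig(Config):
    # Override only the values that differ from the defaults;
    # everything else is inherited from Config.
    HOST = '127.0.0.1'
    PORT = 5000
```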

Can you set all Flask configuration values at runtime?

Yes, but there are some restrictions that we need to keep in mind. The ENV and DEBUG configuration variables are used by the Flask app during initialization. Flask reads these special variables from the environment - so we need to ensure that these variables are loaded into the environment before we create the instance of our Flask app.

Here is the excerpt explaining the ENV var restriction:

"What environment the app is running in. Flask and extensions may enable behaviors based on the environment, such as enabling debug mode. The env attribute maps to this config key. This is set by the FLASK_ENV environment variable and may not behave as expected if set in code."

It is not safe to set in code, with one caveat - you can load environment variables in code before the Flask app is created.

How do we configure the API database

The database is configured just before the apps are created, during the module import of openapi_server.__main__.py.

If a DATABASE_URL env variable is specified then the DataAccessLayer will use it to create a SQLAlchemy database engine. Otherwise, the default value of "sqlite:///./homeuniteus.db" is used. (Note: You cannot currently specify the DATABASE_URL within the .env file since the database is initialized before the factory method's call to load_dotenv).
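The ordering problem goes away if the connection string is resolved when it is needed rather than at import time. A minimal sketch of that idea follows; `resolve_database_url` is a hypothetical helper, not part of the current DataAccessLayer.

```python
import os

DEFAULT_DATABASE_URL = "sqlite:///./homeuniteus.db"


def resolve_database_url(conn_string=None):
    # Read the environment at call time, so values loaded by load_dotenv()
    # (or exported in the shell) before this call are honored.
    return conn_string or os.environ.get("DATABASE_URL") or DEFAULT_DATABASE_URL
```

An explicit argument wins, then the environment, then the default, which matches the precedence the current code intends.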

#__main__.py
from openapi_server.models.database import DataAccessLayer
DataAccessLayer.db_init()
#database.py
DATABASE_URL = env.get('DATABASE_URL')
class DataAccessLayer:
    _engine: Engine = None

    # temporary local sqlite DB, replace with conn str for postgres container port for real e2e
    _conn_string: str = DATABASE_URL if DATABASE_URL else "sqlite:///./homeuniteus.db"

    @classmethod
    def db_init(cls, conn_string=None):
        Base.metadata.create_all(bind=cls.get_engine(conn_string))

    # other code

    @classmethod
    def get_engine(cls, conn_string=None) -> Engine:
        if cls._engine is None:
            cls._engine = create_engine(conn_string or cls._conn_string, echo=True, future=True)
        return cls._engine

Should we configure the Flask DATABASE option?

The Flask tutorial uses a DATABASE configuration key to point at a SQLite database file; it is an application-defined key rather than a built-in Flask option. We are using SQLAlchemy to manage our database connection and our production database uses PostgreSQL. As a result, the DATABASE option does not appear to apply to our use case.

The Flask docs have a SQLAlchemy guide, and the guide does not recommend configuring the DATABASE option.

What application configurations should we be able to support

These configurations will likely suit all of our future needs.

Production flask application

This is the configuration for the real application that users will access, from homeunite.us. It should use a production grade WSGI server, and should have any and all mocking removed.

The production application configuration will require the strictest configuration validation, to ensure our deployed application is working as expected.

Staging flask application

The staging flask app is intended for internal use only, and will be hosted at dev.homeunite.us. The configuration should nearly mirror the production configuration. There may be some slight variations (such as a different flask SECRET_KEY to keep sessions distinct), but we will still need to apply strict configuration validation here.

Development flask application

The development flask app is intended to run on a developer's local machine. The development backend will need to work with the frontend app, and the developer should be able to add test data to the user database without the risk of polluting the production API's data.

The development app currently runs with a non-production grade WSGI server that allows for easy debugging of the API code. Mocking within the development application is allowed, especially if the mocking limits the distribution of sensitive API service keys, minimizes the risk of passing external API call limits, or minimizes the risk of polluting the production database.

Ideally developers would be able to run a version of the application without any secrets. If this is done, then careful attention would be needed to ensure that all mocked services are stripped from the production application's configuration.

Development Testing flask application

The development testing application should be configured very similarly to the development application. More aggressive mocking can be used here to restrict the use of networking resources, to allow targeted unit tests instead of relying on I/O bound integration tests. I/O bound methods (e.g. network calls, database file storage, etc.) can make the clean setup & teardown of tests difficult to achieve, while also slowing down the entire test suite.

Our current testing application does not run any WSGI server at all. Instead we use the WSGI Test Client to simulate requests to the connexion/flask applications. An in-memory database is also used, to prevent disk writes. There is currently no option to mock the authentication system, so real API calls to the AWS Cognito service are required. This feature is expected to be added in issue #577.

Release Testing flask application

It could also be beneficial to include a testing application that strips all mocking, and performs tests using real databases and API network calls. These tests would verify that the application mocking is not hiding bugs covered by our test cases. Running this test suite would be slower and would risk hitting API limits, etc. We could compensate for this risk by running the test cases when release testing is required. For example, these tests could be configured to run as a post-build step during the staging environment deployment.

How does flask recommend configuring a complex project?

Flask has a configuration guide that has three strong recommendations that we will follow in this issue:

  1. Use factory method to create the app. The factory should accept a configuration parameter.
  2. Do not write code that needs the configuration at import time
  3. Load configuration very early on (before creating the application class)

We are currently violating the second rule. We do configure at import time, and this is breaking our test project.

The guide discusses the need for distinct development and production configurations. A common pattern is to store sensible default configurations in version control, and then use environment variables to swap between configuration types.
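A rough sketch of that pattern is shown below. The `DEFAULTS` table and `select_defaults` are hypothetical, and falling back to the development defaults when FLASK_ENV is unset is an assumption made only for this example.

```python
import os

# Hypothetical defaults that could live in version control; no secrets here.
DEFAULTS = {
    "development": {"DEBUG": True, "DATABASE_URL": "sqlite:///./homeuniteus.db"},
    "production": {"DEBUG": False, "DATABASE_URL": ""},
}


def select_defaults(environ=None):
    # Assumption for this sketch: fall back to development when unset.
    environ = os.environ if environ is None else environ
    name = environ.get("FLASK_ENV", "development").lower()
    if name not in DEFAULTS:
        raise ValueError(f"Unknown configuration '{name}'. "
                         f"Choose one of: {', '.join(sorted(DEFAULTS))}")
    config = dict(DEFAULTS[name])
    # Environment values take precedence over the checked-in defaults.
    if "DATABASE_URL" in environ:
        config["DATABASE_URL"] = environ["DATABASE_URL"]
    return config
```

Passing a mapping as `environ` keeps the function easy to exercise without touching the real process environment.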

How can we fix our configuration system to follow Flask guidelines?

One option to remove import-time configuration and safely store secrets as environment variables is to encapsulate our configuration logic within Config classes, and create the appropriate config class on-demand within an application factory method. The base Config class will handle overriding configuration values with environment variable values, if present. Secrets will be specified as env vars and configuration variables will have the option of being hard-coded for clarity and convenience.

With this approach, our app factory method will need a way of choosing the correct configuration. This should be configurable from the terminal, so the common solution recommended by flask is to use an environment variable to switch between the configuration types.

Our extended configuration class hierarchy will look like this. By structuring the configuration as a class hierarchy we can reuse the read_from_env() logic across all classes, and automatically validate the configuration values using class-specific logic defined within the protected validate method. Our validation will ensure that all required env variables are specified with reasonable values.

 classDiagram
    Config <|-- DevelopmentConfig
    Config <|-- ProductionConfig
    ProductionConfig <|-- StagingConfig
    Config : -read_from_env()
    Config : #validate()
    class DevelopmentConfig{
    }
    class ProductionConfig{
    }
    class StagingConfig {
    }

How should we specify our testing configuration?

Our testing configurations require more flexibility than 'real' configurations. A real configuration should be specified once at startup, should be immutable, and should read values from the system environment. A test configuration should not read from the system environment, may benefit from mutability, and may be reused across multiple tests in a test fixture.

Since the requirements of our test configuration are different, we should not include test configurations as a Config class descendant. We can, instead, package these configurations as dictionaries or standalone classes. We just need to ensure our factory method can handle both configuration types.
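One way the factory could accept both shapes is to normalize whatever it receives into a plain dictionary before updating the Flask config. The sketch below assumes a hypothetical helper named `config_to_dict`; `ExampleConfig` is a stand-in for a real configuration class.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ExampleConfig:
    # Hypothetical stand-in for a real configuration class.
    TESTING: bool = False
    DATABASE_URL: str = "sqlite:///./homeuniteus.db"


def config_to_dict(config: Any) -> dict:
    # Accept either a dataclass instance or a plain mapping and return a
    # dict that could be passed to something like flask_app.config.update().
    if is_dataclass(config) and not isinstance(config, type):
        return asdict(config)
    if isinstance(config, Mapping):
        return dict(config)
    raise TypeError(f"Unsupported configuration type: {type(config).__name__}")
```

Rejecting anything else with a TypeError keeps a misconfigured test from silently running against an empty configuration.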

List the current configuration options we use

| Variable | Description |
| --- | --- |
| COGNITO_CLIENT_ID | Used for connecting to AWS Cognito |
| COGNITO_CLIENT_SECRET | Used for connecting to AWS Cognito |
| COGNITO_REGION | Used for connecting to AWS Cognito |
| COGNITO_REDIRECT_URI | Used for connecting to AWS Cognito |
| COGNITO_USER_POOL_ID | Used for connecting to AWS Cognito |
| COGNITO_ACCESS_ID | Used for connecting to AWS Cognito |
| COGNITO_ACCESS_KEY | Used for connecting to AWS Cognito |
| SECRET_KEY | Flask specific configuration "A secret key that will be used for securely signing the session cookie and can be used for any other security related needs by extensions or your application. It should be a long random bytes or str." |
| ROOT_URL | The front-end app's base URL used for redirecting clients after authentication. |
| DATABASE_URL | The database's URL |
| FLASK_DEBUG | Boolean indicating whether the flask app debug mode is set. Debug enables server reloads on code changes and an interactive debugger for unhandled exceptions. |
| FLASK_ENV | Flask |
| TESTING | Flask recommends setting this boolean to true for test cases. Exceptions are allowed to propagate instead of being handled by the app exception handler. This allows more errors to be found while testing |
| PORT | The port number to run the development server |
| HOST | Hostname to run development server on. |
| USE_RELOADER | Boolean specifying whether to trigger server reloads when source code changes. |

How do we run the deployed test dev API app?

The current deployment of the test dev API is outlined as follows:

/opt/dev.homeunite.us/dev-huu-env/bin/dotenv run \
/opt/dev.homeunite.us/dev-huu-env/bin/gunicorn --workers 1 --bind unix:dev.homeunite.us.sock -m 007 \
"openapi_server.__main__:create_app()"

Notice that the dotenv command is used to run gunicorn (the WSGI server). And, gunicorn runs the API. The dotenv command is used to turn the key-values inside of the .env file into process environment variables. This is done to make the DATABASE_URL variable that is configured inside of the .env file available to the API as an environment variable.

Can pytest mock environment variables?

Our configuration system will load environment variables at startup. Our debug testing configuration, however, should not depend on the system environment. Instead of reading from the real system environment, we should read from a mocked environment.

pytest includes a guide on monkeypatching environment variables. We can use the pytest monkeypatch functionality to temporarily specify the environment variables.
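The temporary-override idea can be sketched with the standard library's `unittest.mock.patch.dict`, swapped in here only so the snippet runs without pytest installed. Inside a pytest test, `monkeypatch.setenv` and `monkeypatch.delenv` play the same role. `read_database_url` is a hypothetical helper standing in for code that reads the environment.

```python
import os
from unittest import mock


def read_database_url():
    # Hypothetical helper standing in for code that reads the environment.
    return os.environ.get("DATABASE_URL", "sqlite:///./homeuniteus.db")


def database_url_under_test():
    # clear=True hides the real process environment inside the block;
    # the original values are restored when the block exits.
    fake_env = {"DATABASE_URL": "sqlite:///:memory:"}
    with mock.patch.dict(os.environ, fake_env, clear=True):
        return read_database_url()
```

Because the patch is scoped to the `with` block, values set for one test cannot leak into the next.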

General Solution Plan

Complete the implementation of the configuration system. The updated system will follow the design pattern recommended by flask, and use the FLASK_ENV environment variable to select the application configuration.

Introduce all of the required configurations, with proper validation of the configuration options:

The updated configuration system will no longer utilize import-time configuration. We will also update the botoClient initialization code to lazy load the client, to enable test cases to pass when secrets are not available. Our new configurations will support mocking, but our codebase does not currently provide mocking implementations. As a result, we will raise NotImplementedError when mocked services are accessed. For now, this means that attempting to sign up or sign in a user with DevelopmentHUUConfig or DebugTestConfig will throw a temporary runtime exception. A follow-up issue will be created to add mocking.

Implementation Questions

Show the different non-testing configurations

Every one of these configuration properties can be overridden within the .env file, or by specifying an environment variable. All properties must be specified.

These configuration classes document the required properties, provide sensible defaults, and use validation to enforce our configuration requirements.

For example, if you try to launch the production version of the application but forget to include the COGNITO_CLIENT_ID, then a runtime error would be thrown.

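import os
import re
from dataclasses import dataclass, fields, asdict
from enum import Enum, auto
from typing import NewType, Type

# NewType gives SecretStr its own identity at runtime. With a plain alias,
# `field.type is SecretStr` would also match every ordinary `str` field.
SecretStr = NewType('SecretStr', str)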

@dataclass(frozen=True)
class HUUConfig:
    '''
    Define the configuration properties required by 
    all HUU application environments. 

    Each value can optionally be assigned as an 
    environment variable to allow configuration 
    from the terminal.
    '''
    FLASK_ENV: str
    FLASK_DEBUG: bool
    TESTING: bool
    SECRET_KEY: str
    ROOT_URL: str
    DATABASE_URL: str

    # Define convenience aliases for ENV and DEBUG.
    # These two configuration options are treated
    # specially by Flask. They must be loaded as 
    # environment variables before constructing the
    # Flask application instance in order to work properly

    @property 
    def ENV(self):
        return self.FLASK_ENV

    @property 
    def DEBUG(self):
        return self.FLASK_DEBUG

    def __post_init__(self):
        '''
        Each time a configuration object is initialized, __post_init__
        will read the configuration options from the environment, 
        override the field values with the available environment values, 
        and validate the options using the pre_validate() and post_validate(). 
        '''
        self.pre_validate()

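        for field in fields(self):
            env_value = os.environ.get(field.name)
            if env_value is not None:
                expected_type = type(getattr(self, field.name))
                if expected_type is bool:
                    # bool("False") is True, so parse boolean strings explicitly
                    cast_value = env_value.strip().lower() in ("1", "true", "yes", "on")
                else:
                    cast_value = expected_type(env_value)
                # The dataclass is frozen, so bypass its __setattr__ guard
                object.__setattr__(self, field.name, cast_value)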

        self.post_validate()

    def pre_validate(self):
        '''
        Validate the configuration options before they are loaded
        from the process environment variables.

        All fields marked with a SecretStr type must be loaded from
        the environment. Attempts to hard code these values will result
        in a ValueError here.
        '''
        for field in fields(self):
            if (field.type is SecretStr):
                value = getattr(self, field.name)
                if (value != ''):
                    raise ValueError("Secret fields cannot have hard-coded values. "
                                    "These must be loaded directly from an "
                                    "environment variable.")
                if (os.environ.get(field.name) is None):
                    raise ValueError(f"Configuration option {field.name} must "
                                 "be specified as an environment variable.")

    def post_validate(self):
        '''
        Validate the final configuration options, after overwriting
        the options using the process environment variables.
        '''
        pass 

class HUUAppEnv(Enum):
    DEVELOPMENT = auto()
    STAGING = auto()
    PRODUCTION = auto()

    @classmethod
    def available_environments(cls) -> str:
        return ",".join((env.name for env in cls))

    @classmethod
    def from_string(cls, raw: str) -> 'HUUAppEnv':
        try:
            return cls[raw.upper()]
        except KeyError:
            raise EnvironmentError(f"{raw} is not a valid environment. "
                                   "Select one of the available options: "
                                   f"{cls.available_environments()}")

    @classmethod
    def load_config(cls, env: 'HUUAppEnv') -> HUUConfig:
        '''
        Return the configuration for the current environment. 
        Populated
        '''
        match env:
            case HUUAppEnv.DEVELOPMENT:
                return DevelopmentHUUConfig()
            case HUUAppEnv.STAGING:
                return StagingHUUConfig()
            case HUUAppEnv.PRODUCTION:
                return ProductionHUUConfig()
            case _:
                raise EnvironmentError(f"{env} does not have a registered "
                                       "configuration type. Please update the "
                                       "load_config method to register this new "
                                       "environment type.")

@dataclass(frozen=True)
class DevelopmentHUUConfig(HUUConfig):
    FLASK_ENV: str = 'development'
    FLASK_DEBUG: bool = True
    # We currently default to a publicly available 
    # server, but this has some security risks 
    # especially since we are not using https
    PORT: int = 8080
    HOST: str = "0.0.0.0"
    TESTING: bool = False
    USE_RELOADER: bool = True
    SECRET_KEY: str = "unsecurekey"
    ROOT_URL: str = "http://localhost:4040"
    DATABASE_URL: str = "sqlite:///./homeuniteus.db"

    def post_validate(self):
        super().post_validate()
        if (self.PORT < 0 or self.PORT > 65535):
            raise ValueError("Port must be in the range 0-65535.")

@dataclass(frozen=True)
class ProductionHUUConfig(HUUConfig):
    FLASK_ENV: str = 'production'
    FLASK_DEBUG: bool = False
    TESTING: bool = False 
    SECRET_KEY: SecretStr = ''
    ROOT_URL: SecretStr = ''
    DATABASE_URL: SecretStr = ''
    COGNITO_CLIENT_ID: SecretStr = ''
    COGNITO_CLIENT_SECRET: SecretStr = ''
    COGNITO_REGION: SecretStr = ''
    COGNITO_REDIRECT_URI: SecretStr = ''
    COGNITO_USER_POOL_ID: SecretStr = ''
    COGNITO_ACCESS_ID: SecretStr = ''
    COGNITO_ACCESS_KEY: SecretStr = ''

    def post_validate(self):
        super().post_validate()
        self.validate_secret_key(self.SECRET_KEY)
        if (self.FLASK_DEBUG):
            raise ValueError("Debug mode is not supported by the current configuration")

    def validate_secret_key(self, key):
        """
        Validates a secret key by checking multiple conditions:
        1) Minimum 16 characters
        2) At least one uppercase letter
        3) At least one lowercase letter
        4) At least one digit
        """
        errors = []
        if len(key) < 16:
            errors.append("The key must be at least 16 characters long.")

        if not re.search("[a-z]", key):
            errors.append("The key must contain at least one lowercase letter.")

        if not re.search("[A-Z]", key):
            errors.append("The key must contain at least one uppercase letter.")

        if not re.search("[0-9]", key):
            errors.append("The key must contain at least one digit.")

        if len(errors) > 0:
            # Do not echo the key itself; error messages often end up in logs.
            raise ValueError("Production secret key is not strong enough. "
                             f"{' '.join(errors)}")

@dataclass(frozen=True)
class StagingHUUConfig(ProductionHUUConfig):
    pass

Show how to register pytest configurations

We can use the pytest configuration system to select our testing environment using command-line arguments. This implementation introduces a "mode" pytest CLI arg that will be used to select either debug or release testing configurations.

## conftest.py
import pytest
from dataclasses import dataclass, asdict, fields
from openapi_server.configs.configs import StagingHUUConfig, DevelopmentHUUConfig

@dataclass(frozen=True)
class ReleaseTestConfig(StagingHUUConfig):
    FLASK_ENV: str = 'release-test'
    TESTING: bool = True
    FLASK_DEBUG: bool = False
    # The annotation is required; without it the dataclass ignores the override
    DATABASE_URL: str = 'sqlite:///:memory:'

@dataclass(frozen=True)
class DebugTestConfig(DevelopmentHUUConfig):
    FLASK_ENV: str = 'debug-test'
    TESTING: bool = True
    FLASK_DEBUG: bool = True
    DATABASE_URL: str = 'sqlite:///:memory:'

    def __post_init__(self):
        with pytest.MonkeyPatch.context() as m:
            # The base config class reads the values from the
            # environment. Prevent this behavior for test cases
            # by monkeypatching the environment, and setting the
            # monkeypatched variables to the current value
            for field in fields(self):
                # Environment values must be strings
                m.setenv(field.name, str(getattr(self, field.name)))
            super().__post_init__()

def pytest_addoption(parser):
    parser.addoption(
        "--mode",
        action="store",
        default="debug",
        help="run tests in debug or release mode",
    )

def pytest_configure(config: pytest.Config):
    mode = config.getoption("mode", default='debug')
    if mode == 'debug':
        # All application configurations are defined explicitly in code. The 
        # system environment is not used. All resources that can be safely 
        # mocked, will be mocked (e.g. mock AWS cognito API calls)
        app_config = asdict(DebugTestConfig())
    elif mode == 'release':
        # Load configuration from the environment, to allow the use of 
        # secrets, and disable the mocking of any resources 
        from dotenv import load_dotenv, find_dotenv, get_key
        dot_env = find_dotenv()
        if dot_env:
            load_dotenv(dot_env)
        app_config = asdict(ReleaseTestConfig())
    else:
        raise KeyError(f"pytest application configuration mode {mode} not "
                   "recognized. Only debug and release modes supported.")

    config.app_config = app_config

Show how to create the test application

Our pytest application configuration is stored on the pytest Config singleton. This configuration object is accessible from pytest test fixtures, but our base application test class uses the unittest library.

To make pytest fixtures accessible from a unittest class you can use the pytest.mark.usefixtures decorator. To pass the pytest configuration to the unittest class we define a pass_app_config fixture that attaches the application configuration object to the unittest class.

This is an unfortunate amount of indirection, but it is required since we are mixing unittest and pytest. We could avoid it if we refactored BaseTestCase to use pure pytest, but the flask_testing.TestCase base class does a lot of complex test setup that is not worth reimplementing.

#test/conftest.py
@pytest.fixture(scope='class')
def pass_app_config(request):
    setattr(request.cls, 'app_config', request.config.app_config)
# test/__init__.py
@pytest.mark.usefixtures("pass_app_config")
class BaseTestCase(TestCase):

    def create_app(self):
        '''
        Create an instance of our Flask App, configured for testing purposes.

        The base class will never start the Flask App. It instead creates a
        mock self.client class that is used to simulate requests to the WSGI server.

        https://flask.palletsprojects.com/en/2.2.x/testing/
        https://werkzeug.palletsprojects.com/en/2.3.x/test/
        '''
        self.provider_repo = HousingProviderRepository()

        logging.getLogger('connexion.operation').setLevel('ERROR')
        return create_app(self.app_config).app

Show how to register non-test environment-specific configurations

I'm choosing to opt for an explicit registration system, using the HUUAppEnv enum below. Each value in the enum maps to a configuration class, within the load_config method. All of the values within the configuration object can be set using environment variables.

class HUUAppEnv(Enum):
    DEVELOPMENT = auto()
    STAGING = auto()
    PRODUCTION = auto()

    @classmethod
    def available_environments(cls) -> str:
        return ",".join((env.name for env in cls))

    @classmethod
    def from_string(cls, raw: str) -> 'HUUAppEnv':
        try:
            return cls[raw.upper()]
        except KeyError:
            raise EnvironmentError(f"{raw} is not a valid environment. "
                                   "Select one of the available options: "
                                   f"{cls.available_environments()}")

    @classmethod
    def load_config(cls, env: 'HUUAppEnv') -> HUUConfig:
        '''
        Return the configuration for the current environment. 
        Populated
        '''
        match env:
            case HUUAppEnv.DEVELOPMENT:
                return DevelopmentHUUConfig()
            case HUUAppEnv.STAGING:
                return StagingHUUConfig()
            case HUUAppEnv.PRODUCTION:
                return ProductionHUUConfig()
            case _:
                raise EnvironmentError(f"{env} does not have a registered "
                                       "configuration type. Please update the "
                                       "load_config method to register this new "
                                       "environment type.")

Show how to load the runtime environment configuration

Use the FLASK_ENV environment variable to select the correct configuration. The returned configuration object can be converted to a dictionary to allow for easy update of the Flask config.

I might move this into the create_app method.

def load_config_from_env() -> HUUConfig:
    env_file = find_dotenv()
    if env_file:
        # Optionally load environment variables from a .env file, and store
        # them as process environment variables (accessible from os.getenv)
        load_dotenv(env_file)

    app_environment = os.getenv("FLASK_ENV")
    if not app_environment:
        raise EnvironmentError("The FLASK_ENV variable or a test configuration must be provided. This variable "
                               "is used to select the application configuration "
                               "at runtime. Available options are "
                               f"{HUUAppEnv.available_environments()}")
    env_type = HUUAppEnv.from_string(app_environment)
    return HUUAppEnv.load_config(env_type)
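The selection logic can be tried out in isolation with a rough sketch like the one below. Everything here is an assumption for illustration: `BaseConfig`, `DevConfig`, `ProdConfig`, `REGISTRY`, `load_config` and `load_with_env` are hypothetical names, a plain dict replaces the enum, and `unittest.mock.patch.dict` is used only to restore `os.environ` after each call.

```python
import os
from unittest import mock


class BaseConfig:
    # Hypothetical stand-in for HUUConfig.
    DEBUG = False


class DevConfig(BaseConfig):
    DEBUG = True


class ProdConfig(BaseConfig):
    pass


# Explicit registry: environment name -> configuration class.
REGISTRY = {"DEVELOPMENT": DevConfig, "PRODUCTION": ProdConfig}


def load_config() -> BaseConfig:
    name = os.getenv("FLASK_ENV")
    if not name:
        raise EnvironmentError("FLASK_ENV must be set")
    try:
        return REGISTRY[name.upper()]()
    except KeyError:
        raise EnvironmentError(f"{name} is not a valid environment") from None


def load_with_env(value):
    """Call load_config with FLASK_ENV temporarily set (or unset if None)."""
    # patch.dict restores the original os.environ when the block exits.
    with mock.patch.dict(os.environ):
        os.environ.pop("FLASK_ENV", None)
        if value is not None:
            os.environ["FLASK_ENV"] = value
        return load_config()
```

Calling `load_with_env("development")` returns a `DevConfig` instance, and `load_with_env(None)` raises `EnvironmentError`, which mirrors the behavior described for a missing FLASK_ENV.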

Show how to lazy load a boto3 client

We can achieve this by adding an app class that uses the application configuration to instantiate the correct AWS identity provider client.

# huuapy.py 

class HUUApp(FlaskApp):
    def __init__(self, app_package_name: str, api_spec_rel_path: Path, *args, **kwargs):
        super().__init__(app_package_name, *args, **kwargs)
        self.app: Flask
        self._boto_client = None

        api_spec_path = self.get_root_path() / api_spec_rel_path
        parsed_specs = self._get_bundled_specs(api_spec_path)
        parsed_specs['info']['version'] = self.get_version()

        self.add_api(parsed_specs, pythonic_params=True)

    @property 
    def connexion_app(self) -> 'FlaskApp':
        return self

    @property 
    def flask_app(self) -> Flask:
        return self.app

    @property
    def config(self) -> Config:
        return self.flask_app.config

    @property 
    def environment(self) -> str:
        return self.app.config["ENV"]

    @property 
    def is_debug_app(self) -> bool:
        return self.app.config["DEBUG"]

    @property
    def boto_client(self):
        if self._boto_client:
            return self._boto_client

        if self.environment == HUUAppEnv.TEST_DEBUG:
            raise NotImplementedError("Debug mode test cases should not use a "
                                      "real boto_client, but client mocking has "
                                      "not been implemented yet. This feature is "
                                      "planned in Issue #577")
        import boto3
        self._boto_client = boto3.client('cognito-idp', 
                                         region_name=self.app.config["COGNITO_REGION"], 
                                         aws_access_key_id=self.app.config["COGNITO_ACCESS_ID"],
                                         aws_secret_access_key=self.app.config["COGNITO_ACCESS_KEY"]
                                         )
        return self._boto_client

    @staticmethod
    def _get_bundled_specs(spec_file: Path) -> Dict[str, Any]:
        '''
        Prance is able to resolve references to local *.yaml files.

        Use prance to parse the api specification document. Connexion's 
        default parser is not able to handle local file references, but
        our api specification is split across multiple files for readability.

        Args:
            spec_file (Path): Path to an API specification .yaml file

        Returns:
            Dict[str, Any]: Parsed specification file, stored in a dict
        '''
        parser = prance.ResolvingParser(str(spec_file.absolute()), lazy=True, strict=True)
        parser.parse()

        return parser.specification

    @staticmethod
    def get_version():
        try:
            return version("homeuniteus-api")
        except PackageNotFoundError:
            # package is not installed
            return "0.1.0.dev0"
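The lazy-loading idea behind the boto_client property can be shown without AWS at all. This is only a sketch under stated assumptions: `LazyClientHolder` and `fake_factory` are hypothetical, the factory plays the role of `boto3.client`, and the region value is a placeholder rather than the project's real setting.

```python
from typing import Any, Callable, Dict, Optional


class LazyClientHolder:
    """Hypothetical stand-in for the app class; builds its client on first use."""

    def __init__(self, config: Dict[str, str], factory: Callable[..., Any]):
        self.config = config
        self._factory = factory  # plays the role of boto3.client
        self._client: Optional[Any] = None

    @property
    def client(self) -> Any:
        if self._client is None:
            # Nothing is constructed, and no settings are read, until the
            # property is accessed for the first time.
            self._client = self._factory(
                "cognito-idp", region_name=self.config["COGNITO_REGION"]
            )
        return self._client


calls = []


def fake_factory(service: str, **kwargs: str) -> dict:
    # Records each construction so the laziness can be observed.
    calls.append(service)
    return {"service": service, **kwargs}


# Placeholder region value, used for illustration only.
holder = LazyClientHolder({"COGNITO_REGION": "us-west-2"}, fake_factory)
```

Creating `holder` does not call the factory. The first access to `holder.client` constructs the client, and later accesses return the same cached object, which is why importing a module that owns such an object no longer requires valid credentials.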

Show an app factory method that allows us to select configuration from the terminal

def create_app(test_config: Optional[HUUConfig] = None):
    '''
    Creates a configured application that is ready to be run.

    The application is configured from external environment variables stored
    in either a local `.env` in a development environment or from environment
    variables that are configured prior to running the application.

    This function is made available at the module level so that it can
    be loaded by a WSGI server in a production-like environment with
    `openapi_server.__main__:create_app()`.
    '''
    if test_config:
        config = test_config
    else:
        config = load_config_from_env()

    app = HUUApp(__name__, './openapi/openapi.yaml')
    app.config.from_object(config)

    DataAccessLayer.db_init(app.config["DATABASE_URL"])        
    app.add_error_handler(AuthError, handle_auth_error)

    return app

Implementation Plan

  1. Implement each of the new configuration classes
  2. Implement HUUAppEnv
  3. Replace compile_config with load_config_from_env.
  4. Implement new HUU app class
  5. Replace create_app with our simpler implementation
  6. Remove all calls to env.get scattered throughout the codebase. All code should now read from the application configuration settings.
  7. Update all users of the boto client to use the application property. This will ensure the boto client is lazy loaded
  8. Introduce the 'release' and 'debug' pytest CLI args
  9. Use the new HUU app to simplify the test app setup
  10. Refactor DataAccessLayer to require a DATABASE_URL. Now that the configuration options are providing this, it should be a required parameter.
  11. Write test cases to verify:
    • Secrets cannot be hard-coded
    • The config objects are actually reading from the environment. Use env monkeypatching to achieve this.
    • The config object validation works as expected
    • Attempting to load a production config object with a missing secret will throw an error
    • The env keyword successfully loads different config objects
  12. Enable the backend test cases on our GitHub workflow scripts & verify that the test cases succeed on the test runner
  13. Update READMEs to include instructions on how to use the configuration system.
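
A couple of the verification items in step 11 could be sketched roughly as follows. This is an assumption-heavy illustration, not the planned test suite: `ProdLikeConfig`, its fields and `build_with` are hypothetical, and `unittest.mock.patch.dict` stands in for pytest's monkeypatching so the snippet only needs the standard library.

```python
import os
from dataclasses import dataclass, field
from unittest import mock


def _env(name: str) -> str:
    return os.environ.get(name, "")


@dataclass
class ProdLikeConfig:
    # Hypothetical config: every field is read from the environment when
    # the object is created, never hard-coded.
    SECRET_KEY: str = field(default_factory=lambda: _env("SECRET_KEY"))
    DATABASE_URL: str = field(default_factory=lambda: _env("DATABASE_URL"))

    def __post_init__(self) -> None:
        # Validation step: refuse to build a config with empty settings.
        missing = [key for key, value in vars(self).items() if not value]
        if missing:
            raise ValueError(f"Missing required settings: {', '.join(missing)}")


def build_with(env: dict) -> ProdLikeConfig:
    """Build the config with os.environ temporarily replaced by `env`."""
    with mock.patch.dict(os.environ, env, clear=True):
        return ProdLikeConfig()
```

A test can then call `build_with` with a complete environment and check the resulting values, and call it again with a secret left out to confirm that a `ValueError` naming the missing setting is raised.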
Joshua-Douglas commented 11 months ago

Design document is complete and I'll start implementation tomorrow.

Plan is to have five separate configurations to cover all use cases I can think of. This configuration system is compatible with Flask's configuration system, and allows us to easily validate each option.