lpsmith / configurator-ng

A Haskell library supporting flexible, dynamic file-based configuration.

configurator-ng

What is this?

This is a massively breaking revision of the application interface of configurator. The configuration file syntax is backward compatible, and mostly forward compatible as well. This fork is not (yet?) intended for widespread public consumption. Rather, this repo is being used as a stopgap measure in some of my own projects as well as a playground and laboratory for a new configurator-like package that may be released sometime in the future.

Manifesto

(Note that this section is mildly aspirational at a few points, and/or contains errors.)

The application interface of configurator has numerous problems:

The aim of configurator-ng is to address these issues, with the initial efforts focused on the first three. I hope to make more correct solutions easier, and less correct solutions harder, all wrapped up in a more expressive interface.

Race conditions

The interface of configurator basically is:

data Config = Config (IORef (HashMap Text Value))

lookup :: Configured a => Config -> Text -> IO (Maybe a)

The IORef is there to support configuration file reloading, which is often done automatically. So this results in the race condition:

do
  key0 <- lookup config "key0"
  reload config  {- in another thread -}
  key1 <- lookup config "key1"
  return (key0, key1)

Thus, we have taken key0 and key1 from two versions of the configuration files, with an overall result that is not necessarily consistent with either version.

There is a way to solve this race condition*, though it is by no means convenient and it provides even less support for turning the result into configuration parameters:

getMap :: Config -> IO (HashMap Text Value)

This obtains a consistent* snapshot of the configuration, from which you can pull out multiple values. But in addition to being less obvious and inconvenient, the fact that the HashMap returned is not an abstract type means that changing the representation breaks client code that uses this approach.
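The snapshot idea can be sketched with toy types. Everything below (ToyConfig, Snapshot, toyGetMap, lookupPair) is a hypothetical stand-in built on Data.Map and String, not the actual configurator API; it only illustrates that reading the IORef once and doing every lookup against that one value avoids the interleaving shown earlier.

```haskell
import Data.IORef (IORef, newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Toy stand-ins for illustration only; these are not the configurator types.
type Snapshot = Map.Map String String
newtype ToyConfig = ToyConfig (IORef Snapshot)

-- Analogue of getMap: a single read of the IORef yields one snapshot.
toyGetMap :: ToyConfig -> IO Snapshot
toyGetMap (ToyConfig ref) = readIORef ref

-- Both keys are looked up in the same snapshot, so a reload in another
-- thread cannot land between the two lookups.
lookupPair :: ToyConfig -> String -> String -> IO (Maybe (String, String))
lookupPair cfg k0 k1 = do
    snapshot <- toyGetMap cfg
    pure ((,) <$> Map.lookup k0 snapshot <*> Map.lookup k1 snapshot)

-- Small demonstration with two string-valued keys.
demoSnapshot :: IO (Maybe (String, String))
demoSnapshot = do
    ref <- newIORef (Map.fromList [("key0", "a"), ("key1", "b")])
    lookupPair (ToyConfig ref) "key0" "key1"
```

The lookups after the single readIORef are pure, which is what makes the pair consistent with exactly one version of the configuration.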

configurator-ng makes the latter mode of use much more convenient by introducing ConfigParsers, an applicative/monadic high-level parsing interface to read configuration info from a single snapshot. See the module Data.Configurator.Parser. The basic ideas behind the revised interface are as follows:

data ConfigCache = ConfigCache (IORef Config)

readConfig :: ConfigCache -> IO Config

runParser :: ConfigParser m => m a -> Config -> (Maybe a, [ConfigError])

(Here, ConfigError could be an error condition, or it might be more analogous to a warning or informational message; thus a parser can return a result and some ConfigErrors.)

Finally, we could define a ConfigParser to read from key0 and key1 by writing:

getKeys :: ConfigParser m => m (Text, Int)
getKeys = (,) <$> key "key0" <*> key "key1"

(*It's important to point out that getMap only avoids introducing additional race conditions; commonly used filesystems are racy software artifacts, so this is only consistent relative to filesystem reads. For a complete solution, one would have to take care in the precise filesystem calls used to manipulate the configuration file(s). Most popular text editors should be OK as far as the consistency of a single file is concerned; consistent reads of multiple files are trickier.)
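To make the shape of runParser concrete, here is a minimal sketch of an applicative parser that runs against a single snapshot and accumulates messages instead of stopping at the first failure. ToyParser, keyWith, and toyGetKeys are hypothetical names, and the snapshot is a plain Map of Strings; the real library works with Text and Value and may differ in every detail.

```haskell
import qualified Data.Map.Strict as Map
import Text.Read (readMaybe)

type Snapshot    = Map.Map String String
type ConfigError = String

-- A parser maps a snapshot to an optional result plus any messages.
newtype ToyParser a = ToyParser
    { runToyParser :: Snapshot -> (Maybe a, [ConfigError]) }

instance Functor ToyParser where
    fmap f (ToyParser p) = ToyParser $ \s ->
        let (ma, errs) = p s in (fmap f ma, errs)

instance Applicative ToyParser where
    pure x = ToyParser $ \_ -> (Just x, [])
    -- Both sides always run, so errors from both are collected.
    ToyParser pf <*> ToyParser pa = ToyParser $ \s ->
        let (mf, e1) = pf s
            (ma, e2) = pa s
        in  (mf <*> ma, e1 ++ e2)

-- Look up a key and convert the raw string with a caller-supplied reader.
keyWith :: (String -> Maybe a) -> String -> ToyParser a
keyWith convert name = ToyParser $ \s ->
    case Map.lookup name s of
      Nothing  -> (Nothing, ["missing key: " ++ name])
      Just raw -> case convert raw of
                    Nothing -> (Nothing, ["bad value for key: " ++ name])
                    Just v  -> (Just v, [])

-- Toy counterpart of getKeys above.
toyGetKeys :: ToyParser (String, Int)
toyGetKeys = (,) <$> keyWith Just "key0" <*> keyWith readMaybe "key1"
```

Running toyGetKeys against an empty snapshot reports both missing keys at once, which is the behaviour an applicative interface makes possible.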

Configuration validation

Another advantage of the ConfigParser interface is that it makes it easier and more convenient to validate a (sub-)configuration as a whole, and thus also to make more intelligent decisions about what to do in cases of misconfiguration. For example, one might want to continue running on the last known good configuration, and raise a big red flag in a monitoring solution. The goal is to provide mechanism, not policy.

Greater Expressive Power

Consider the following use case: you have an event processor that watches several named sources for events. You might like your configuration file to look something like this:

event-sources {
    amazon-cloud {
        postgres {
            host    = "cloudevents.mydomain.com"
            port    = 5433
            dbname  = "eventdb"
            sslmode = "verify-full"
            sslcert = "${HOME}/credentials/pgclient.crt"
            sslkey  = "${HOME}/credentials/pgclient.key"
        }
        heartbeat-interval = 15
        heartbeat-timeout  = 15
    }
    chicago-service-center {
        postgres {
            host    = "pgevents.customerdomain.com"
            port    = 5433
            dbname  = "eventdb"
            sslmode = "verify-full"
            sslcert = "${HOME}/credentials/pgclient.crt"
            sslkey  = "${HOME}/credentials/pgclient.key"
        }
        heartbeat-interval = 15
        heartbeat-timeout  = 15
    }
}

Now, amazon-cloud and chicago-service-center are names of the sources, useful for whatever purposes (logging, API endpoints, etc.), that the event processor doesn't know about in advance. Since configurator is tied to HashMap, the data structure offers no support for efficiently discovering these names. To fix this, configurator-ng moved to critbit, which allows us to efficiently iterate over these keys (in alphabetical order). So configurator-ng offers the following operator:

subgroups :: ConfigParser m => Text -> m [Text]

subgroups returns the non-empty value groupings of its argument, so for example when evaluated in the context of the configuration above:

subgroups ""              ==> [ "event-sources" ]

subgroups "event-sources" ==> [ "event-sources.amazon-cloud"
                              , "event-sources.chicago-service-center" ]
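The behaviour of subgroups can be approximated over a flat map of dotted keys. This is only a sketch: toySubgroups and splitDots are hypothetical helpers, the map is an ordinary Data.Map rather than a crit-bit tree, and the function scans every key linearly instead of using an efficient prefix query.

```haskell
import Data.List (isPrefixOf, nub)
import qualified Data.Map.Strict as Map

type Snapshot = Map.Map String String

-- Split a dotted key into components: "a.b.c" becomes ["a","b","c"].
splitDots :: String -> [String]
splitDots s = case break (== '.') s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitDots rest

-- Names of the non-empty groups directly under the given group.
-- A child counts as a group only if at least one key lies beneath it.
toySubgroups :: String -> Snapshot -> [String]
toySubgroups group snapshot =
    nub [ prefix ++ child
        | k <- Map.keys snapshot          -- ascending key order
        , prefix `isPrefixOf` k
        , child : _ : _ <- [splitDots (drop (length prefix) k)]
        ]
  where
    prefix = if null group then "" else group ++ "."

-- A flattened fragment of the configuration above.
exampleSnapshot :: Snapshot
exampleSnapshot = Map.fromList
    [ ("event-sources.amazon-cloud.postgres.host", "cloudevents.mydomain.com")
    , ("event-sources.amazon-cloud.heartbeat-interval", "15")
    , ("event-sources.chicago-service-center.postgres.host", "pgevents.customerdomain.com")
    , ("event-sources.chicago-service-center.heartbeat-interval", "15")
    ]
```

Because Map.keys returns keys in ascending order, the result comes out alphabetically, mirroring the ordering property mentioned for critbit.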

Another issue is that there's a lot of redundancy here, so maybe we'd like to refactor the configuration file into something like this:

event-sources {
    amazon-cloud {
        postgres.host = "cloudevents.mydomain.com"
    }
    chicago-service-center {
        postgres.host = "pgevents.customerdomain.com"
    }
    default {
        postgres {
            port    = 5433
            dbname  = "eventdb"
            sslmode = "verify-full"
            sslcert = "${HOME}/credentials/pgclient.crt"
            sslkey  = "${HOME}/credentials/pgclient.key"
        }
        heartbeat-interval = 15
        heartbeat-timeout  = 15
    }
}

So now the problem is that we want to turn this configuration into a list of EventSources:

data EventSource = EventSource {
    name              :: !Text,
    libpqConnParams   :: [(Text,Value)],
    heartbeatInterval :: !Micro,
    heartbeatTimeout  :: !Micro
  }

Now, even ignoring the issue of the names mentioned above, handling this sort of customizable defaulting in configurator would be rather painful. But it's actually quite easy with configurator-ng:

{-# LANGUAGE ApplicativeDo, RecordWildCards #-}

mapA :: Applicative f => (a -> f b) -> [a] -> f [b]
mapA f = foldr (liftA2 (:)) (pure []) . map f

eventSources :: ConfigParserA [EventSource]
eventSources = do
    localConfig (subconfig "event-sources") $ do
        mapA eventSource . filter (/= "default") <$> subgroups ""

eventSource :: Text -> ConfigParserA EventSource
eventSource name = do
    localConfig (union (subconfig name     )
                       (subconfig "default")) $ do
        libpqConnParams   <- localConfig (subconfig "postgres") (subassocs "")
        heartbeatInterval <- key "heartbeat-interval"
        heartbeatTimeout  <- key "heartbeat-timeout"
        pure $! EventSource{..}

This example uses the ConfigParserA variant of ConfigParser, so that the parser continues to run after encountering an error in order to generate more error messages. It also uses the localConfig operator to run a subparser in a different configuration context. There are a few operators for modifying the configuration context:

localConfig :: ConfigParser m => ConfigTransform -> m a -> m a

data ConfigTransform  -- Conceptually, type ConfigTransform = Config -> Config

instance Monoid ConfigTransform
   -- mempty  is identity transformation
   -- mappend is composition of transformations

-- | Left-biased union of two configurations
union :: ConfigTransform -> ConfigTransform -> ConfigTransform

-- | Restrict a configuration to a given group,  and remove that group
--   prefix from all key names.
subconfig :: Text -> ConfigTransform

-- | Add a group name as a prefix to all key names
superconfig :: Text -> ConfigTransform

Note that these operators are implemented "symbolically", so that they run in sub-linear (possibly O(1)) time. Instead, their cost is paid on each (key,value) lookup.
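One way to picture a "symbolic" implementation is a small data type that records the transformation and is only interpreted when a key is looked up. The sketch below is an assumption about how this could work, not the library's actual representation; ToyTransform, applyT, lookupIn, and withDefaults are all hypothetical, and the direction chosen for Compose (apply the first transform, then the second) is arbitrary.

```haskell
import Control.Applicative ((<|>))
import Data.List (stripPrefix)
import qualified Data.Map.Strict as Map

type Snapshot = Map.Map String String
type Lookup   = String -> Maybe String

-- Building any of these values is O(1); no keys are copied or renamed.
data ToyTransform
    = Identity
    | Subconfig String                   -- restrict to a group, drop its prefix
    | Superconfig String                 -- add a group prefix to every key
    | Union ToyTransform ToyTransform    -- left-biased union
    | Compose ToyTransform ToyTransform  -- apply the first, then the second

-- Interpret a transform as a rewrite of the lookup function.
applyT :: ToyTransform -> Lookup -> Lookup
applyT Identity        c = c
applyT (Subconfig g)   c = \k -> c (g ++ "." ++ k)
applyT (Superconfig g) c = \k -> stripPrefix (g ++ ".") k >>= c
applyT (Union a b)     c = \k -> applyT a c k <|> applyT b c k
applyT (Compose a b)   c = applyT b (applyT a c)

-- The cost of the transform is paid here, once per key lookup.
lookupIn :: ToyTransform -> Snapshot -> String -> Maybe String
lookupIn t snapshot = applyT t (`Map.lookup` snapshot)

-- Per-source settings layered over the "default" group.
withDefaults :: String -> ToyTransform
withDefaults name = Union (Subconfig name) (Subconfig "default")

exampleConfig :: Snapshot
exampleConfig = Map.fromList
    [ ("amazon-cloud.postgres.host", "cloudevents.mydomain.com")
    , ("default.postgres.port", "5433")
    , ("default.heartbeat-interval", "15")
    ]
```

With these definitions, looking up "postgres.port" through withDefaults "amazon-cloud" falls through to the default group, while "postgres.host" is found in the source's own group.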

Syntactic extensions

Datum comments have been implemented, not unlike Scheme and Clojure. The configurator-ng parser will ignore any binding preceded by a #; token; the binding following #; must begin on the same line, and must be syntactically correct, but will otherwise be ignored.

This is a significant convenience for use cases like the event source example above: for example, one could disable chicago-service-center by putting #; before the name. One can also use this as a slightly restricted form of block comment, by writing #; comment { (the name doesn't matter) to begin the block comment, and a matching } to end the comment. Of course, the intervening bindings must be syntactically correct, so this isn't an exact substitute for block comments.

configurator-ng also allows group names to be inlined into other group and key names, separated by a dot character. For example, these configuration snippets are all equivalent:

foo {
  bar {
    x = "Hello"
    y = "World"
  }
}

foo.bar {
  x = "Hello"
  y = "World"
}

foo {
  bar.x = "Hello"
  bar.y = "World"
}

foo.bar.x = "Hello"
foo.bar.y = "World"

With the original configurator, only the first snippet is syntactically legal.

Finally, configurator-ng supports scientific notation for numerical values, via the scientific package, which corresponds closely to typical floating point syntax.

Configuration Change Subscriptions

Configurator's change notification system is also painful to use except in the most trivial of cases, not least because the callback is called for a single changed (key,value) pair at a time. Determining how that impacts a given configuration record (like EventSource above) is up to the user.

Soon, configurator-ng will offer something along the lines of the following function:

subscribe :: ConfigParser m => ConfigCache -> m a -> (a -> IO ()) -> IO ()

When the configuration files are reloaded, every subscribed ConfigParser is rerun, and the result is passed on to the callback. Now, of course, many callbacks won't want to be called unless their configuration actually changes. However, this is actually a reasonable thing to punt to the callback, because we can write a generic callback wrapper to handle this issue:

debounce :: (a -> a -> Bool) -> (a -> IO ()) -> IO (a -> IO ())
debounce notEq callback = do
    last_seen <- newIORef Nothing
    return $ \new -> do
        m_old <- readIORef last_seen
        if   case m_old of
               Nothing  -> True
               Just old -> notEq old new
        then do
          writeIORef last_seen (Just new)
          callback new
        else do
          return ()
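As a usage sketch, the wrapper can be specialised to types with an Eq instance and exercised with a sequence of repeated values. The debounce definition is repeated here (lightly condensed) only so the block stands alone; debounceEq and demoDebounce are hypothetical names introduced for illustration.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef, writeIORef)

-- Same logic as debounce above, repeated so this block is self-contained.
debounce :: (a -> a -> Bool) -> (a -> IO ()) -> IO (a -> IO ())
debounce notEq callback = do
    lastSeen <- newIORef Nothing
    return $ \new -> do
        mOld <- readIORef lastSeen
        -- The first value always counts as a change.
        let changed = maybe True (`notEq` new) mOld
        if changed
          then writeIORef lastSeen (Just new) >> callback new
          else return ()

-- Hypothetical convenience wrapper for types with an Eq instance.
debounceEq :: Eq a => (a -> IO ()) -> IO (a -> IO ())
debounceEq = debounce (/=)

-- Feed a sequence of results through a debounced callback and
-- return the values the callback actually received, in order.
demoDebounce :: IO [Int]
demoDebounce = do
    seen   <- newIORef []
    notify <- debounceEq (\x -> modifyIORef' seen (x :))
    mapM_ notify [1, 1, 2, 2, 2, 1]
    reverse <$> readIORef seen
```

Here the callback fires only for the first 1, the first 2, and the final 1; the repeated values in between are suppressed. Note that the IORef is not synchronised, so this sketch assumes the wrapped callback is invoked from one thread at a time.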

Optimizing subscribe

It would be more efficient to run only those ConfigParsers whose results could have changed. If we design the configurator-ng interface carefully, we can determine all the keys that a parser depends on. We can then use this information to rerun only those parsers whose result might possibly change. (Though debounce could still be useful, as ConfigParsers aren't guaranteed to be 1-1 functions.)
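For purely applicative parsers, the set of keys can be computed without running the parser at all. The following is a minimal sketch of that idea under stated assumptions: TrackedParser, trackedKey, and needsRerun are hypothetical, snapshots are plain Maps of Strings, and a monadic parser (where later keys may depend on earlier values) would need a different approach.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Snapshot = Map.Map String String

-- A parser paired with the static set of keys it may read.
data TrackedParser a = TrackedParser
    { dependencies :: Set.Set String
    , runTracked   :: Snapshot -> Maybe a
    }

instance Functor TrackedParser where
    fmap f (TrackedParser ds run) = TrackedParser ds (fmap f . run)

instance Applicative TrackedParser where
    pure x = TrackedParser Set.empty (const (Just x))
    -- Dependencies combine by union, independently of any snapshot.
    TrackedParser d1 pf <*> TrackedParser d2 pa =
        TrackedParser (Set.union d1 d2) (\s -> pf s <*> pa s)

-- Reading a key records that key as a dependency.
trackedKey :: String -> TrackedParser String
trackedKey name = TrackedParser (Set.singleton name) (Map.lookup name)

-- A parser needs rerunning only if some key it depends on differs
-- between the old and the new snapshot.
needsRerun :: TrackedParser a -> Snapshot -> Snapshot -> Bool
needsRerun p old new =
    any (\k -> Map.lookup k old /= Map.lookup k new)
        (Set.toList (dependencies p))
```

A parser built as (,) <$> trackedKey "a" <*> trackedKey "b" would then be skipped when only an unrelated key "c" changes between reloads.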

However, once we have dependency tracking that works, there are further applications this could enable, such as: