crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.5k stars 1.62k forks source link

Proposal: improve stdlib logger #5874

Closed ezrast closed 4 years ago

ezrast commented 6 years ago

This is piggybacking off of some chat discussion by @RX14 and @straight-shoota earlier today.

Currently there is no great way for library authors to log without friction. A library can't just write to STDERR arbitrarily because that's a potentially unwanted side effect. It instead has to take an optional Logger? that is nil by default, which requires boilerplate and is unlikely to actually be used.

If we instead establish a class-based logging hierarchy into the stdlib, library code can log unconditionally, with downstream users selectively enabling logs from the libraries they care about. I propose improving the situation by making the following changes:

Structure loggers hierarchically, with granular level controls

A Logger should be able to take a parent, to whom the actual log writing is delegated. By default, a child Logger's level should be nil, which means it inherits the level of its parent.

main_logger = Logger.new(STDOUT)
child = Logger.new(main_logger)
noisy_child = Logger.new(main_logger)

main_logger.level = Logger::WARN
noisy_child.level = Logger::DEBUG

child.info{ "This logger uses its parent's log level and this message does not get printed" }
noisy_child.info{ "This message will get printed because noisy_child has its own log level set" }

Make logging class-aware by default

It should be trivial to define a logger that is associated with a particular class or module. That logger will inherit from the logger of its parent in the module namespace hierarchy. Parents can be created automagically.

module MyProgram
  class App
    include Loggable # creates a Logger and adds helper methods e.g. `warn`, `info`
    def initialize
      info{ "creating an app" }
    end
  end
end

# Sets the effective log level for every Loggable under the MyProgram namespace.
# Note that the Logger for MyProgram got created behind the scenes at the same time as the Logger for App
Loggable.get_logger(MyProgram).level = Logger::DEBUG

# Logs the following (note the class name is included): 
# I, [2018-03-26 14:43:17 -07:00 #25575] INFO -- MyProgram::App: creating an app
app = MyProgram::App.new

Loggable will have a root level logger that by default does not log anything, so library code will stay quiet. To up the verbosity, just do:

Loggable.get_root_logger.level = Logger::DEBUG  # Enables logging for every class and module in your program, including libraries.

Unrelated to the above points, Loggers currently only operate on IO objects. Modern applications frequently eschew traditional log files in favor of structured logging, so this behavior should be more customizeable. Thus, I also propose to

Abstract Logger away from IO

Instead of storing an IO directly, Loggers should store a LogWriter, which is a module with abstract def write on it. The standard library should include an IOLogWriter class that takes on the formatting behavior of the current Logger, but is interchangeable with custom-defined classes like so:

class GelfLogWriter
  include LogWriter

  def initialize(@transport, @host, @port: 12201)
  end

  def write(severity, datetime, progname, message)
    # build the document and send it over the network here...
  end
end

Loggable.get_root_logger.handler = (GelfLogWriter.new :udp, "graylog.example.com")

I'll be happy to work on implementing these changes if the core team thinks they'd be useful.

watzon commented 4 years ago

I love this! Fixes a lot of issues, and it's a pretty unique take on logging in a standard library.

bcardiff commented 4 years ago

Formatters will be something of each backend, but we don't discard some helper modules.

We are also iterating on some API to extend the log entry with Exception objects.

Thanks for the feedback.

straight-shoota commented 4 years ago

I like the proposed design :+1: My main concerns are about default behaviour and initialization, like how we can avoid having every crystal program try to read a crystal-log.yml file on startup. But such details can be sorted out when the base functionality is implemented.

Having Log instances as constants in Crystal namespaces sounds good. I figure you can still use local instances like log or @log, which would be handy when you want to separate log messages into different log namespaces even when the code lives in the same Crystal namespace.

Blacksmoke16 commented 4 years ago

Another question.

  def initialize(config : Log::Backend::Config)
  end

I'm assuming the Config object would be something similar to YAML::Any?

Ideally I think you could use https://github.com/crystal-lang/crystal/pull/8406 along with https://github.com/crystal-lang/crystal/pull/7648 to essentially build out the discriminator automatically; and also gain the ability to directly new up the backends.

willhbr commented 4 years ago

Thanks @bcardiff for sharing the new design! this definitely solves a lot of the shortcomings of the current logger.

Has there been any thought about whether some of this should live in a first-party shard, instead of directly in the standard library?

I personally think that the standard library should only include the core logging functionality (writing logs, an interface for outputting them, and an implementation of that interface for writing to stdio). Including a parser for dynamically loading a config file seems like something that would be better left in a shard.

Perhaps the stdlib should only include the ability to configure loggers in code, and then a shard can add functionality to read a config file and map that to the configuration exposed by the stdlib.

This would avoid having all Crystal programs read a crystal-log.yml file, and give application developers the option to embed the config in the binary (or load if from another source).

Even if it does all end up in the stdlib, I would really like to see the ability to define all this config in code instead of relying on a file or environment variable.

RX14 commented 4 years ago

I agree, I really really love this design, it's everything I ever wanted. Thank you so much to @bcardiff and @waj. My only comment was to request that the yaml configuration reader live in a seperate (blessed) shard, which @willhbr already covered :)

paulcsmith commented 4 years ago

@bcardiff Thanks for the thorough outline. I'm really digging a lot of this stuff. I have a few things I love and a few concerns/suggestions.

Implicit Context

This is awesome. I think the way you've outlined it sounds perfect. Shared using a fiber, easy to use, easy to copy to other fibers if needed. Awesome!

Rename to Log

I think it reads nicely and is a bit shorter. The perfect name IMO šŸ‘

Consistent interface

I'm loving that you just always use Log everywhere. That is a very awesome innovation :)

Using yaml config, and string based log and backend getters

I think this will make it much harder to detect issues with yaml config, code, and documentation since it is not as well type-checked as real Crystal code.

Here are the types of problems I think people will run into

Typos and unused/incorrect arguments

# Typos, especially for Log in shards since it is 
# harder to see what loggers are available unless it is well documented by the shard
Log.get("db.queyr") # Whoops, runtime error

This also applies to the yaml config:

backends:
  file: 
    # Typo in arg name. I'm guessing this would be ignored or need special code to catch stuff like this
    file_pattrn: app.%d{yyyy-MM-dd}.log 
    location: /var/logs/
    max_file_size: 10mb
    # Another spot where typos and inconsistencies can come up
    formatter: Log::DefaultFormatter 

One more thing to learn

There are now multiple ways to get/configure things. You can reference the class, or the string name. Not a big deal at all, but a bit of a downside.

Documentation

It will be harder to document what loggers are available and what they are called since the string name of the logger is not exposed in docs with the current implementation. You'd have to manually document it. Maybe a macro/annotation could be added to make it easier, but this is still extra work to get something that could "just work" if done with pure Crystal.

Another issue is with documentation of backend options. Since the options are just generic types the shard author will have to manually document all the options and expected types of those options. They will also have to make sure to keep them updated if the options change.

Proposed solution

I think we can make a lot of these things available "for free" by taking a slightly different approach.

Using concrete constants and methods

I would propose configuration with code by using an actual constant (module or class, no preference) and configuring those in code rather than in YAML. I think this will make for a program that has much better compile-time guarantees, documentation, and flexible configuration.

Here's roughly what I think that would look like:

module DB
  Log = ::Log # no need to give it a name, instead reference it directly
end
# Only allow changing via constant, not string lookup
DB::Log.level = :verbose # or Log::Severity::Verbose
# Configure logger in code
Log.level = :info
Log.backends = [MyCustomBackend, StdioBackend]

# The cool part is you can easily use conditionals, ENV, and all other Crystal features
if ENV["CRYSTAL_ENV"]? == "production"
  Log.level = :info
else
  Log.level = :debug
end
Log.backends = [MyCustomBackend.new, StdioBackend.new]

How to share log defaults

Let's say you want a base db logger and a few loggers (like db.pool in the examples) that live "under" that logger.

We could use constants here, much like proposed, just without the string names

module DD
  Log = Log

  class Pool
    Log = ::DB::Log
  end
end

Then configure the defaults for DB and optionally set overrides for the Pool logging:

DB::Log.level = :debug
# And override specific ones if you want
DB::Pool::Log.level = :info

Admittedly I have not done a deep dive on how it would inherit the config. I assume that's been figured out already but if not I think it is definitely possible. May need a bit of macros to inherit the settings but I am unsure

The main advantages with this approach are:

Backends

I think that the backends are basically an IO but need a few like allowing config through YAML and formatting data before writing. By using some built-in Crystal awesomeness I believe we can get better compile-time guarantees for arguments, better documentation of allowed arguments, and (maybe if we're lucky) you can use any IO (like file IO, etc.) as a starting point. This may already be possible with the existing implementation but I'm not sure...and it is ok if that's still not possible.

module Log::Backend
  macro included
    class_property formatter : ::Log::Formatter = Log::DefaultFormatter
  end

  # some_method_for_formatting_input_before_writing
  # abstract def write, etc.
end

# Maybe you can inherit from any other IO by using a module that you include
# instead of a class you inherit from? Not sure, but either way module is more flexible IMO
class FileBackend < File
  include Log::Backend # This would set some class properties like `formatter`, and other methods needed by Log
  # Add extra config with class properties
  class_property? skip : Bool = false

  def write
    # In real life you'd do something actual useful, but this shows the concept
    unless skip?
      # write
    end
  end
end

# Configure it
FileBackend.formatter = MySuperSpecialFormatter
FileBackend.skip = true

This is all a bit of pseudo code but the important parts are:

New feature suggestion: key/value data

One of the main reasons I wrote Dexter was to accept key/value data so that you can format things in hugely different ways. So you can do logger.info(user_id: "foo:") and format it any number of ways json, splunk style, etc.

Maybe this is already a part of the proposal and I missed it, but I think this is hugely important to making formatters and logging far more flexible. For example Dexter has a JSONformatter, but we also have a Lucky formatter that makes logs much more digestable by humans. It would not be possible without key/value pairs.

What we do when passed just a string is we put put it under a message key. That way you can still pass strings. Formatters can omit the key if they want, or ignore keys completely, but if you do want to print keys differently then you can.

Would you be open to adding that? It was fairly simple to add to Dexter and opens up a lot of possibilities.

I'm super stoked about this initiative!!

asterite commented 4 years ago
module DD
  Log = Log

  class Pool
    Log = ::DB::Log
  end
end

Maybe this is a typo, but here Log always refers to the same constant, so I can't understand how this results in different log instances, nested instances and so on.

What I hope Log.get has is that it can accept any object and just call to_s on it. Or maybe just String and Class, so you could be able to do:

module DB
  Log = ::Log.get(self)
end

Then DB::Log's name is "DB", automatically inferred from the type name.

paulcsmith commented 4 years ago

@asterite you are right. Good point. I like that idea a lot. That way you don't need to set it yourself. You could also still configure it with Crystal's type-safety goodness šŸ‘ I dig it. I imagine the interface can stay largely as planned that way

I'd suggest changing Log.get to Log.for(self). Since it isn't really getting a new one, and that way it discourages trying to use it as a getter in other places. But this a minor thing so totally fine if no one likes that šŸ¤£

module DB
  Log = ::Log.for(self)
end

DB::Log.level = :info
watzon commented 4 years ago

I do agree that using a yaml config might not be the best idea. May as well just have the log config happen in code, no?

bcardiff commented 4 years ago

Should the yaml config be a separate shard?

We aim for a design that will allow injecting ways providing the configuration. But without having one proposal, it will be chaotic for the user experience. How do I configure logging? It depends on the shard. And there is no default. That will cause a lot of stress because you will need to learn how each app or framework decide to do things. While that will still be an option, it is not required for all the apps.

Typos and unused/incorrect arguments

How to detect typo vs not-binded source? I'm not sure there is a way to solve.

The only thing I imagine is something that will emit the paths requested by the user, maybe a Log.verbose in log source when creating a log :-P. And leave the user to notice all the paths that are potentially used. Of course it won't work on dynamic sources.

Issue in wrongly config backends

It will be up to the backend, in the initialization to complain if something is wrongly configured.

The config type will support a narrow set of types similar to JSON::Any, but without schema validation.

Using concrete constants and methods

It does not work since Log = ::Log will not create a new instance; they are making an alias. It's hard to allow users create their own loggers but also constraint all the available options.

(ok, @asterite covered)

Backends

For IO based backends there might be some code reutilization available, but not all the backends are IO and formatted oriented. Elasticsearch could store the whole entry object.

New feature suggestion: key/value data

That would be the usecase of a context. We thought of adding some overloads to extend the context at the same time the message is sent, but we didn't reach an option that make us happy enough.

We also want to enforce there is always a message per log entry. Not just context data.

asterite commented 4 years ago

Does this mean I won't need to do require "yaml" to work with YAML because Crystal's runtime will already require it? (which also means YAML needs to be parsed and compiled on every app)

bcardiff commented 4 years ago

Oh, the issue is the libyaml dependency.

Ok, by default it will only read the CRYSTAL_LOG_LEVEL/CRYSTAL_LOG_SOURCES env variables and if require "log/config" is included in the app the yaml configuration will be available.

jkthorne commented 4 years ago

I think it would be nice to not have YAML be a big of a part of the crystal project as it is now. I YAML has kind of won out. Here is a breakdown of some config languages and I like languages written in Crystal like Con. It would be nice to not have libyaml be required by a crystal app but it might happen.

straight-shoota commented 4 years ago

@paulcsmith It is a necessary requirement that logging can be configured at runtime. Sysadmins need to be able to configure complex logging setups without having to rebuild the binary. That also means it's simply not possible to ensure type safety.

That being said, we can obviously expose an internal interface to the configuration. This would be useful for example for setting up default config. Or if your app doesn't need runtime configuration.

@all Please let's not get into detail about choice of configuration format and other specifics. The primary goal right now should be to discuss and implement the general architecture. Details of automatic configuration is two steps ahead. Contemplating on that only keeps us from discussing what's on the table right now.

paulcsmith commented 4 years ago

Thanks for the response @bcardiff

Should the yaml config be a separate shard?

We aim for a design that will allow injecting ways providing the configuration. But without having one proposal, it will be chaotic for the user experience. How do I configure logging? It depends on the shard. And there is no default. That will cause a lot of stress because you will need to learn how each app or framework decide to do things. While that will still be an option, it is not required for all the apps.

I definitely agree it should not be separate. There should be one sanctioned way to do things so config is not different for every shard. My suggestion is to not use YAML at all and use code instead. You can inject configuration using Crystal code, and if you really want to you can ready YAML or ENV vars or whatever using Crystal, but that is up to the end user.

Typos and unused/incorrect arguments

How to detect typo vs not-binded source? I'm not sure there is a way to solve.

I think this could be solved by only allowing config via the constant. Then you will get compile time guarantees from Crystal, but maybe I am misunderstanding. I think the name binding is only helpful for YAML, but if that is not used then it is no longer an issue.

The only thing I imagine is something that will emit the paths requested by the user, maybe a Log.verbose in log source when creating a log :-P. And leave the user to notice all the paths that are potentially used. Of course it won't work on dynamic sources.

Issue in wrongly config backends

It will be up to the backend, in the initialization to complain if something is wrongly configured.

The config type will support a narrow set of types similar to JSON::Any, but without schema validation.

I think this is not making the best of Crystal's awesome type-system and compile time guarantees. If instead it is configured in code (not from YAML) you get nice documentation on what args and types are used, and you get nice compile time errors pointing to the exact problem and where the code is that caused it.

Using concrete constants and methods

It does not work since Log = ::Log will not create a new instance; they are making an alias. It's hard to allow users create their own loggers but also constraint all the available options.

(ok, @asterite covered)

Backends

For IO based backends there might be some code reutilization available, but not all the backends are IO and formatted oriented. Elasticsearch could store the whole entry object.

That makes sense šŸ‘ But I still think that using a module instead of class is better. So that if you do have a base class you want to use, you can

New feature suggestion: key/value data

That would be the usecase of a context. We thought of adding some overloads to extend the context at the same time the message is sent, but we didn't reach an option that make us happy enough.

We also want to enforce there is always a message per log entry. Not just context data.

Can you explain a bit more about why you want to enforce a message per log entry? At Heroku and in my own apps we almost always use key/value data for everything: Log.info(operation_name: "SaveUser", author_id: 123) with no particular "message". We just want to log some info about how things went. Would that be possible in the proposed solution?

Under the hood I imagine it could use existing constructs:

def log(data)
  Log.context.using do
     # Set the context data
  end
end

About YAML

I think I'm confused because I don't see the advantage of using YAML over Crystal code. Crystal code is safer at compile time, way more flexible (can use ENV, can use conditionals, etc). But maybe I'm missing why it is useful. Could some elaborate?

paulcsmith commented 4 years ago

@paulcsmith It is a necessary requirement that logging can be configured at runtime. Sysadmins need to be able to configure complex logging setups without having to rebuild the binary. That also means it's simply not possible to ensure type safety.

All my examples showed configuring logging at Runtime. I was never suggesting anything else

# At runtime
Log.level = :info
MyBackend.formatter = MyFormatter

You can also use ENV variables if needed to allow even more flexibility. But maybe I'm misunderstanding what you mean. What does YAML give us that Crystal code does not? If we add a less compile safe and less flexible way of configuring logging I think it would be good to know what the advantages are.

That being said, we can obviously expose an internal interface to the configuration. This would be useful for example for setting up default config. Or if your app doesn't need runtime configuration.

@ALL Please let's not get into detail about choice of configuration format and other specifics. The primary goal right now should be to discuss and implement the general architecture. Details of automatic configuration is two steps ahead. Contemplating on that only keeps us from discussing what's on the table right now.

I don't believe this is true. The choice to use YAML affects the interfaces. If we use config in Crystal then we can use concrete types and arguments, and don't need to "register" backends or set a name for pools or do validation of backend args at runtime. Most all of it can be done by the compiler. It also means configuration of backends can happen without a hash type and can include a more concerete set of types and type arguments. If we can do everything in code that we can in YAML then I don't think we should use YAML. But if YAML gives us something awesome that's great! I'm curious what it gives us though

straight-shoota commented 4 years ago

@paulcsmith Log.level = :info is not runtime configuration because it is defined at compile time. Configuration written in Crystal can't be changed without rebuilding. You could set up some flags to select between different config sets. But configuration for an entire logging setup can be pretty complex with maybe dozens of backends and rules. No in-code configuration can provide the necessary flexibility for that.

Which data format to use for that is really not the point right now. The important part is, there needs to be some kind of configuration that's not done in code while compiling an application. Instead, a user/sysadmin/devop should be able to edit a simple text format to configure the logging behaviour before executing the app. Or even change it without restarting the application.

Strictly, this all wouldn't even need to be in the default stdlib implementation (although there are good reasons for that, as per @bcardiff ). But the stlib configuration mechanism must be flexible to support this. It can't just work with compile time safety, because there is no such guarantee when configuration values come from outside the compilation process (i.e. at runtime).

paulcsmith commented 4 years ago

@straight-shoota Ah I think I am starting to see what you mean. However even with YAML you would have to restart the app. Unless we also add some kind of file tracking that reloads configuration automatically when the YAML changes, but that seems a bit much.

Personally I've never worked with anyone making such signicant changes to a running system like you are proposing. If there are changes it is done very purposefull and an ENV var is introduced to configure particular parts without requiring a redploy (just a restart).

But also, I don't think we should shut down this conversation until we have a clear understanding on both sides, because I think that will help us come to an really great solution and better understanding of how people are using this šŸ‘

paulcsmith commented 4 years ago

After some discussion with @straight-shoota i understand the argument for yaml now, and also get where I did a bad job with my examples. Thanks for helping me work through some of that.

I think we can do some kind of approach that blends the best of both worlds and good to have concrete examples tomorrow that we can talk about and see if people like it/hate it

RX14 commented 4 years ago

I'm fine with the yaml config being in the stdlib if it requires an explicit require "log/config". require "log" shouldn't introduce any runtime behavior by itself. This means that shards can require log safely, always.

I'm glad we mostly seem to be agreed on the core proposal anyway.

paulcsmith commented 4 years ago

@bcardiff After a few discussions with others I realized why people may need YAML config, and I also realized I might have misunderstood what affect YAML config would have on the Logger proposal.

My original thinking was that the YAML config would affect the backends (and possibly the Log classes). I still think that may be the case, but would love clarification.

I thought based on the examples that backends would be required to have an initializer that accepts Log::Backend::Config. Is that the case?

For example, I believe I could not do this because it does not know how to accept the YAML config:

@[Log::RegisterBackend("custom")]
class MyCustomBackend < Log::Backend
  def initialize(path : String)
  end
end

MyCustomBackend.new(path: "/foo/bar")

Can you confirm if that is correct or not? If it is I've got some ideas. If you all are set on the current approach that is ok. LMK and I'll back off :) But if you're still open to ideas I think I have some that may be pretty cool!

bcardiff commented 4 years ago

I thought based on the examples that backends would be required to have an initializer that accepts Log::Backend::Config. Is that the case?

Yes

The yaml config would be

backends:
  custom:
    path: /foo/bar

And the code something similar to:

@[Log::RegisterBackend("custom")]
class MyCustomBackend < Log::Backend
  @path : String

  def initialize(config : Log::Backend::Config)
    @path = config["path"].as_s? || DEFAULT_PATH
  end
end
straight-shoota commented 4 years ago

@bcardiff IIUC, the initializer accepting Backend::Config is only required when it's supposed to be used with dynamic configuration. Also, it doesn't mean that there may not be a specialized initializer to expose a dedicated API for programmatical configuration. I suppose that's what Paul's interested in.

Blacksmoke16 commented 4 years ago

I'm assuming there's reason this couldn't be done with YAML::Serializable? That would remove the need for an "Any" style object.

Something like I mentioned https://github.com/crystal-lang/crystal/issues/5874#issuecomment-587167789.

paulcsmith commented 4 years ago

@bcardiff Thank you for clarifying, and yes @straight-shoota that is partially true. I could add an intializer that is better suited to code, but the default initializer for the Backend::Config will be there. So if I write a shard, I would need to implement both otherwise the backend would not work for people with YAML. I suppose I could just leave the YAML initializer and tell people to use the code version but then that seems unfair to those using YAML and very unexpected since the method is there but just wouldn't do anything. @straight-shoota here's kind of what I mean:

# Leaving out boilerplate
class MyBackend
  def initialize(@foo: String)
  end

  # If someone tries to set `foo: "bar"` in YAML it will not be set 
  # unless I add an initializer to handle the YAML as well:
  def initialize(config : Config)
    new(foo: config["foo"])
  end
end

I've got some ideas, but will try to write up actualy samples since I'm bad at explaining in words šŸ¤£. Would also be happy to hop on a call sometime if I'm still talking nonsense! The rough idea is that I think YAML config should be talked about separately and Crystal should have a way to configure modules in stdlib or with a shard. So all shards can use it, not just the logger. And because of that, I think the Log implementation would be better off as code-first and the config idea I have would kind of work "automatically". This would lead to more type-safety when configuring with code, less duplication, and better docs (since a code-first approach means you can set the exact named args and types you want)

straight-shoota commented 4 years ago

Another issue, that hasn't been touched yet: It would be great to be able to modify configuration settings (especially source/severity routing) during the runtime of a program. That doesn't need to be part of the initial implementation, but it should be possible to add later. Right now Log is immutable, but I figure it would technically allow modifications. This should probably work through the builder revisiting its logger list. Not a public API to modify an individual instance.

The use case would be long-running applications where you want to change for example the severity level or log destination while the application is running. It shouldn't be necessary to restart the application just to update the logging configuration.

paulcsmith commented 4 years ago

Ok I gave a shot at writing it down šŸ¤ž

My goals for this comment

Proposal for code-first approach and no YAML (at first)

First I'd like to show why I think a code-first approach makes for safer, better documented, and more flexible code. I will address my thoughts on the YAML config later in the document.

# Leaving out boilerplate for now
class MyFileBackend < Backend
  # The advantage of this is:
  #
  #   * Great and automatic documentation of the types and args!
  #   * Awesome compile time messages if you typo, give it wrong type,
  #     forget an argument
  #   * Doesn't require YAML
  def initialize(@file_path : String)
  end

  def write(message)
    # pseudo code
    @file_path.write_to_it(message)
  end
end

# Of course if YAML is kept as proposed we *could* do this in code.
# The problem is:
#
#   * Litte type-safety. `{"fil_path" => "foo"}` would not fail at compile time
#   * Validation of args must be hand rolled by the backend instead of the compiler
#   * The author must manually write documentation on what args it accepts
#   * Usage in code is not as smooth
MyFileBackend.new({"file_path" => "some-path"})

# As opposed to this:
#
#   * Compile time guarantees
#   * Can accept procs if needed
MyFileBackend.new(file_path: "some-path")

An author could implement a code approach and a YAML approach in the Backend, but that means:

Ok, but what about people that do need YAML config?

My proposal: a totally separate issue/PR for Crystal configuration for all libs (not just Log)

How would it work

I think this can all be discussed in a separate PR and Log can move forward with a code-first approach, but I want to show a potential approach just so people can see how this might work.

This approach has been tested/used extensively in Lucky with Habitat and works wonderfully. I think we can extend it so people can use JSON/YAML with it too!

Something like this will allow:

The Settings module (or whatever we want to call it)

First I'll start with non-Logger code to show some examples:

class MyStripeProcessor
  Config.setup do
    # These set up class methods and instance methods (that call the class methods)
    setting private_key : String
    setting callback_url : String?, example: "https://myapp.com/strip_webhook".dump # Dump wraps it in quotes
    setting some_object : SomeObject?, example: "SomeObject.new"
  end

  def charge(amount)
    # Settings are usable in instances of an object
    Stripe.charge(amount, private_key, callback_url)
  end

  def self.charge(amount)
    # And as class methods too
    Stripe.charge(amount, private_key, callback_url)
  end
end

And to configure it in code:

MyStripeProcessor.configure do |settings|
  settings.private_key = "Something-secret" # or ENV["SO_SECRET"], etc.
  settings.some_object = SomeObject.new
end

I'll get into how we make it so you can make the types non-nilable when not using an initializer later, but suffice to say there is a pretty good solution.

Great, but what about YAML/JSON/non-code approaches

I have not done this with Habitat (what my proposal is based on), but I think we could totally make it work in some really cool ways.

require "yaml_config"

# Call this instead of (or in addition to) the code approach
YAMLConfig.load("config.yml") # Or ENV var, etc.

# Can still modify with code for some libs
MyShard.configure do |settings|
 settings.a_proc = -> {} # Procs can't be created from YAML so do it in code
end

# We'll talk about this later
Config.raise_if_missing_settings!

The YAML in "config.yml":

# Inferred from class name. Maybe we can allow specifying a name: `Config.create("stripe")`
my_stripe_processor:
  private_key: "123-abc"
  callback_url: "myapp.com/foo"
lucky_server:
  secret_key_base: "123-abc"
my_backend:
  file_path: /foo/bar/log
# This may require tweaking as it is not tested, but I believe we can do something like this
# Maybe by adding optional processors to YAMLConfig.load
# Or having a `Log::YAMLConfig` that just looks at `bind_logs` and loads it up. These are implementation details that I'm sure we can figure out
bind_logs: 
  sources:
    - db.*
   level: WARN
   # etc.

I think this is pretty cool. You get powerful YAML config automatically, but you also get type-safe code configuration that you can use without or in addition to YAML. You could also see a JSONConfig.load, etc.

I think the Config.create macro could have some hooks for loading in YAML/JSON config or whatever. THis extensibility means people could do all kinds of cool things. Like loading from Hashicorp Vault, or whatever they want. But they can all be seprate shards.

Some other nice stuff with this approach

Screen Shot 2020-02-25 at 11 35 56 AM
MyStripeProcessor.temp_config(callback_url: "fake_url.com") do
  # Callback_url will be fake_url.com in this block and revert afterward.
end

Concerns

Example with Backend and new setting class

class MyBackend
  Config.create do
     # I've got ideas for how to modify processors per-logger, but this is jsut a simple example for now
     setting.file_path : String
  end
end

In conclusion

I propose:

@bcardiff, @RX14, @straight-shoota and other core team: If this sounds good, LMK and I'll get started on the Config PR/issue. I'm confident we can make something incredibly cool and more nice to use/flexible than anything out there for config! :)

bcardiff commented 4 years ago

.. the initializer accepting Backend::Config is only required when it's supposed to be used with dynamic configuration.

There is nothing preventing you to initialize a Log and Log::Backend manually, but they will be out of the loop of the Log.for factory method (?)

.. So if I write a shard, I would need to implement both otherwise the backend would not work for people with YAML.

My idea of the Log::Backend::Config is to be independant of the format used for configuring it. Maybe someone prefers a json o toml based configuration, but we don't want to make them reinvent the mechanism of instantiating and wireing logs and backends, hence the builder.

Another issue, that hasn't been touched yet: It would be great to be able to modify configuration settings (especially source/severity routing) during the runtime of a program. That doesn't need to be part of the initial implementation, but it should be possible to add later.

waj and I discarded that requirement. We usually restart the app on a config change, even the logger. But yeah, something like keeping track of the log to reconfigure them could be done without changing the API. The only clash is when, during development you manually changed the log level. Upon reconfiguration, which should be the final one.

I think this will be the responsibility of the builder, and signaling a reload might need some new public api to do so. Doable but discarded for now based on personal experience.

We modify the Log PR to use a code-first approach (no YAML config or Config alias) and focus on getting that out.

Even if macros are used to introduce a config notion, I don't see how you will get rid of Log::Backend::Config. You might be able to hide it, but removing it completely, I am not sure you can.

Having a Log::Backend::Config with nothing else allows you to code validation that can't be expressed by only typing rules.

I like the story of having a configuration mechanism, but it requires some design cycles still.

The current proposal does not require yaml at all, but there is an implementation that can use that format. In the same spirit, having a config macro in the backends could wire that mechanism on the loggers.

(I need to keep thinking on this topic)

paulcsmith commented 4 years ago

There is nothing preventing you to initialize a Log and Log::Backend manually, but they will be out of the loop of the Log.for factory method (?)

Yes you can, I just think it is less than ideal because you have to give it a hash which is not particularly type-safe, worse documentation, etc. as mentioned in the example

My idea of the Log::Backend::Config is to be independant of the format used for configuring it. Maybe someone prefers a json o toml based configuration, but we don't want to make them reinvent the mechanism of instantiating and wireing logs and backends, hence the builder.

I love this idea! I just fear that by doing this we make things worse for documentation, and a code-first approach. If you do a code-first approach you can still do more validation outside the type-system. The settings stuff can map YAML/JSON to code without a Backend::Config. At least, I'm fairly certain it can. Can you share why you think that is necessary?

My main thought is that it can be done without Backend::Config and would still be able to work with YAML or JSON. Would be happy to chat or pair through it! I'll also take a look at the PR to see if I maybe that sparks an idea

straight-shoota commented 4 years ago

I might be missing something here, but what do we need Backend::Config for exactly? It's basically just a facilitator between config format and implementation. Functionally it seems to not be really necessary. Instead we could just pass the YAML parser directly the backend implementation. This direct dependency might not b ideal, but I would prefer that over a hardly useful interface in between. Supporting other config formats would be more challenging, because they would need to build custom mappings to the existing backends. But I'm not sure whether that would be such a big issue at all. And maybe this would be another incentive to develop a cross-format serialization framework like serde =)

paulcsmith commented 4 years ago

@straight-shoota That makes sense. I think this could be further improved however using the example above. Having a config DSL that would allow code/YAML/JSON whatever. That way you do not need to really worry about it.

I have a proof of concept here that does not need YAML or the intermediary Backend::Config type. This would greatly simplify the implementation of the Log PR because you could split the Log PR from config almost completely.

Roughly what I think it could start out as:

# First set of Log PR changes. Consider using regular Crystal:
class OriginalLogBackend
  def initialize(@file_path : String)
  end

  # or
  class_property! file_path : String?
end

# my_backend = OriginalLogBackend.new(file_path: "/foo/bar")
# Log.for(DB::Pool).backends << my_backend

# Though this doesn't have YAML log config, it is still much better than what we have today! 
# We get context, nicer 'Log' name, etc.
# We miss the fancy config (for now), but we can build that in as a separate step.

What it could be later on once we've nailed a nice config PR

Note this is UGLY and not even close to production ready but it shows how a simple config DSL could be used to handle YAML/JSON/whatever without a Config hash (just code) and instead using class properties (or could be an initializer or whatever). Also shows you can hook into it to build YamlConfig, JsonConfig, YamlLogConfig (for cool stuff like bind) etc.

This is not to get feedback on how exactly it works. I just wanted something to work as soon as possible to show that it is possible to do config without the Backend::Config type.

require "yaml"

# Proof of concept for how we can use an approach for code/yaml/json
# Similar to Lucky's Habitat, but with coolness for handling YAML config
module Config
  TYPES_WITH_CONFIG = [] of Config

  macro included
    {% TYPES_WITH_CONFIG << @type %}
    CONFIG_KEY_NAME = {{ @type.name.stringify.underscore }}

    def self.load_runtime_config(config)
    end

    # This is bad and inflexible, but was easy to do. Don't judge based on this :P
    def self.format_config_value(config_type : String.class, value) : String
      value.as_s
    end

    def self.format_config_value(config_type : Int32.class, value) : Int32
      value.as_i
    end

    def self.format_config_value(config_type, value)
      raise "Don't know how to handle #{config_type} with value #{value}"
    end
  end

  macro setting(type_declaration)
    class_property {{ type_declaration }}

    def self.load_runtime_config(config)
      previous_def(config)

      # Load known config keys and cast to the correct type automatically
      if value = config[{{ type_declaration.var.stringify }}]?
        @@{{ type_declaration.var }} = format_config_value({{ type_declaration.type }}, value).as({{ type_declaration.type }})
      end 
    end
  end
end

# This could also be a JSON loader, a VaultLoader, or whatever!
module YamlConfigLoader 
  macro load(yaml_string)
    yaml = YAML.parse({{ yaml_string }})
    {% for type_with_config in Config::TYPES_WITH_CONFIG %}
      if values = yaml[{{ type_with_config }}::CONFIG_KEY_NAME]?
        {{ type_with_config }}.load_runtime_config(values)
      end
    {% end %}
  end
end

class LogBackend
  include Config

  setting retries : Int32 = 5
  setting file_path : String = "foo"
end

puts LogBackend.file_path # foo
puts LogBackend.retries # 5

YamlConfigLoader.load <<-YAML
log_backend:
  file_path: "set/with/yaml"
  retries: 10
YAML

puts "---after yaml config is loaded"

puts LogBackend.file_path # set/with/yaml
puts LogBackend.retries # 10

https://play.crystal-lang.org/#/r/8mhp

So I'm thinking we could do a log implementation without the Config type at all at first and instead use regular initializer or class_property. Then tack on config later that works for logger as well as all Crystal code.

Another alternative is move forward with this YAML approach for now with the Backend::Config type, and then later on clean it up once we have Crystal config code figured out. That could also work but seems like duplicated work if work on one version now, and then have to redo parts of it later. Thoughts? I think either way could work, but IMO going without the YAML config for the first pass would be easier/simpler. But up to you of course! @bcardiff

I think either way I'll take a crack at some config code that will allow us to do this kind of thing but far more flexibly and with the other goodies mentioned before like nice error messages and such

straight-shoota commented 4 years ago

That whole general config thing is a bit overwhelming, unfortunately. I'm not sure about what to make of it, because I see a lot of potential issues with such an aproach that tries to solve everything for everyone. It's definitely a nice idea, though. And maybe it's not that different from a general serialization framework... Anyway, it's really out of scope here. It might be a good idea though to focus on Log first, but without the dynamic configuration part. Divide and conquer.

paulcsmith commented 4 years ago

It might be a good idea though to focus on Log first, but without the dynamic configuration part. Divide and conquer.

Totally agree! Just wanted to show it is possible so that we can remove the dynamic configuration/YAML stuff in this PR (if you all want) and be confident it can be handled separately