elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash
Other
69 stars 3.5k forks source link

Richer Plugin Configuration Syntax #3773

Open andrewvc opened 9 years ago

andrewvc commented 9 years ago

Currently working with configurations is awkward for a few reasons

@ph mentioned having a full separate configuration class, I agree. I would like some sort of Configuration object that could be used something like in the example below. I think the syntax here could be better, but I would like all the capabilities used here.

class HTTPClient
  include LogStash::PluginConfig

  config :http_host, :default => "127.0.0.1", :doc => "The HTTP Host"
  config :http_port, :default => 80, :doc => "The HTTP Port"
end

class HTTPZipper
  include LogStash::PluginConfig

  config :foo, :default => "bar", :doc => "Some option"
  config :http_client_a, :nest => HTTPClient.config
  config :http_client_b, :nest => HTTPClient.config
  config :my_nested_config, :nest => {:foo => {:default => "baz"}, :bar => {:default => "bot"}}

  def do_something
    @foo = new Foo(config.foo)    
    @http_client_a = new HTTPClient(config.http_client_a)
    @http_client_b = new HTTPClient(config.http_client_b)
    @my_nested_thing = config.my_nested_config
    # Looks like {"foo" => "baz", "bar" => "bot"
  end
end
ph commented 9 years ago

This changes could give a few nice side effect:

ph commented 9 years ago

I've created this a few month ago, I think it's still related. https://github.com/elastic/logstash/issues/1890

andrewvc commented 9 years ago

@ph this is also nice for writing plugins in Java where mixins really don't exist.

ph commented 9 years ago

The problem with having a :docsoption, it will make writing nice doc a bit harder with the formatting.

ph commented 9 years ago

@andrewvc maybe we could explore extending yard to generate the doc. I don't know, I am just throwing ideas here.

andrewvc commented 9 years ago

@ph what if we had a format like:

      config "host", :default => "127.0.0.1", :validate => "string",
             :doc => <<-EOD
        Name of the HTTP Host. Lorem Ipsum Sit Dolor Amet. Blah Blah Blah.
      EOD

or

      config "host", :default => "127.0.0.1", :validate => "string",
      config_docs "host", <<-EOD
        Some long docstring
      EOD

Plenty of space for docs with heredocs

andrewvc commented 9 years ago

@ph my main motivator here is the ability to nest docs might get crazy unless they're declared as actual values.

One other nice feature is that we could more easily include the docs in error messages!

ph commented 9 years ago

Not sure about heredocs, but having the actual doc in the errors message would be a big plus.

ph commented 9 years ago

I have added a tentative target of 2.0 since it could break things.

ph commented 9 years ago

Adding support for this kind of behavior (Proc was supported, but I am not sure the state of it..) This would allow chaining and reuse of validations to make more complex rules. This would isolate testing and allow us to use them in the java classes.

#:string => StringValidator
config :username, :validate => :string, :default => "ph"

#ip => IPAdressValidator
config :ip, :validate => :ip

config :complex_concept, :validate => MyCustomValidator

config :more_complex, :validate => -> (field) { field > 100 }
ph commented 9 years ago

@andrewvc I think we could try to create a shim of what a plugin could look like, I don't think we need to implement anything on the logstash side. But this would allow us to try syntax and do code organization experiments and get feedback from the team?

andrewvc commented 9 years ago

@ph I think that's a great idea. Maybe we could pair on it at some point next week?

ph commented 9 years ago

@andrewvc Absolutely!

ph commented 9 years ago

I suggest the rabbitmq, since it shares a lot of problem I have with the aws-mixins but has less code in it.

guyboertje commented 9 years ago

There is prior art in Virtus. https://github.com/solnic/virtus The Mongoid library also deals with attribute definitions, validations and defaults. https://github.com/mongodb/mongoid. The ROM project also solves this. http://rom-rb.org/guides/

We should plunder as much as we can from the above projects

ph commented 9 years ago

@guyboertje Agree, there is one in datamapper too.

guyboertje commented 9 years ago

The ROM one uses a newer impl of Datamapper

purbon commented 9 years ago

I like the virtus and the datamapper way! :+1: I agree!

guyboertje commented 9 years ago

Obviously its not what you see on the surface that counts but more their approach underneath to flexibility and extensibility that really matters. We run the risk of needing to discover and solve the same problems they have already overcome.

guyboertje commented 9 years ago

The analogue here is that for some of the libraries above the source is table definitions in a DB and ours is the config. However in the table definitions, each field has more information.

There is some considerations/implications here when talking about JSON based config, or the ability to generate a GUI representation of the config and the reverse.

andrewvc commented 9 years ago

@guyboertje I'm looking forward to looking into these! Thanks for finding the prior art here, there's no need to invent the wheel.

Any idea which of these would integrate well with Java? My initial thinking was we'd want to integrate well with something like the java Properties class.

Thinking through it a bit more we'll probably really want documentation as values vs. comments for the simple reason that we'll need a uniform doc generator between java plugins and ruby ones.

guyboertje commented 9 years ago

However upon reflection, our source of configuration is split. For each plugin, we have the canonical dsl definition of what is declarable, and in the pipeline config we have a declaration of usage. This is exactly the opposite of the datamapper situation - they have a serializable source of what is declarable and a dsl declaration of what is in use.

I seems to me that we need a central (JSON) text based declaration of all the possible config options for all plugins (this can be assembled from a file stored in each plugin repo). From this file, we can provide some Javascript for a GUI that presents the choices to users to guide them to a better config. And a GUI can be generated by the Javascript from the pipeline config using the declared config.

It also means that individual plugins can read this file a build a Config class that can interpret their section of the pipeline config better. We would not need the config dsl at all.

andrewvc commented 9 years ago

@guyboertje +1 to the idea of JSON.

I think we need to make this generated though. I really like the idea of a plugin pulling in another plugin's options for the mixin case. I think this is a powerful tool that I actually could find immediate use for.

Perhaps there should be a required method #config_spec that outputs this? We could have a build process generate the JSON list you mentioned by executing that.

ph commented 9 years ago

Oh +1 json

andrewvc commented 9 years ago

@guyboertje I'm still a fan of a DSL, but I think the DSLs should generate JSON as a common format to solve the dependent configuration issue (esp. across JVM langs). Additionally, that way each language can come up with its own config DSL. Also, being able to return the config DSL as a value could have value for a future Logstash gui (easily pull up docs on a plugin). Doing that with packaged JSON files is sometimes weird depending on the language in question (java resources are irritating to deal with).

colinsurprenant commented 9 years ago

I know @andrewvc mentioned it but I want to stress that whichever new strategy we look at must take into account the potential new plugins written in any JVM languages in the future.

guyboertje commented 9 years ago

@andrewvc @colinsurprenant - when a plugin is written in another JVM language it will inherit from a Java version of e.g. LogStash.Filters.Base which will provide the methods to validate/declare config.

The other JVM lang plugin author need to define an equivalent call to config :rename, :validate => :hash in their plugin class.

The argument FOR a DSL is that the reader can see at-a-glance what config is declared. ActiveRecord has no DSL, for the table fields and its hard to see what fields are in a model. For this there is the annotate gem which adds a structured comment to an AR model that contain a representation for each field definition.

We could do the same for the plugin source if we read a conf_def.json file, we would just need a Visitor to decorate the comment for Scala/Closure/Groovy/Jython etc.

andrewvc commented 9 years ago

@guyboertje what are your thoughts about my concern re: composable configurations. One plugin should be able to use a mixin's configuration on a key of its own, as in my HTTP client example in the first issue of this ticket.

I agree with you about JSON, but I think it should be generated off a plugin's config rather than hand coded.

guyboertje commented 9 years ago

@andrewvc - there is no clean way to represent included indirect JSON . For instance, we don't want / can't use the XML Entities idea.

And this sort of thing is very problematic.

{
  "name": "http_zipper",
  "config": [
    {
      "name": "foo",
      "default": "bar",
      "doc": "Foo option [default: 'bar']"
    },
    {
      "name": "http_client_a",
      "nest": "some/path_or_uri/to/http_client/config.json",
    }
  ]
}

So it looks like file -> Config is not the way to go. However, it must be remembered that an other JVM language plugin will need to interact with Plugin Base to declare, use and serialize configs.

andrewvc commented 9 years ago

@guyboertje I'm totally with you, I think we're on the same page. What I'm arguing for is a DSL that can generate a fully compiled config JSON to package with the gem/jar/whatever. In fact it could just be part of a generic plugin metadata file logstash-plugin.json or the like. Generating this would be the responsibility of the build tool.

As we move toward multiple language support standardizing on something like that seems like the only logical way to make doc generation and interop portable.

guyboertje commented 9 years ago

@andrewvc - we concur about chicken and egg, Config (DSL) and JSON file.

guyboertje commented 9 years ago

An off-the-wall idea that I had concerned the generated JSON file and the AST, to what degree would it be possible to use the generated JSON to pre-build 'maximal' AST subtrees for any plugin declared in the LS config and use the config to prune unused sections based on the declared config?

Or put another way. Will the generated JSON be of any use when building the AST?

Can a plugin be asked to add its AST subtree, when given a Node and its part of the config?