preaction / ETL-Yertl

ETL With a Shell
http://preaction.me/yertl

yertl command-line script interpreter #148

Open preaction opened 8 years ago

preaction commented 8 years ago

We could have a yertl command line script that works similarly to logstash:

#!/usr/bin/env yertl
use ETL::Yertl 'Script'; # optional, only needed if run via perl, default if run by `yertl` command
input file => '/var/log/httpd.log'; # Tails the file and allows for rotation by default
input zeromq => 'tcp://127.0.0.1:5000'; # Multiple inputs can be specified
input stdin =>; # Find a way to fix needing to quote/=>
filter grok => '%{LOG.HTTP_COMMON}'; # Filters are run sequentially, so grok should likely come first
output sql => driver => 'SQLite', database => 'httpd.db'; # Defaults to insert
output file => '/path/to/file', format => 'yaml'; # Defaults to default Yertl format
output stdout =>; # Find a way to fix needing to quote/=>

preaction commented 8 years ago

The default input should be `default`, and work like Perl's ARGV (read the files named as arguments, or STDIN when there are none). The default output should be `stdout`.
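A minimal sketch of the ARGV-like behaviour, assuming the input just collects lines. The helper name `read_default_input` is hypothetical and not part of ETL::Yertl. It relies on Perl's null filehandle `<>`, which reads the files listed in `@ARGV` in order and falls back to STDIN when `@ARGV` is empty:

```perl
use strict;
use warnings;

# Hypothetical helper: read lines the way Perl's <> (ARGV) does.
# With file names in @args, those files are read in order.
# With no arguments, <> reads from STDIN instead.
sub read_default_input {
    my (@args) = @_;
    local @ARGV = @args;    # <> consumes @ARGV, so localize it
    my @lines;
    while ( my $line = <> ) {
        chomp $line;
        push @lines, $line;
    }
    return @lines;
}
```

Calling `read_default_input(@ARGV)` from a script would then give the "STDIN + arguments" behaviour. Tailing and log rotation are not covered by this sketch.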

filter should take a subref to allow for Perl-based filtering.
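One way the subref form could look, as a sketch only. The `@filters` registry, the `run_filters` helper, and the convention that a filter returns the document to keep it or an empty list to drop it are all assumptions, not existing Yertl behaviour:

```perl
use strict;
use warnings;

my @filters;    # hypothetical registry, run in registration order

# Register a filter. Only code references are handled in this sketch;
# named filters such as grok would need a separate lookup.
sub filter {
    my ($filter) = @_;
    die "this sketch only handles code references\n"
        unless ref $filter eq 'CODE';
    push @filters, $filter;
    return $filter;
}

# Run one document through every filter in order. A filter returns the
# (possibly modified) document, or an empty list to drop it.
sub run_filters {
    my ($doc) = @_;
    for my $f (@filters) {
        ($doc) = $f->($doc);
        return unless defined $doc;
    }
    return $doc;
}

# Example: keep only server errors, then tag each surviving document.
filter sub { my ($doc) = @_; return $doc->{status} >= 500 ? $doc : () };
filter sub { my ($doc) = @_; $doc->{is_error} = 1; return $doc };
```

Because filters run sequentially, the second subref only ever sees documents the first one kept.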

input/output should both accept filehandles. Should that be considered a file even if it's a pipe or a socket?

preaction commented 8 years ago

input/output should all have format options. The default format for output should be `default`. The default for input is trickier: grok requires lines, while everything else requires documents. For consistency, we should probably use `default` as the default input format, but for ease of use we could default to `lines` if grok is the first filter...
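The ease-of-use rule could be sketched like this. The helper name `default_input_format` is hypothetical, and it assumes the script knows its filter names in order by the time the input format is chosen:

```perl
use strict;
use warnings;

# Hypothetical helper: pick the input format when the script gives none.
# $explicit is the format option passed to input (may be undef);
# @filter_names are the registered filter names, in order.
sub default_input_format {
    my ( $explicit, @filter_names ) = @_;
    return $explicit if defined $explicit;    # an explicit format always wins
    return 'lines' if @filter_names && $filter_names[0] eq 'grok';
    return 'default';
}
```

An explicit `format => 'yaml'` on the input would override the grok special case, so the surprise is limited to scripts that say nothing about the format.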

preaction commented 8 years ago

In the future, it'd be nice if input/output could have attached filters. So, the filter command should return something that can be used as an argument to input/output.

preaction commented 8 years ago

It would be nice to do filtering in forked processes for performance maybe...

preaction commented 6 years ago

This should still be done even as we create the new Perl API. The Script API should export three functions: input, output, and transform (which replaces the existing transform function). input registers a new input. output registers an output. transform registers a transform. Once the script is done, the pipelines are constructed:

Once the pipelines are constructed, the loop can be run.
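A rough sketch of the register, construct, run sequence, under several assumptions that are not from the existing code: inputs are subrefs that return the next document (or undef when exhausted), outputs are subrefs that receive a document, every input passes through every transform and fans out to every output, and the helper names `build_pipelines` and `run_loop` are made up for illustration. In a real module the three functions would be exported rather than defined in the script itself:

```perl
use strict;
use warnings;

# Hypothetical registries, filled in as the script runs.
my ( @inputs, @transforms, @outputs );

sub input     { push @inputs,     $_[0]; return }
sub transform { push @transforms, $_[0]; return }
sub output    { push @outputs,    $_[0]; return }

# Assumption: one pipeline per input. Each call to a pipeline moves one
# document through every transform, in order, and on to every output.
sub build_pipelines {
    my @pipelines;
    for my $in (@inputs) {
        push @pipelines, sub {
            my $doc = $in->();
            return 0 unless defined $doc;    # this input is exhausted
            for my $t (@transforms) {
                ($doc) = $t->($doc);
                return 1 unless defined $doc;    # dropped by a transform
            }
            $_->($doc) for @outputs;
            return 1;
        };
    }
    return @pipelines;
}

# Step every live pipeline once per pass until all inputs are exhausted.
sub run_loop {
    my @pipelines = @_;
    while (@pipelines) {
        @pipelines = grep { $_->() } @pipelines;
    }
    return;
}

# Example script body using in-memory data instead of files or sockets.
my @queue = ( { n => 1 }, { n => 2 } );
my @seen;
input     sub { shift @queue };
transform sub { my ($doc) = @_; $doc->{n} *= 10; return $doc };
output    sub { push @seen, $_[0]{n} };

run_loop( build_pipelines() );
```

Deferring construction until the script is done means the order of input, output, and transform calls between the three kinds does not matter, only the order within each kind.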