Closed nathanleiby closed 8 years ago
cc: @mohit @burnsed
Build failure seems unrelated: Failed to clone repository: 'https://github.com/trink/struct.git'
Pushed an empty commit, and it fixed the build failure.
The main reason we don't allow hostname to be set in SandboxFilter plugins is as a security measure against dynamically injected filter code. In that scenario, we're allowing possibly untrusted code to run in the pipeline, and we don't want to allow a nefarious actor to be able to spoof other hosts.
I'm up for allowing hostname to be modifiable from non-SandboxManaged filters (i.e. `this.manager == nil`), however. If someone is getting Lua code onto your host's file system and editing your config to use it, you probably have bigger problems than hostnames being spoofed in your Heka message... ;)
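To make the proposed rule concrete, here is a minimal Go sketch of the gating logic. It is not Heka's actual code: the `sandboxFilter` and `sandboxManager` types and the `resolveHostname` method are hypothetical stand-ins, and the only detail taken from the discussion is that a nil `manager` marks a filter that was not dynamically injected.

```go
package main

import "fmt"

// sandboxManager is a hypothetical placeholder for whatever manages
// dynamically injected filters.
type sandboxManager struct{}

// sandboxFilter is a hypothetical stand-in for a sandbox filter plugin.
// Assumption: manager is nil for filters loaded from the static config and
// non-nil for filters that were dynamically injected.
type sandboxFilter struct {
	manager *sandboxManager
}

// resolveHostname picks the Hostname for a message injected by the filter.
// The requested value is honored only when the filter is not sandbox-managed
// and a value was explicitly set; otherwise the pipeline's own hostname is
// used, which matches the existing behavior.
func (f *sandboxFilter) resolveHostname(requested, pipelineHostname string) string {
	if f.manager == nil && requested != "" {
		return requested
	}
	return pipelineHostname
}

func main() {
	static := &sandboxFilter{}
	managed := &sandboxFilter{manager: &sandboxManager{}}

	fmt.Println(static.resolveHostname("web-01", "container-abc123"))  // web-01
	fmt.Println(static.resolveHostname("", "container-abc123"))        // container-abc123
	fmt.Println(managed.resolveHostname("web-01", "container-abc123")) // container-abc123
}
```

With this shape, a dynamically injected filter can never spoof another host, while a filter configured on disk can carry the original Hostname through.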
Also we'll want to add a note to the Features section of the CHANGES.txt changelog for the 0.11 release, and I'm pretty sure there's a place or two in the documentation that would need to be changed to reflect that non-dynamically injected filters allow hostname to be set.
Sorry to drop this PR. Closing since no longer necessary. Thank you!
This change allows the user to set a message's Hostname in a Lua sandbox filter. It maintains the existing behavior, unless the Hostname is explicitly overridden.
The behavior we'd like is to be able to filter messages (sometimes enriching or modifying fields) and then re-inject nearly the same message after it passes through the filter. We need to preserve the Hostname since its value is used directly in downstream Outputs. Ultimately, we need the hostname of the original log in order to properly do metrics/alerts/search/etc.
We're running Heka in a container, and if you use `inject_payload` to create a new message, you can't keep the original message's Hostname. Instead, the new message's Hostname is Heka's container ID.
In our pipeline, we use Hostname to mean "the host the log originated from".
It looks like the expected usage is "Hostname that generated the message"; perhaps you can clarify whether we are using it improperly. For example, with an RsyslogDecoder, the `preserve_hostname` option seems to mean that Hostname should align with the host that created the log, which is exactly how we're using it.
Some core plugins rely on the Hostname of the Heka message; for example, the ElasticSearch plugin pulls directly from this field.
We also use it in many of our custom plugins. (Example)