graylog-labs / graylog-contentpack-nginx

A nginx content pack for Graylog
Apache License 2.0
75 stars 32 forks source link

vhosts/servers #6

Open kwisatz opened 7 years ago

kwisatz commented 7 years ago

I'm wondering, am I missing how you could use these with multiple vhosts/servers?

I've modified the log_format to also have $host info, and adapted the extractors accordingly, and I'm willing to do a PR if there's interest in this, but I'm kind of thinking I must have missed something obvious?

jalogisch commented 7 years ago

@kwisatz we would like to receive your PR on this.

Then we can look at it to merge and comment if anything important is missing.

bestis commented 7 years ago

Well it's missing from the instructions at the first place... It should look more like this:

log_format graylog2_format '$remote_addr $host $remote_user [$time_local ] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_ x_forwarded_for" <msec=$msec|connection=$connection|connection_requests=$connect ion_requests|millis=$request_time>';

Then updating the collectors accordingly by having extractor like: { "condition_type": "regex", "condition_value": "^\\S+\\s+nginx:", "converters": [], "cursor_strategy": "copy", "extractor_config": { "regex_value": "nginx:\\s+\\S+\\s+(\\S+)" }, "extractor_type": "regex", "order": 0, "source_field": "message", "target_field": "http_host", "title": "HTTP host" },

Seems to do the trick. But you should also notice that you need to fix the remote user parse, as it's assuming that the host is "-", based your configuration example.

kwisatz commented 7 years ago

Sorry I still haven't gotten around to creating a PR for this. Let me at least start cloning the repo, I'll try to get around to it today.

bestis commented 7 years ago

I'm highly skeptical that the pull request will work correctly when using that vhost format and without using it, as then there is different number of fields and two different "" parts.

For example the message regexp was: "regex_value" : "nginx:.+?\\\"(\\S+.+HTTP\\/\\S+)\\\" \\d+" Not it is: "regex_value": "nginx:.+?\\\"[\\w|\\S]+\\\"\\s\\\"(\\S+.+HTTP\\/\\S+)\\\" \\d+"

Well, if one is using that old format without hosts like on the README, I think the extractors will fail then, at least it should have some "-" added, but then the backwards compatibility issue with people already using this. So adding new field doesn't seem smart in my option.

I know that the second "-" is for remote ident ("de facto standard says so", ref: http://httpd.apache.org/docs/current/logs.html#common), but no-one really uses it. As now it is just "-" in the log format. I would rather see the $host there, so the format would be more similar in both cases and the regexps can be more easily made to support both formats at the same time.

Also the people already using this could just continue use and update without any backwards compatibility issues, only problem would be if they updated the content pack but no nginx config they would get the host to be "-", before updating nginx config. As I don't see reason why not to log the host always, as there is so many others much much more less interesting information already logged.

So for these reasons, I would rather see the solution to be based on my comment: https://github.com/Graylog2/graylog-contentpack-nginx/issues/6#issuecomment-287789121

petestorey26 commented 6 years ago

This seems to have fallen away a bit - I'm trying to fix this same issue and I'd like to do it in a way that can be reused for everyone rather than having to create my own completely, but the above "solution" appears to basically be "replace the remote_user with a host entry" so that the format is still backwardly compatible?

Alternatively the JSON based solution proposed in the pull request comments seems to be a good one and then allows you to add arbitrary fields later if required, without managing extractors - which would mean just having a separate project?

petestorey26 commented 6 years ago

I've created a new one in the marketplace which works using JSON as described somewhere:

https://marketplace.graylog.org/addons/6b867cbe-8f5b-4fc9-84d2-fc1a75a0830d

Works well, you can add arbitrary fields and they just appear. Of course it's not backwards compatible with this one, hence the new project rather than the pull request.