Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

Ability to send system logs into ES index #5925

Open gimmic opened 5 years ago

gimmic commented 5 years ago

I've started to notice signal-to-noise issues with the centralized node logs on the system overview page. Every input change or general Graylog cluster change produces at least 6 pages of repeated log entries, because each node performs the same action. This makes paging through them to find actual information irksome.

Expected Behavior

There should be an option to tee off the graylog system logs to the backend ES in a dedicated index set.

Current Behavior

Graylog system logs are stored in MongoDB. With 35 Graylog nodes and 3 messages per input change (for a deletion, for example: STOPPING / STOPPED / TERMINATED), deleting a single input produces 35 * 3 = 105 log events, resulting in pages and pages of the same event repeating.

Possible Solution

I understand it could be problematic to store system logs ONLY in ES, in case the backend ES cluster is having problems. One approach would be to keep MongoDB as the primary store and also index into a dedicated graylog_system index, allowing you to use the Graylog ecosystem to explore its own logs.
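To make the "MongoDB primary, ES secondary" idea concrete, here is a rough Python sketch of the tee behavior. It is not Graylog code: `InMemorySink` and `tee_event` are hypothetical names, and the sinks are in-memory stand-ins for MongoDB and a dedicated ES index. The only point it illustrates is that a failure of the secondary store should not prevent the event from landing in the primary one.

```python
class InMemorySink:
    """Hypothetical stand-in for a log store (MongoDB or an ES index)."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail      # simulate an unavailable backend
        self.events = []

    def write(self, event):
        if self.fail:
            raise ConnectionError(f"{self.name} is unavailable")
        self.events.append(event)


def tee_event(event, primary, secondary):
    """Write to the primary store, then best-effort to the secondary.

    Returns True if the secondary write succeeded, False otherwise.
    A primary failure is allowed to propagate to the caller.
    """
    primary.write(event)
    try:
        secondary.write(event)
        return True
    except Exception:
        # Assumption: the failure is swallowed rather than logged as a new
        # system event, so a broken ES cluster cannot trigger a feedback loop.
        return False
```

With a failing secondary, `tee_event` returns `False` and the event is still present in the primary sink.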

Another option would be to summarize identical events across the cluster as something like:

2019-05-02T10:31:46-05:00 [35 nodes] Input [Raw/Plaintext TCP/5ccb0d543b3e1e7e1bbeb274] is now STARTING
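As a rough illustration of that summarization, the Python sketch below collapses identical messages from different nodes into one line with a node count. It is not Graylog code: `SystemEvent`, `summarize`, and `demo_events` are hypothetical names, the node IDs are made up, and grouping purely by message text (keeping the first timestamp seen) is an assumption about what "identical" would mean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemEvent:
    timestamp: str   # ISO 8601 string, as in the example line above
    node_id: str     # hypothetical node identifier
    message: str


def summarize(events):
    """Collapse events that share the same message into one summary line."""
    groups = {}  # message -> {"first": timestamp, "nodes": set of node ids}
    for event in events:
        group = groups.setdefault(
            event.message, {"first": event.timestamp, "nodes": set()}
        )
        group["nodes"].add(event.node_id)
    # dicts preserve insertion order, so summaries appear in first-seen order
    return [
        f"{group['first']} [{len(group['nodes'])} nodes] {message}"
        for message, group in groups.items()
    ]


def demo_events():
    """Build the 35 nodes * 3 states = 105 events from the issue description."""
    input_name = "Input [Raw/Plaintext TCP/5ccb0d543b3e1e7e1bbeb274]"
    return [
        SystemEvent("2019-05-02T10:31:46-05:00", f"node-{n}",
                    f"{input_name} is now {state}")
        for state in ("STOPPING", "STOPPED", "TERMINATED")
        for n in range(35)
    ]
```

Calling `summarize(demo_events())` reduces the 105 events to 3 lines, one per state.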

Steps to Reproduce (for bugs)

Make a lot of graylog nodes. Do stuff. Look at the system overview page.

Context

I routinely check for things like index rollover rate, and the easiest place to eyeball that is the system overview page. It might be nice to also be able to use the full Graylog feature set to review its own logs. Long-term storage of Graylog status logs also might not be best done in the configuration MongoDB.

dennisoelkers commented 5 years ago

We have been hesitant to store system/Graylog logs in ES, in order to prevent feedback loops. We are thinking about making the system logs in MongoDB available for analytics in the same way as we do with the ES indices.

gimmic commented 5 years ago

I had considered the feedback loop issue. Directly querying MongoDB does sound like a more streamlined option than duplicating logging events. A downside is that long-term storage may be better suited to the primary logging infrastructure (ES), while keeping MongoDB just for configuration.