Closed: yasincyx closed this issue 3 months ago
Hi, here is my configuration:
```yaml
inputs:
  - Stdin: {}

filters:
  - Grok:
      src: message
      pattern_paths:
        - '/opt/gohangout/grokpattern'
      match:
        - '%{DATA:TIMESTAMP} %{DATA} %{DATA:source_host} %{DATA:PATH} %{GREEDYDATA:LOGHUB_USERLOG}'
      failTag: message_grokfail
      remove_fields: ['message']
  - Grok:
      src: LOGHUB_USERLOG
      pattern_paths:
        - '/opt/gohangout/grokpattern'
      match:
        - '\s*%{DATA} %{DATA} %{DATA} %{DATA:PATH} %{GREEDYDATA:json_data}'
        - '\s*%{GREEDYDATA:json_data}'
      failTag: LOGHUB_USERLOG_grokfail
      remove_fields: ['LOGHUB_USERLOG']
  - Json:
      field: json_data
      remove_fields: ['json_data']
  - Json:
      field: log
      remove_fields: ['json_data']
  - Date:
      location: 'Asia/Shanghai'
      src: 'start_time'
      target: '@timestamp'
      formats:
        - '2006-01-02T15:04:05.999+0000'
      failTag: start_time_parsefail
  - Date:
      location: 'Asia/Shanghai'
      src: 'time'
      target: '@timestamp'
      formats:
        - '2006-01-02T15:04:05.99999999Z'
      failTag: dateparsefail

outputs:
  - Stdout: {}
```
The source data is:

```
1715420594643 75 h72-pp /var/log/containers/uw5ccfe9598bd2906f42975e4.log {"log":" \"TraceContextHeaderName\": \"trace-id\"\n","stream":"stdout","time":"2024-05-11T09:43:08.892780189Z"}
```
After parsing, the output is:

```
{"@timestamp":"2024-05-13T15:20:50.4024003+08:00","PATH":"/var/log/containers/uw5ccfe9598bd2906f42975e4.log","TIMESTAMP":"1715420594643","json_data":" \\"TraceContextHeaderName\\": \\"trace-id\\"\n\",\"stream\":\"stdout\",\"time\":\"2024-05-11T09:43:08.892780189Z\"}","source_host":"h72-pp","tags":["start_time_parsefail","dateparsefail"]}
```
The expectation is to skip the malformed part of the `log` field and parse everything else as usual, but this does not seem achievable with the Json plugin configured above.
With Logstash, however, it works:

```
{
    "source_host" => "h72-pp",
         "stream" => "stdout",
      "TIMESTAMP" => "1715420594643",
          "event" => {
        "original" => "1715420594643 75 h72-pp /var/log/containers/uw5ccfe9598bd2906f42975e4.log {\"log\":\" \\"TraceContextHeaderName\\": \\"ntes-trace-id\\"\n\",\"stream\":\"stdout\",\"time\":\"2024-05-11T09:43:08.892780189Z\"}"
    },
     "@timestamp" => 2024-05-11T09:43:08.892Z,
           "time" => "2024-05-11T09:43:08.892780189Z",
       "@version" => "1",
           "host" => {
        "name" => "GIH-D-26809"
    },
            "log" => " \"TraceContextHeaderName\": \"trace-id\"\n",
           "PATH" => "/var/log/containers/uw5ccfe9598bd2906f42975e4.log"
}
```

The relevant part of the Logstash pipeline is:

```
json {
    source => "json_data"
    skip_on_invalid_json => true
    remove_field => [ "json_data" ]
}
json {
    source => "log"
    skip_on_invalid_json => true
    remove_field => [ "log" ]
}
```
Can Gohangout skip invalid JSON the way Logstash's `skip_on_invalid_json` does?
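(Editor's note: Logstash's `skip_on_invalid_json` behaviour referred to above can be sketched roughly as follows. This is an illustrative Python sketch, not Gohangout or Logstash code; the function name and event shape are hypothetical.)

```python
import json

def json_filter_skip_invalid(event, field, fail_tag="jsonparsefail"):
    """Hypothetical sketch of a Logstash-style skip_on_invalid_json filter:
    try to parse event[field]; on failure, tag the event and leave it
    untouched instead of aborting."""
    raw = event.get(field)
    try:
        parsed = json.loads(raw) if isinstance(raw, str) else None
    except json.JSONDecodeError:
        parsed = None
    if not isinstance(parsed, dict):
        # Invalid JSON: keep the original field, just add a failure tag.
        event.setdefault("tags", []).append(fail_tag)
        return event
    del event[field]          # remove_field => [ field ]
    event.update(parsed)      # merge parsed keys into the event
    return event

# Invalid fragment: the event survives and only picks up a tag.
e1 = json_filter_skip_invalid({"json_data": '" not json'}, "json_data")
print(e1["tags"])             # ['jsonparsefail']

# Valid JSON: keys are merged and the source field is removed.
e2 = json_filter_skip_invalid({"log": '{"stream": "stdout"}'}, "log")
print(e2)                     # {'stream': 'stdout'}
```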
The data you fed to Gohangout and to Logstash are actually not the same: what Logstash received was valid JSON, while what Gohangout received was not. Note the escape characters before the double quotes.
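(Editor's note: the point about the escape characters can be demonstrated with a small sketch. The field values are taken from the event in this thread; Python is used only for illustration.)

```python
import json

# With the backslash escapes intact (as Logstash received it),
# the line is a valid JSON object and parses fine.
valid = ('{"log":" \\"TraceContextHeaderName\\": \\"trace-id\\"\\n",'
         '"stream":"stdout","time":"2024-05-11T09:43:08.892780189Z"}')
doc = json.loads(valid)
print(doc["stream"])   # stdout

# With the backslashes stripped, the inner quote terminates the "log"
# string early and the document is no longer valid JSON, so a JSON
# filter -- tolerant or not -- has nothing it can extract.
broken = valid.replace('\\"', '"')
try:
    json.loads(broken)
except json.JSONDecodeError:
    print("not valid JSON")
```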
I went back and checked the logs, and found that `original` differs from what I posted in this issue; the escaped parts are different too. It may be an artifact of the issue editor. But you are right: once I feed Gohangout a valid JSON source, it works. Thanks for clearing this up!