Closed: marckamerbeek closed this issue 6 years ago.
hi @marckamerbeek
there are two approaches;
Fluent Bit 0.12: this is the current stable version. Its filter_kubernetes can only take the raw log message (without parsing) or parse it when the message comes as a JSON map. Of course this is not ideal, since not all applications generate logs in JSON format.
Fluent Bit 0.13: this is the current development version, which we aim to release on January 31 if everything goes well with testing and QA. The major feature here is that in your application's Pod definition you can "suggest" a parser for the log processor. For example, if you deploy a Pod running an Apache web server, you specify "logging.parser: apache" in the annotations, and the log processor will look up that parser definition and apply it to the log entries of that specific container.
If you want to experiment with #2 and also help us with testing, please refer to the instructions sent to the Google group this week:
https://groups.google.com/forum/?hl=en#!topic/fluent-bit/Va2gsviVaRE
please let me know how that goes.
Wow... that's exactly what we had in mind. We have a story on the backlog to build a new plugin that does exactly that, so this saves us a lot of work! We are not familiar with building plugins in C.
Of course I want to experiment with that. I'll keep you posted.
And what about multi-format log files, like the kubernetes nginx-ingress-controller logs? Those have three different formats in one log file: access logs, Go, and another one. Or am I asking too much now... :-)
> Of course I want to experiment with that. I'll keep you posted.
great :)
> And what about multi-format log files, like the kubernetes nginx-ingress-controller logs? Those have three different formats in one log file.
Oh, I had not thought about that specific use case, which can happen with any multi-container Pod. Likely the solution is to let the Pod suggest multiple parsers and have Fluent Bit apply them. That is not great in terms of performance, but it is something that needs to be solved in some way.
Well, actually it's not a multi-container pod. It's just one container writing all the different formats to stdout. We also have some Python containers doing the same. Fluentd has the multi-format plugin (https://github.com/repeatedly/fluent-plugin-multi-format-parser) which does exactly this, but I can imagine it has some performance cost running through all those regexes.
I discussed the solution you mentioned in #2 with some colleagues. It's nice to have an annotation to define the parser that is needed, but wouldn't it also be nice to let development teams define their own regex, so they can determine their own format? Like adding a regex via an annotation. Or two options: one custom and one predefined.
Like:
apiVersion: v1
kind: Pod
metadata:
  name: apache-logs
  namespace: default
  annotations:
    logging.parser: apache
    logging.parser.regex: /...your regexp pattern.../
spec:
  containers:
    - name: apache-logs
      image: edsiper/apache_logs
      imagePullPolicy: Always
  restartPolicy: Always
Actually, I heard that idea recently from a conference attendee. I initially discarded the approach for performance reasons, but as you said, the benefit of that solution pays off the cost of solving the log-processing problem for multiple formats.
The associated cost is compiling the regular expression when the Fluent Bit filter gathers the metadata; of course this is a one-time procedure, as long as the compiled regex stays in the metadata cache.
I will think more about how to implement it properly.
> Likely the solution is to let the Pod suggest multiple parsers and have Fluent Bit apply them. That is not great in terms of performance, but it is something that needs to be solved in some way.
Can you just allow for annotations at the container level rather than at the pod level?
@derekperkins would you please provide an example of a Pod with annotations at the container level? I cannot find a reference.
@edsiper My bad, I thought they were supported. I think there's an opportunity to do the same thing at the pod annotation level. It's valid to nest json inside of an annotation: https://github.com/kubernetes/kubernetes/issues/12226#issuecomment-173481190
What if the current implementation of a string stayed the same for the common use case, but also supported a json mapping?
logging.parser: >
{
"container1": "apache",
"container2": "nginx",
"container3": "mysql-slow"
...
}
I'm not sure if it would be preferable to do it all with the same logging.parser key or whether to support a separate key like logging.parsers or logging.containerParsers, but that shouldn't impact the usability.
@edsiper any thoughts on that proposal?
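The nested-JSON proposal above could be consumed roughly like this (an illustrative sketch, not actual Fluent Bit code; the string fallback keeps the existing single-parser use case working):

```python
import json

def resolve_parser(annotation_value, container_name):
    """Resolve the parser for one container from the logging.parser
    annotation, which may be a plain string or the proposed JSON map."""
    try:
        mapping = json.loads(annotation_value)
    except ValueError:
        # Plain string: the same parser applies to every container.
        return annotation_value
    return mapping.get(container_name)
```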
In general it would be better to have multiple annotations for each container and to follow a more common annotation style.
Something like this:
fluent-bit.io/parsers/container/apache-app: "apache"
fluent-bit.io/parsers/container/nginx-proxy: "nginx"
fluent-bit.io/parsers/container/dbcontainer: "mysql-slow"
The same thing could leave you open to custom regex parsers through annotations
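The per-container annotation style proposed above could be collected with a simple prefix scan (illustrative sketch only; the prefix is taken from the example keys, and a real implementation would live inside the Fluent Bit filter):

```python
# Extract per-container parser assignments from pod annotations of the
# style "fluent-bit.io/parsers/container/<container-name>: <parser>".
PREFIX = "fluent-bit.io/parsers/container/"

def container_parsers(annotations):
    return {
        key[len(PREFIX):]: value
        for key, value in annotations.items()
        if key.startswith(PREFIX)
    }
```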
@bigkraig I like that syntax a lot
I just tested the dev-13 version with kubernetes, set the annotation, and got the message parsed as expected.
The only thing missing is multiline support; with docker, the message is split into multiple JSON lines.
@shahbour For a java application please try specifying the annotation as "logging.parser: java_multiline" and see if this works.
@StevenACoffman will this work even if we are receiving them as multiple lines from docker?
@shahbour It depends on whether the kubernetes filter parser is operating on a stream, or on each discrete message. I'd be interested in seeing the result.
Since I am using my own regex (custom logging), I changed it to be like java_multiline:
[PARSER]
    Name        java_multiline
    Format      regex
    Regex       /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
    Time_Key    time
    Time_Format %Y-%m-%d %H:%M:%S

[PARSER]
    Name        springboot
    Format      regex
    Regex       /^(?<date>[0-9]+-[0-9]+-[0-9]+\s+[0-9]+:[0-9]+:[0-9]+.[0-9]+)\s+\[(?<user_name>.*)\]\s+(?<log_level>[Aa]lert|ALERT|[Tt]race|TRACE|[Dd]ebug|DEBUG|[Nn]otice|NOTICE|[Ii]nfo|INFO|[Ww]arn?(?:ing)?|WARN?(?:ING)?|[Ee]rr?(?:or)?|ERR?(?:OR)?|[Cc]rit?(?:ical)?|CRIT?(?:ICAL)?|[Ff]atal|FATAL|[Ss]evere|SEVERE|EMERG(?:ENCY)?|[Ee]merg(?:ency)?)\s+(?<pid>[0-9]+)\s+---\s+\[(?<thread>.*)\]\s+(?<class_name>.*)\s+:\s+(?<message>.*)$/
    Time_Key    date
    Time_Format %Y-%m-%d
Still, I got each message by itself on elasticsearch
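The java_multiline regex above can be checked offline; note it matches only the first line of an exception, which is consistent with every stack-trace line arriving as its own record. (Python uses `(?P<name>...)` for named groups, while Fluent Bit's Oniguruma engine accepts `(?<name>...)`; the pattern is otherwise unchanged. The log line is made up for illustration.)

```python
import re

# The java_multiline Regex from above, with Python named-group syntax.
pattern = re.compile(
    r"^(?P<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) "
    r"\[(?P<thread>.*)\] (?P<level>[^\s]+)(?P<message>.*)"
)

# A made-up log line in the format the parser expects.
m = pattern.match("2018-08-14 17:24:58 [main] INFO  Started application")

# A stack-trace continuation line does not match the pattern, so each
# line of an exception becomes its own record:
no_match = pattern.match("\tat com.example.App.main(App.java:10)")
```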
FYI: Multiline inside a JSON key/value is not supported, only at raw text level for now.
I've been struggling with Java multiline in combination with the JSON from docker logs. That JSON needs to be parsed with a JSON parser, but with "Multiline On" in the tail plugin, the JSON parser is omitted.
My solution was to let my Java (Spring Boot) application log in JSON format. Multiline then works out of the box, since everything is wrapped in JSON. To let Spring Boot log in JSON:
Add a logback.xml with the following XML:
<configuration>
<appender name="consoleAppender"
class="ch.qos.logback.core.ConsoleAppender">
<encoder class="net.logstash.logback.encoder.LogstashEncoder">
<fieldNames>
<timestamp>application_timestamp</timestamp>
</fieldNames>
</encoder>
</appender>
<logger name="jsonLogger" additivity="false" level="DEBUG">
<appender-ref ref="consoleAppender" />
</logger>
<root level="INFO">
<appender-ref ref="consoleAppender" />
</root>
</configuration>
And add the following dependency:
<dependency>
<groupId>net.logstash.logback</groupId>
<artifactId>logstash-logback-encoder</artifactId>
<version>4.7</version>
</dependency>
To make sure Kibana handles the line endings of the stack trace correctly, add the "Decode_Field_As escaped log" entry to the docker parser, as in the configuration below.
Configuration:
[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    debug
    Parsers_File parsers_custom.conf

[INPUT]
    Buffer_Chunk_Size 400k
    Buffer_Max_Size   5MB
    DB                /var/log/containers/fluent-bit-online-tst.db
    Mem_Buf_Limit     5MB
    Name              tail
    Parser            docker
    Path              /var/log/containers/*.log
    Refresh_Interval  5
    Tag               kube.spring.*

[FILTER]
    K8S-Logging.Parser Off
    Kube_URL           https://${KUBERNETES_SERVICE_HOST}:443
    Match              kube.*
    Merge_JSON_Log     On
    Name               kubernetes
    tls.verify         Off

[OUTPUT]
    Name            es
    Match           *
    Host            xxx.xxx.xxx.xxx
    Port            9200
    Logstash_Format On
    Logstash_Prefix kubernetes-2-tst
    Include_Tag_Key true
    Retry_Limit     False

[PARSER]
    Decode_Field_As escaped log
    Format          json
    Name            docker
    Time_Format     %Y-%m-%dT%H:%M:%S %z
    Time_Key        time
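For context on the Decode_Field_As entry in the docker parser above: Docker's json-file driver wraps each stdout line in a JSON object whose "log" field is an escaped string, so when the application itself logs JSON, that inner document has to be unescaped and parsed a second time. Roughly (a Python sketch with a made-up log line):

```python
import json

# A line as Docker's json-file driver writes it: the application's own
# JSON document ends up as an escaped string inside the "log" field.
raw = ('{"log":"{\\"level\\":\\"INFO\\",\\"msg\\":\\"started\\"}\\n",'
       '"stream":"stdout","time":"2018-08-14T17:24:58.000Z"}')

outer = json.loads(raw)           # what the docker JSON parser sees
inner = json.loads(outer["log"])  # what decoding the escaped "log" field yields
```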
Thanks for the hint. I just updated logback-spring.xml so it changes based on the environment:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<include resource="org/springframework/boot/logging/logback/defaults.xml" />
<include resource="org/springframework/boot/logging/logback/console-appender.xml" />
<appender name="consoleAppender" class="ch.qos.logback.core.ConsoleAppender">
<encoder class="net.logstash.logback.encoder.LogstashEncoder">
<fieldNames>
<timestamp>application_timestamp</timestamp>
</fieldNames>
</encoder>
</appender>
<root level="INFO">
<springProfile name="!kubernetes">
<appender-ref ref="CONSOLE" />
</springProfile>
<springProfile name="kubernetes">
<appender-ref ref="consoleAppender"/>
</springProfile>
</root>
</configuration>
Haha... well, that's exactly what we discussed implementing here yesterday. That keeps the log human-readable locally. Thanks for the code example!
One more note: I override the timestamp with "application_timestamp" (see the logback XML) due to a duplicate-key exception in Elastic; Fluent Bit adds the time key as well.
@marckamerbeek with Fluent Bit 0.13 (to be released next Monday) you can work around the problem by setting up specific Pod annotations describing the pre-defined parser to be used.
Is there any doc on this awesome feature?
I enabled it for our nginx ingress controller, but it doesn't work.
$ kubectl -n ingress-nginx describe pod nginx-ingress-controller-b878b75bf-cmqzs
Name: nginx-ingress-controller-b878b75bf-cmqzs
Namespace: ingress-nginx
Node: ip-10-10-1-231.us-west-2.compute.internal/10.10.1.231
Start Time: Tue, 14 Aug 2018 18:39:49 +0800
Labels: app=ingress-nginx
pod-template-hash=643463169
Annotations: fluentbit.io/parser=nginx
prometheus.io/port=10254
prometheus.io/scrape=true
I installed fluent-bit with:
$ helm --kube-context $CONTEXT install --namespace $NAMESPACE --name fluent-bit stable/fluent-bit --set backend.type=es,backend.es.host=elasticsearch-client
the installed version is 0.13.0
$ kubectl -n logging describe pod fluent-bit-lwzxp
Name: fluent-bit-lwzxp
Namespace: logging
Node: ip-10-10-1-231.us-west-2.compute.internal/10.10.1.231
Start Time: Tue, 14 Aug 2018 17:24:58 +0800
Labels: app=fluent-bit-fluent-bit
controller-revision-hash=2471733741
pod-template-generation=1
release=fluent-bit
Annotations: checksum/config=d7eceedd9eca0f295eb455f0b1b10f6a3ae253b88513025a0f67613873f92d36
Status: Running
IP: 10.10.1.25
Controlled By: DaemonSet/fluent-bit
Containers:
fluent-bit:
Container ID: docker://34d4294d8b404745f171789adc00846970102af5328aa97ead72415ea7f34f1b
Image: fluent/fluent-bit:0.13.0
Image ID: docker-pullable://fluent/fluent-bit@sha256:313b51885c3524cca9c7fa2c68ea3afce47fd4d22ab7820334cc4cf4764de6f3
Port: <none>
Host Port: <none>
State: Running
Started: Tue, 14 Aug 2018 17:25:02 +0800
Ready: True
Restart Count: 0
Limits:
memory: 100Mi
Requests:
cpu: 100m
memory: 100Mi
Environment: <none>
Thanks.
@edsiper could you help me?
I would be glad to help you, but I'm missing your Fluent Bit config. You have to add the kubernetes filter and turn the parser on:
K8S-Logging.Parser: On
@marckamerbeek thanks for your help.
I turned on the parser, but still no luck. This is my fluent-bit ConfigMap:
apiVersion: v1
data:
  fluent-bit.conf: |-
    [SERVICE]
        Flush        1
        Daemon       Off
        Log_Level    info
        Parsers_File parsers.conf
    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker
        Tag              kube.*
        Refresh_Interval 5
        Mem_Buf_Limit    5MB
        Skip_Long_Lines  On
    [FILTER]
        Name               kubernetes
        Match              kube.*
        Kube_URL           https://kubernetes.default.svc:443
        Kube_CA_File       /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File    /var/run/secrets/kubernetes.io/serviceaccount/token
        K8S-Logging.Parser On
    [OUTPUT]
        Name            es
        Match           *
        Host            elasticsearch-client
        Port            9200
        Logstash_Format On
        Retry_Limit     False
        Type            flb_type
        Logstash_Prefix kubernetes_cluster
  parsers.conf: ""
kind: ConfigMap
metadata:
  creationTimestamp: 2018-08-14T09:24:58Z
  labels:
    app: fluent-bit-fluent-bit
    chart: fluent-bit-0.8.0
    heritage: Tiller
    release: fluent-bit
  name: fluent-bit-fluent-bit-config
  namespace: logging
  resourceVersion: "13114430"
  selfLink: /api/v1/namespaces/logging/configmaps/fluent-bit-fluent-bit-config
  uid: e99029eb-9fa3-11e8-b731-02107822efe8
Any documentation on this would also be appreciated.
You must define a parser named "nginx", as you referenced in your annotation.
Be aware that parsing an nginx log with a regex can be quite expensive. My nginx-ingress-controller logs in JSON, which can be forwarded to Elastic one-to-one. You can configure your nginx to log in JSON.
This is my configuration (from a helm chart, but the values are the same):
inputs:
  - Name: tail
    Path: "/var/log/containers/*.log"
    Exclude_Path: "*kube-system*.log"
    Refresh_Interval: "5"
    Mem_Buf_Limit: "25MB"
    Tag: "kube.services.*"
    Parser: docker
    Buffer_Max_Size: "25MB"
    Buffer_Chunk_Size: "1MB"
    Skip_Long_Lines: "On"
    DB: "/var/log/containers/fluent-bit.db"
  ## Special tail for nginx logging, since *kube-system* is excluded in the tail above (nginx is part of kube-system).
  - Name: tail
    Path: "/var/log/containers/*nginx*.log"
    Refresh_Interval: "5"
    Mem_Buf_Limit: "25MB"
    Tag: "kube.ingress.*"
    Parser: docker
    Buffer_Max_Size: "25MB"
    Buffer_Chunk_Size: "1MB"
    Skip_Long_Lines: "On"
    DB: "/var/log/containers/fluent-bit-nginx.db"

# Filters configuration
# This kubernetes filter is able to pick up an annotation from your pod:
#   spec:
#     template:
#       metadata:
#         annotations:
#           fluentbit.io/parser: "json"
#
# This annotation suggests fluent-bit to use (in this case) the json parser (defined below)
filters:
  - Name: "kubernetes"
    Match: "kube.*"
    tls.verify: "Off"
    Kube_URL: "https://${KUBERNETES_SERVICE_HOST}:443"
    Merge_JSON_Log: "On"
    K8S-Logging.Parser: "On"
    K8S-Logging.exclude: "True"

outputs:
  - Name: "es"
    Match: kube.services.*
    Host: "x.x.x.x"
    Port: "9200"
    Logstash_Format: "On"
    Logstash_Prefix: "k8s-services-prd"
    Logstash_DateFormat: "%G.%V"
    Include_Tag_Key: "true"
    Retry_Limit: "False"
  - Name: "es"
    Match: kube.ingress.*
    Host: "x.x.x.x"
    Port: "9200"
    Logstash_Format: "On"
    Logstash_Prefix: "k8s-ingress-prd"
    Logstash_DateFormat: "%G.%V"
    Include_Tag_Key: "true"
    Retry_Limit: "False"

parsers:
  custom:
    - Name: json
      Format: json
      Time_Key: time
      Time_Format: "%Y-%m-%dT%H:%M:%S.%L"
      Decode_Field_As: "escaped log"
    - Name: docker
      Format: json
      Time_Key: time
      Time_Format: "%Y-%m-%dT%H:%M:%S.%L"
      Time_Keep: On
      Decode_Field_As: "escaped log"
@marckamerbeek thanks for your suggestion, I will try that later. As for K8S-Logging.Parser, I still have a problem. Here's my fluent-bit config:
fluent-bit.conf:
[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf

[INPUT]
    Name             tail
    Path             /var/log/containers/*.log
    Parser           docker
    Tag              kube.*
    Refresh_Interval 5
    Mem_Buf_Limit    5MB
    Skip_Long_Lines  On

[FILTER]
    Name               kubernetes
    Match              kube.*
    Kube_URL           https://kubernetes.default.svc:443
    Kube_CA_File       /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File    /var/run/secrets/kubernetes.io/serviceaccount/token
    # Merge_Log        On
    K8S-Logging.Parser On

[OUTPUT]
    Name            es
    Match           *
    Host            elasticsearch-client
    Port            9200
    Logstash_Format On
    Retry_Limit     False
    Time_Key        @datetime
    Type            flb_type
    Logstash_Prefix kubernetes_cluster
parsers.conf:

[PARSER]
    Name        apache
    Format      regex
    Regex       ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name        apache2
    Format      regex
    Regex       ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name   apache_error
    Format regex
    Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

[PARSER]
    Name        nginx
    Format      regex
    Regex       ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name        json
    Format      json
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    # Command      | Decoder | Field | Optional Action
    # =============|==================|=================
    Decode_Field_As escaped log

[PARSER]
    Name        docker-daemon
    Format      regex
    Regex       time="(?<time>[^ ]*)" level=(?<level>[^ ]*) msg="(?<msg>[^ ].*)"
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On

[PARSER]
    Name        syslog-rfc5424
    Format      regex
    Regex       ^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On

[PARSER]
    Name        syslog-rfc3164-local
    Format      regex
    Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
    Time_Key    time
    Time_Format %b %d %H:%M:%S
    Time_Keep   On

[PARSER]
    Name        syslog-rfc3164
    Format      regex
    Regex       /^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
    Time_Key    time
    Time_Format %b %d %H:%M:%S
    Time_Keep   On

[PARSER]
    Name        mongodb
    Format      regex
    Regex       ^(?<time>[^ ]*)\s+(?<severity>\w)\s+(?<component>[^ ]+)\s+\[(?<context>[^\]]+)]\s+(?<message>.*?) *(?<ms>(\d+))?(:?ms)?$
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    Time_Key    time

[PARSER]
    Name   kube-custom
    Format regex
    Regex  var\.log\.containers\.(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$

[PARSER]
    Name   filter-kube-test
    Format regex
    Regex  .*kubernetes.(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
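As an aside, the kube-custom parser above can be sanity-checked against a tag of the shape the tail input produces (the "kube." prefix plus the log-file path with slashes turned into dots). The tag below is made up for illustration, and Python's `(?P<name>...)` named-group syntax is substituted for Oniguruma's `(?<name>...)`:

```python
import re

# kube-custom Regex from above, with Python named-group syntax.
pattern = re.compile(
    r"var\.log\.containers\."
    r"(?P<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_"
    r"(?P<namespace_name>[^_]+)_"
    r"(?P<container_name>.+)-"
    r"(?P<docker_id>[a-z0-9]{64})\.log$"
)

# A hypothetical tag for the apache-logs pod in the default namespace.
tag = "kube.var.log.containers.apache-logs_default_apache-logs-" + "0" * 64 + ".log"
m = pattern.search(tag)
```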
and here's a sample pod:
apiVersion: v1
kind: Pod
metadata:
  name: apache-logs
  annotations:
    fluentbit.io/parser: apache
  labels:
    app: apache-logs
spec:
  containers:
    - name: apache-logs
      image: edsiper/apache_logs
      imagePullPolicy: Always
  restartPolicy: Always
and no log is parsed as shown from kibana:
{
  "_index": "kubernetes_cluster-2018.08.28",
  "_type": "flb_type",
  "_id": "VRJ0f2UBf6UOjByGBs65",
  "_version": 1,
  "_score": null,
  "_source": {
    "@datetime": "2018-08-28T07:33:54.808Z",
    "log": "245.152.66.117 - - [28/Aug/2018: 7:33:54 +0000] \"GET /carbamate HTTP/1.0\" 204 2216\n",
    "stream": "stdout",
    "time": "2018-08-28T07:33:54.808729434Z",
    "kubernetes": {
      "pod_name": "apache-logs",
      "namespace_name": "default",
      "pod_id": "64eb5068-aa93-11e8-b731-02107822efe8",
      "labels": {
        "app": "apache-logs"
      },
      "annotations": {
        "fluentbit_io/parser": "apache"
      },
      "host": "ip-10-10-1-231.us-west-2.compute.internal",
      "container_name": "apache-logs",
      "docker_id": "7344df5240492d1cedcfc5424d80617c20679003cb9990312b62512e2d333303"
    }
  },
  "fields": {
    "time": [
      "2018-08-28T07:33:54.808Z"
    ],
    "@datetime": [
      "2018-08-28T07:33:54.808Z"
    ]
  },
  "highlight": {
    "kubernetes.labels.app": [
      "@kibana-highlighted-field@apache@/kibana-highlighted-field@-@kibana-highlighted-field@logs@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1535441634808
  ]
}
It's really hard without documentation; I'm about to give up this week.
If I take your regex and the log line from your Kibana output in a regex tester, it's not matching. I checked this on regex101.com.
Have you checked your fluent-bit logging (/var/log/containers) for parsing errors?
Fluent Bit has a default apache parser: https://fluentbit.io/documentation/current/parser/
Can't you use that one first? Or take a very simple regex that at least matches something?
The log field in Kibana is escaped. I can parse the raw log:
214.23.247.34 - - [29/Aug/2018: 5:26:52 +0000] "GET /carpetbags HTTP/1.0" 500 2216
with:
[PARSER]
    Name        apache
    Format      regex
    Regex       ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z
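That match can be reproduced offline with the raw line quoted above (Python's `(?P<name>...)` named-group syntax substituted for Oniguruma's `(?<name>...)`; the pattern is otherwise the same):

```python
import re

# The apache Regex from above, with Python named-group syntax.
pattern = re.compile(
    r'^(?P<host>[^ ]*) [^ ]* (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^\"]*?)(?: +\S*)?)?" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^\"]*)" "(?P<agent>[^\"]*)")?$'
)

line = '214.23.247.34 - - [29/Aug/2018: 5:26:52 +0000] "GET /carpetbags HTTP/1.0" 500 2216'
m = pattern.match(line)
```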
Does your "K8S-Logging.Parser On" work as expected? If so, I may be missing some important config.
Please check the following documentation, if possible try the test case described at bottom:
https://docs.fluentbit.io/manual/filter/kubernetes#kubernetes-annotations
I found a nice solution for multi-line Java exceptions.
1. Convert the newlines in the exception to \u000d (carriage return, U+000D). For a Spring Boot application this can easily be done by setting the logging pattern to:
%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(%5p) %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%replace(%wEx){'\n','\u000d'}%nopex%n
The important and only new part is %replace(%wEx){'\n','\u000d'}%nopex, which replaces each \n in the exception output with \u000d.
For other Java applications, you can follow this link.
2. In the parser, use escaped_utf8 decoding, as below:
[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On
    # Command      | Decoder | Field | Optional Action
    # =============|==================|=================
    Decode_Field_As escaped_utf8 log
Now the exception is logged perfectly in Elasticsearch, with line breaks preserved, as a single entry.
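The end-to-end effect of this approach can be sketched in Python (the stack trace below is made up; Docker's json-file driver emits one JSON record per '\n'-terminated line, which is what the replacement works around):

```python
import json

# A made-up stack trace as the JVM would print it: one message line
# plus one "at ..." line, separated by '\n'.
stack = "java.lang.RuntimeException: boom\n\tat com.example.App.main(App.java:10)"

# Unmodified, the '\n' inside the trace yields one record per line:
raw_records = [l for l in stack.split("\n") if l]

# Step 1 (the logback %replace pattern): swap '\n' for '\u000d' so the
# whole exception is a single line from Docker's point of view.
one_line = stack.replace("\n", "\u000d")
kept_records = [l for l in one_line.split("\n") if l]

# Docker then writes one JSON record; step 2 (Decode_Field_As
# escaped_utf8 log) later restores the embedded control character so
# it renders as a line break in Kibana.
record = json.dumps({"log": one_line + "\n", "stream": "stdout"})
```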
@shahbour Nice job finding this creative solution!
I changed all the Spring Boot logging to JSON, so this won't help me, but it certainly will help other people! Be aware that parsing Java logging with a regex is more CPU expensive.
Multiple formats are already supported (the multiline handling still needs improvement).
Closing this thread as the initial issue is fixed.
@edsiper we are talking here about multiline on docker JSON logs; can you please show us how it is supported?
Hi, we run Kubernetes 1.8.1 on premise and started using Fluent Bit as the log forwarder to Elasticsearch. There are 600+ docker containers running with all kinds of technologies: Java, Spring Boot, Python, NodeJS, etc.
How should my config look? Do I need all kinds of different tail inputs? Or just a *.log tail input that goes through filters with all kinds of different parsers and regexes (performance-wise not a good idea, I think)?
Am I the only one with this question, or do I really have no clue how Fluent Bit should work?