spring-cloud / spring-cloud-dataflow

A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes
https://dataflow.spring.io
Apache License 2.0

SCDF 2.2.1 on k8s: random connection timeouts to Skipper #3552

Closed eskuai closed 4 years ago

eskuai commented 4 years ago

Description: As a user, I can see that SCDF 2 logs a WARN, including a large stack trace, about a connection timeout from SCDF 2 to Skipper.

Release versions: SCDF 2.2.1, Skipper 2.1.2

Custom apps: Not related to any stream or task.

Steps to reproduce: I just watch the logs, and SCDF shows a WARN including a stack trace for a connection problem to Skipper.

Neither SCDF 2 nor Skipper is restarted on k8s. The dashboard freezes for a few seconds and then starts working again ...

2019-10-15 16:00:15.487  WARN 1 --- [nio-8080-exec-7] o.s.c.d.s.controller.AboutController     : Skipper Server is not accessible
org.springframework.web.client.ResourceAccessException: I/O error on GET request for "http://10.108.23.60/api/about": Connect to 10.108.23.60:80 [/10.108.23.60] failed: Connection timed out (Connection timed out); nested exception is org.apache.http.conn.HttpHostConnectException: Connect to 10.108.23.60:80 [/10.108.23.60] failed: Connection timed out (Connection timed out)
        at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:744)
        at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:670)
        at org.springframework.web.client.RestTemplate.getForObject(RestTemplate.java:311)
        at org.springframework.cloud.skipper.client.DefaultSkipperClient.info(DefaultSkipperClient.java:126)
        at org.springframework.cloud.dataflow.server.stream.SkipperStreamDeployer.environmentInfo(SkipperStreamDeployer.java:519)
        at org.springframework.cloud.dataflow.server.controller.AboutController.getAboutResource(AboutController.java:158)
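
One way to tell whether these timeouts come from the network path rather than from SCDF itself is to probe the Skipper service address repeatedly from the same network as the SCDF pod. The following is only a minimal sketch, assuming Python is available there. `probe` and `watch` are hypothetical helper names, not part of SCDF or Skipper, and the address `10.108.23.60:80` is simply the one that appears in the log above.

```python
import socket
import time


def probe(host: str, port: int, timeout_s: float = 2.0):
    """Try one TCP connect; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, time.monotonic() - start, None
    except OSError as exc:
        return False, time.monotonic() - start, str(exc) or type(exc).__name__


def watch(host: str, port: int, attempts: int = 10,
          interval_s: float = 1.0, timeout_s: float = 2.0):
    """Probe repeatedly and collect only the failed attempts."""
    failures = []
    for i in range(attempts):
        ok, elapsed, err = probe(host, port, timeout_s)
        if not ok:
            # Record attempt index, time spent, and the OS error text.
            failures.append((i, round(elapsed, 3), err))
        if i < attempts - 1:
            time.sleep(interval_s)
    return failures
```

Calling `watch("10.108.23.60", 80)` returns a list of the failed attempts. An empty list during a period in which SCDF still logs timeouts would point away from plain TCP reachability; intermittent entries would suggest the service address itself is unreachable at times.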


2019-10-15 15:59:26.473 ERROR 1 --- [nio-8080-exec-2] o.s.c.d.s.c.RestControllerAdvice         : Caught exception while handling a request
org.springframework.web.client.ResourceAccessException: I/O error on GET request for "http://10.108.23.60/api/release/status/stream-inmark": Connect to 10.108.23.60:80 [/10.108.23.60] failed: Connection timed out (Connection timed out); nested exception is org.apache.http.conn.HttpHostConnectException: Connect to 10.108.23.60:80 [/10.108.23.60] failed: Connection timed out (Connection timed out)
        at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:744)
        at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:691)
        at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:618)
        at org.springframework.cloud.skipper.client.DefaultSkipperClient.status(DefaultSkipperClient.java:137)
        at org.springframework.cloud.dataflow.server.stream.SkipperStreamDeployer.getStreamDeploymentState(SkipperStreamDeployer.java:170)
        at org.springframework.cloud.dataflow.server.stream.SkipperStreamDeployer.streamsStates(SkipperStreamDeployer.java:159)
        at org.springframework.cloud.dataflow.server.service.impl.DefaultStreamService.state(DefaultStreamService.java:324)
        at org.springframework.cloud.dataflow.server.service.impl.DefaultStreamService$$FastClassBySpringCGLIB$$89697014.invoke(<generated>)
        at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
        at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:749)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:295)
        at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
        at org.springframework.cloud.dataflow.server.service.impl.DefaultStreamService$$EnhancerBySpringCGLIB$$929a8749.state(<generated>)
        at org.springframework.cloud.dataflow.server.controller.StreamDefinitionController$Assembler.<init>(StreamDefinitionController.java:191)
        at org.springframework.cloud.dataflow.server.controller.StreamDefinitionController.save(StreamDefinitionController.java:122)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:892)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1039)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:942)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1005)
        at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:908)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:660)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:882)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.boot.actuate.web.trace.servlet.HttpTraceFilter.doFilterInternal(HttpTraceFilter.java:88)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:320)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:119)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:111)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:170)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:200)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:215)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:215)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.oauth2.provider.authentication.OAuth2AuthenticationProcessingFilter.doFilter(OAuth2AuthenticationProcessingFilter.java:176)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:116)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:74)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:105)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:56)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:215)
        at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:178)
        at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:357)
        at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:270)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.security.oauth2.client.filter.OAuth2ClientContextFilter.doFilter(OAuth2ClientContextFilter.java:60)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:92)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:93)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.filterAndRecordMetrics(WebMvcMetricsFilter.java:114)
        at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:104)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:490)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:408)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:853)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1587)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to 10.108.23.60:80 [/10.108.23.60] failed: Connection timed out (Connection timed out)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374)
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at org.springframework.http.client.HttpComponentsClientHttpRequest.executeInternal(HttpComponentsClientHttpRequest.java:87)
        at org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
        at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:53)
        at org.springframework.http.client.InterceptingClientHttpRequest$InterceptingRequestExecution.execute(InterceptingClientHttpRequest.java:108)
        at org.springframework.cloud.common.security.support.OAuth2AccessTokenProvidingClientHttpRequestInterceptor.intercept(OAuth2AccessTokenProvidingClientHttpRequestInterceptor.java:65)
        at org.springframework.http.client.InterceptingClientHttpRequest$InterceptingRequestExecution.execute(InterceptingClientHttpRequest.java:92)
        at org.springframework.boot.actuate.metrics.web.client.MetricsClientHttpRequestInterceptor.intercept(MetricsClientHttpRequestInterceptor.java:65)
        at org.springframework.http.client.InterceptingClientHttpRequest$InterceptingRequestExecution.execute(InterceptingClientHttpRequest.java:92)
        at org.springframework.http.client.InterceptingClientHttpRequest.executeInternal(InterceptingClientHttpRequest.java:76)
        at org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
        at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:53)
        at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:735)
        ... 126 common frames omitted
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
        ... 147 common frames omitted
2019-10-15 16:00:02.712  WARN 1 --- [nio-8080-exec-8] i.f.k.client.internal.VersionUsageUtils  : The client is using resource type 'cronjobs' with unstable version 'v1beta1'
2019-10-15 16:00:15.487  WARN 1 --- [nio-8080-exec-7] o.s.c.d.s.controller.AboutController     : Skipper Server is not accessible
org.springframework.web.client.ResourceAccessException: I/O error on GET request for "http://10.108.23.60/api/about": Connect to 10.108.23.60:80 [/10.108.23.60] failed: Connection timed out (Connection timed out); nested exception is org.apache.http.conn.HttpHostConnectException: Connect to 10.108.23.60:80 [/10.108.23.60] failed: Connection timed out (Connection timed out)
        at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:744)
        at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:670)
        at org.springframework.web.client.RestTemplate.getForObject(RestTemplate.java:311)
        at org.springframework.cloud.skipper.client.DefaultSkipperClient.info(DefaultSkipperClient.java:126)
        at org.springframework.cloud.dataflow.server.stream.SkipperStreamDeployer.environmentInfo(SkipperStreamDeployer.java:519)
        at org.springframework.cloud.dataflow.server.controller.AboutController.getAboutResource(AboutController.java:158)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:892)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1039)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:942)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1005)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:897)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:634)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:882)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:103)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.boot.actuate.web.trace.servlet.HttpTraceFilter.doFilterInternal(HttpTraceFilter.java:88)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:320)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
        at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:119)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:111)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:170)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:200)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:158)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilterInternal(BasicAuthenticationFilter.java:158)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.oauth2.provider.authentication.OAuth2AuthenticationProcessingFilter.doFilter(OAuth2AuthenticationProcessingFilter.java:176)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:116)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:74)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:105)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:56)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:334)
        at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:215)
        at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:178)
        at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:357)
        at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:270)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.security.oauth2.client.filter.OAuth2ClientContextFilter.doFilter(OAuth2ClientContextFilter.java:60)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:92)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:93)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.filterAndRecordMetrics(WebMvcMetricsFilter.java:114)
        at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:104)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:109)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:490)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:408)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:853)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1587)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to 10.108.23.60:80 [/10.108.23.60] failed: Connection timed out (Connection timed out)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374)
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at org.springframework.http.client.HttpComponentsClientHttpRequest.executeInternal(HttpComponentsClientHttpRequest.java:87)
        at org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
        at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:53)
        at org.springframework.http.client.InterceptingClientHttpRequest$InterceptingRequestExecution.execute(InterceptingClientHttpRequest.java:108)
        at org.springframework.cloud.common.security.support.OAuth2AccessTokenProvidingClientHttpRequestInterceptor.intercept(OAuth2AccessTokenProvidingClientHttpRequestInterceptor.java:65)
        at org.springframework.http.client.InterceptingClientHttpRequest$InterceptingRequestExecution.execute(InterceptingClientHttpRequest.java:92)
        at org.springframework.boot.actuate.metrics.web.client.MetricsClientHttpRequestInterceptor.intercept(MetricsClientHttpRequestInterceptor.java:65)
        at org.springframework.http.client.InterceptingClientHttpRequest$InterceptingRequestExecution.execute(InterceptingClientHttpRequest.java:92)
        at org.springframework.http.client.InterceptingClientHttpRequest.executeInternal(InterceptingClientHttpRequest.java:76)
        at org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
        at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:53)
        at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:735)
        ... 114 common frames omitted
Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
        ... 135 common frames omitted

Additional context: Add any other context about the problem here.

sabbyanandan commented 4 years ago

Hello, @eskuai. It looks like you hit an intermittent connectivity issue between SCDF and Skipper, and operation appears to have resumed afterward.

You may want to review Skipper's deployment/pod logs. Specifically, when you kubectl describe .. these resources, you should find hints as to why the Skipper deployment was choking sporadically.
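
As a rough sketch of that review, the helper below describes a pod, dumps its recent logs, and lists recent events in the namespace. The function name `inspect_pod` and the one-hour log window are illustrative assumptions; the pod name has to come from `kubectl get pods` in your own cluster.

```shell
# Hypothetical helper: gather the basic diagnostics for one pod.
# $1 = pod name, $2 = namespace (defaults to "default").
inspect_pod() {
  pod="$1"
  ns="${2:-default}"
  # Pod status, probe settings, restart count, and pod-level events.
  kubectl describe pod "$pod" -n "$ns"
  # Application logs from the last hour (the window is an arbitrary choice).
  kubectl logs "$pod" -n "$ns" --since=1h
  # Namespace events, oldest first, to spot evictions or probe failures.
  kubectl get events -n "$ns" --sort-by=.metadata.creationTimestamp
}
```

Calling `inspect_pod <skipper-pod-name>` and then the same for the Data Flow server pod gives both sides of the connection for the time window in which the WARN appeared.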

eskuai commented 4 years ago

Hello @sabbyanandan,

I am trying to gather some info to show you. Today I hit the same problem 3 times ...

On K8s, no pods restarted, and there is no audit info about the problem and no connection warning ... I can't understand why ...

I restarted scdf and skipper to increase the memory values, but I got another connection timeout ...

scdf2

[root@k8s-master ~]# kubectl describe pod scdf2-data-flow-server-fcdbc78d5-xv6nl
Name:               scdf2-data-flow-server-fcdbc78d5-xv6nl
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               node1-scdf2/10.0.1.121
Start Time:         Tue, 15 Oct 2019 17:15:14 +0200
Labels:             app=spring-cloud-data-flow
                    component=server
                    pod-template-hash=fcdbc78d5
                    release=scdf2
Annotations:        <none>
Status:             Running
IP:                 10.44.0.4
Controlled By:      ReplicaSet/scdf2-data-flow-server-fcdbc78d5
Containers:
  scdf2-data-flow-server:
    Container ID:   docker://b60be2951c49b56428a082589488514acac2c2fc0359a047c2a4db3b8287a668
    Image:          springcloud/spring-cloud-dataflow-server:2.2.1.RELEASE
    Image ID:       docker-pullable://docker.io/springcloud/spring-cloud-dataflow-server@sha256:dd8af6eac46118326172907c08ebd24c8da0f861eb67d333e88001fffb175d62
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 15 Oct 2019 17:15:14 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  3Gi
    Requests:
      cpu:      600m
      memory:   768Mi
    Liveness:   http-get http://:http/management/health delay=150s timeout=50s period=60s #success=1 #failure=50
    Readiness:  http-get http://:http/management/health delay=60s timeout=50s period=15s #success=1 #failure=50
    Environment:
      LOGGING_LEVEL_ROOT:                                INFO
      KUBERNETES_NAMESPACE:                              default (v1:metadata.namespace)
      JAVA_TOOL_OPTIONS:                                 -Duser.timezone=Europe/Madrid  -Djavax.net.ssl.trustStorePassword=cc -Djavax.net.ssl.trustStore=/tmp/scdf2cacerts/cacerts -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:-TieredCompilation -XX:TieredStopAtLevel=1 -XX:+UseCompressedOops -XX:+UseCompressedClassPointers -Xverify:none  -XX:+AggressiveOpts -XX:+UseG1GC -XX:+UseStringDeduplication -Xmx2g
      SERVER_PORT:                                       8080
      SPRING_CLOUD_CONFIG_ENABLED:                       false
      SPRING_CLOUD_DATAFLOW_FEATURES_ANALYTICS_ENABLED:  false
      SPRING_JPA_OPEN_IN_VIEW:                           false
      SPRING_CLOUD_KUBERNETES_SECRETS_ENABLE_API:        true
      SPRING_CLOUD_DATAFLOW_FEATURES_SCHEDULES_ENABLED:  true
      SPRING_CLOUD_KUBERNETES_SECRETS_PATHS:             /etc/secrets
      SPRING_CLOUD_KUBERNETES_CONFIG_NAME:               scdf2-data-flow-server
      SPRING_CLOUD_SKIPPER_CLIENT_SERVER_URI:            http://${SCDF2_DATA_FLOW_SKIPPER_SERVICE_HOST}/api
      SPRING_CLOUD_DATAFLOW_SERVER_URI:                  http://${SCDF2_DATA_FLOW_SERVER_SERVICE_HOST}:${SCDF2_DATA_FLOW_SERVER_SERVICE_PORT}
      SPRING_CLOUD_DATAFLOW_SECURITY_CF_USE_UAA:         true
      SECURITY_OAUTH2_CLIENT_CLIENT_ID:                  dataflow
      SECURITY_OAUTH2_CLIENT_CLIENT_SECRET:              xxxxx
      SECURITY_OAUTH2_CLIENT_ACCESS_TOKEN_URI:           https://uaa-svc:8443/oauth/token
      SECURITY_OAUTH2_CLIENT_USER_AUTHORIZATION_URI:     https://uaa-svc:8443/oauth/authorize
      SECURITY_OAUTH2_RESOURCE_USER_INFO_URI:            https://uaa-svc:8443/userinfo
      SECURITY_OAUTH2_RESOURCE_TOKEN_INFO_URI:           https://uaa-svc:8443/check_token
      SPRING_APPLICATION_JSON:                           { "javax.net.ssl.trustStore": "/tmp/scdf2cacerts/cacerts","javax.net.ssl.trustStorePassword": "cc" , "com.sun.net.ssl.checkRevocation": "false", "maven": { "local-repository": "myLocalrepoMK", "remote-repositories": { "mk-repository": {"url": "http://${NEXUS_SERVICE_HOST}:${NEXUS_SERVICE_PORT}/repository/maven-releases/","auth": {"username": "admin","password": "aa"}},"spring-repo": {"url": "https://repo.spring.io/libs-release","auth": {"username": "","password": ""}},"spring-repo-snapshot": {"url": "https://repo.spring.io/libs-snapshot/","auth": {"username": "","password": ""}}}} }
    Mounts:
      /etc/localtime from tz-config (rw)
      /etc/secrets/database from database (ro)
      /tmp/scdf2cacerts from tmpcacerts (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from scdf2-data-flow-token-q8zgl (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  tmpcacerts:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  scdf2cacerts
    Optional:    false
  tz-config:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/share/zoneinfo/Europe/Madrid
    HostPathType:
  database:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  scdf2-database
    Optional:    false
  scdf2-data-flow-token-q8zgl:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  scdf2-data-flow-token-q8zgl
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                  Message
  ----    ------     ----  ----                  -------
  Normal  Scheduled  50m   default-scheduler     Successfully assigned default/scdf2-data-flow-server-fcdbc78d5-xv6nl to node1-scdf2
  Normal  Pulled     50m   kubelet, node1-scdf2  Container image "springcloud/spring-cloud-dataflow-server:2.2.1.RELEASE" already present on machine
  Normal  Created    50m   kubelet, node1-scdf2  Created container scdf2-data-flow-server
  Normal  Started    50m   kubelet, node1-scdf2  Started container scdf2-data-flow-server
[root@k8s-master ~]#

and skipper

[root@k8s-master ~]# kubectl describe pod scdf2-data-flow-skipper-74677588f6-8qvf9
Name:               scdf2-data-flow-skipper-74677588f6-8qvf9
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               node6-scdf2-glusterfs/10.0.1.94
Start Time:         Tue, 15 Oct 2019 17:15:14 +0200
Labels:             app=spring-cloud-data-flow
                    component=skipper
                    pod-template-hash=74677588f6
                    release=scdf2
Annotations:        <none>
Status:             Running
IP:                 10.40.0.3
Controlled By:      ReplicaSet/scdf2-data-flow-skipper-74677588f6
Containers:
  scdf2-data-flow-skipper:
    Container ID:   docker://79bd5f96d4cb4d2f3329df664ea2427f35c3d67715bce12d4c2f5714b944de0b
    Image:          springcloud/spring-cloud-skipper-server:2.1.2.RELEASE
    Image ID:       docker-pullable://docker.io/springcloud/spring-cloud-skipper-server@sha256:b6ea6f8f38ec0afa03c12313303380aec9ec9a0011e92b162faa7c0a854fcc58
    Port:           7577/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 15 Oct 2019 17:15:19 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  4Gi
    Requests:
      cpu:      600m
      memory:   768Mi
    Liveness:   http-get http://:http/actuator/health delay=120s timeout=60s period=60s #success=1 #failure=3
    Readiness:  http-get http://:http/actuator/health delay=120s timeout=60s period=60s #success=1 #failure=3
    Environment:
      LOGGING_LEVEL_ROOT:                             INFO
      JAVA_TOOL_OPTIONS:                              -Duser.timezone=Europe/Madrid -Djavax.net.ssl.trustStorePassword=cc -Djavax.net.ssl.trustStore=/tmp/scdf2cacerts/cacerts -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap  -XX:-TieredCompilation -XX:TieredStopAtLevel=1 -XX:+UseCompressedOops -XX:+UseCompressedClassPointers -Xverify:none  -XX:+AggressiveOpts -XX:+UseG1GC -XX:+UseStringDeduplication -Xmx2g
      KUBERNETES_NAMESPACE:                           default (v1:metadata.namespace)
      SERVER_PORT:                                    7577
      SPRING_JPA_OPEN_IN_VIEW:                        false
      SPRING_CLOUD_CONFIG_ENABLED:                    false
      SPRING_CLOUD_KUBERNETES_SECRETS_ENABLE_API:     true
      SPRING_CLOUD_KUBERNETES_SECRETS_PATHS:          /etc/secrets
      SPRING_CLOUD_KUBERNETES_CONFIG_NAME:            scdf2-data-flow-skipper
      SPRING_CLOUD_DATAFLOW_SECURITY_CF_USE_UAA:      true
      SECURITY_OAUTH2_CLIENT_CLIENT_ID:               skipper
      SECURITY_OAUTH2_CLIENT_CLIENT_SECRET:           xxxx
      SECURITY_OAUTH2_CLIENT_ACCESS_TOKEN_URI:        https://uaa-svc:8443/oauth/token
      SECURITY_OAUTH2_CLIENT_USER_AUTHORIZATION_URI:  https://uaa-svc:8443/oauth/authorize
      SECURITY_OAUTH2_RESOURCE_USER_INFO_URI:         https://uaa-svc:8443/userinfo
      SECURITY_OAUTH2_RESOURCE_TOKEN_INFO_URI:        https://uaa-svc:8443/check_token
    Mounts:
      /etc/localtime from tz-config (rw)
      /etc/secrets/database from database (ro)
      /tmp/scdf2cacerts from tmpcacerts (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from scdf2-data-flow-token-q8zgl (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  tmpcacerts:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  scdf2cacerts
    Optional:    false
  tz-config:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/share/zoneinfo/Europe/Madrid
    HostPathType:
  database:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  scdf2-database
    Optional:    false
  scdf2-data-flow-token-q8zgl:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  scdf2-data-flow-token-q8zgl
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                            Message
  ----    ------     ----  ----                            -------
  Normal  Scheduled  42m   default-scheduler               Successfully assigned default/scdf2-data-flow-skipper-74677588f6-8qvf9 to node6-scdf2-glusterfs
  Normal  Pulling    42m   kubelet, node6-scdf2-glusterfs  Pulling image "springcloud/spring-cloud-skipper-server:2.1.2.RELEASE"
  Normal  Pulled     42m   kubelet, node6-scdf2-glusterfs  Successfully pulled image "springcloud/spring-cloud-skipper-server:2.1.2.RELEASE"
  Normal  Created    42m   kubelet, node6-scdf2-glusterfs  Created container scdf2-data-flow-skipper
  Normal  Started    42m   kubelet, node6-scdf2-glusterfs  Started container scdf2-data-flow-skipper

The Skipper pod is working OK and logs nothing (even at debug level), while scdf shows the connection timeout error ... There is only one task running, scheduled every minute; nothing else is running ...

I'll keep watching ...

sabbyanandan commented 4 years ago

Thank you for the details, @eskuai. (Aside: please make sure to review and remove any sensitive credentials from the previous comment)

Just curious: how is the K8s cluster's health and its overall resource capacity? Any issues/errors with the nodes on CPU/memory/network? You may want to review your network configurations within the cluster as well. Generally, though, it is hard to reason through what might contribute to the connection timeout since it is specific to your cluster and its network configuration.

To troubleshoot it from a different angle, is this happening on a specific operation in SCDF? If yes, what is it?

eskuai commented 4 years ago

Hi @sabbyanandan

K8s cluster health is OK; there was no problem at any level (network, memory, disk) ... The K8s cluster has a master and 6 nodes [aws ec2 m4.2xlarge], with enough RAM and CPU (32 GB, 8 vCPUs per instance).
We test other projects on it, but its primary use is testing the scdf2 platform ...

scdf and skipper are running on node1 and node2

[root@k8s-master templates]# kubectl describe  node node1-scdf2 | grep -i condition -A 20 | grep Ready -B 20
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Fri, 31 May 2019 07:33:16 +0200   Fri, 31 May 2019 07:33:16 +0200   WeaveIsUp                    Weave pod has set this
  MemoryPressure       False   Tue, 15 Oct 2019 18:49:13 +0200   Wed, 29 May 2019 19:38:31 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 15 Oct 2019 18:49:13 +0200   Wed, 29 May 2019 19:38:31 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 15 Oct 2019 18:49:13 +0200   Wed, 29 May 2019 19:38:31 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 15 Oct 2019 18:49:13 +0200   Wed, 29 May 2019 19:38:51 +0200   KubeletReady                 kubelet is posting ready status
[root@k8s-master templates]# kubectl describe  node node2-scdf2 | grep -i condition -A 20 | grep Ready -B 20
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Fri, 31 May 2019 07:33:13 +0200   Fri, 31 May 2019 07:33:13 +0200   WeaveIsUp                    Weave pod has set this
  MemoryPressure       False   Tue, 15 Oct 2019 18:49:09 +0200   Wed, 29 May 2019 19:45:27 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 15 Oct 2019 18:49:09 +0200   Wed, 29 May 2019 19:45:27 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 15 Oct 2019 18:49:09 +0200   Wed, 29 May 2019 19:45:27 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 15 Oct 2019 18:49:09 +0200   Wed, 29 May 2019 19:45:47 +0200   KubeletReady                 kubelet is posting ready status

I think I should enable node-problem-detector for a few hours, but I don't think it will show me anything. I'll think about it ...

There are a lot of AWS alerts set to warn me at 80% of the capacity limits, and all of them are OK ... nothing about CPU, memory, network ... this is an environment with very, very low usage ...

Today, the environment is fully dedicated to testing scdf tasks: start, stop, destroy, restart, logging, configuration, scheduling, log usage checks, pod disk usage, failed task management, etc., plus the settings we are trying out: tz, connections, database pool size, max task executions, etc.

There is only a single task running on a schedule (/1 * ); it reads k8s secrets and config and prints them ... nothing else ... and the logs show up fine, with datetime, etc ...

sabbyanandan commented 4 years ago

Thanks for the thorough walkthrough. If the primary use of SCDF is just for Tasks, you don't need Skipper in the deployment at all. You can even disable the streaming features cleanly by setting SPRING_CLOUD_DATAFLOW_FEATURES_STREAMS_ENABLED to false. Something to think about, though.
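
One way to try that setting on a running cluster is sketched below. The helper name `disable_streams` is hypothetical, and the Deployment name you pass in (for example `scdf2-data-flow-server`, inferred from the ReplicaSet name in the pod description above) is an assumption about this particular install.

```shell
# Hypothetical helper: turn off the streaming features on a Data Flow server Deployment.
# $1 = Deployment name (an assumption; check `kubectl get deployments`), $2 = namespace.
disable_streams() {
  deployment="$1"
  ns="${2:-default}"
  # `kubectl set env` edits the pod template, which triggers a rolling restart.
  kubectl set env "deployment/$deployment" -n "$ns" \
    SPRING_CLOUD_DATAFLOW_FEATURES_STREAMS_ENABLED=false
}
```

If the server is managed by Helm (the `release=scdf2` label suggests it may be), a change made this way is likely to be overwritten by the next upgrade, so the value would also need to go into the chart configuration.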

I know that doesn't answer or address the connectivity issue that you see in your setup, but I do not have any other ideas. Let's see if @chrisjs has any thoughts to share.

chrisjs commented 4 years ago

intermittent connectivity issues between pods which appear to be scheduled on two different nodes, with no pod restarts due to resource issues, etc., would lead me to believe there's some sort of connectivity issue in your setup. in one of the logs i also saw something with Weave, not sure if this is being used as a networking component or whatnot.

if this, while intermittent, happens often enough that it's likely to be reproduced easily, one could rule out connectivity issues by ensuring both pods are scheduled on the same node via a deployer property, i.e.: https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#configuration-kubernetes-deployer

for example, depending on your needs/requirements, some ideas would be nodeSelector, tolerations, etc. try pegging to the same node, try pegging to other nodes, etc - might help narrow down the problem. remove anything not needed, such as maybe Weave if it's part of your networking stack, etc.

those would likely be the simplest testing approaches to start with.
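
a minimal sketch of the same-node test follows. the deployer properties linked above govern the apps that SCDF deploys; for the server and Skipper pods themselves, one option (an assumption about this setup, not something confirmed in this thread) is to patch a nodeSelector onto their Deployments. the helper name `pin_to_node` is hypothetical, and the `kubernetes.io/hostname` label value usually, but not always, equals the node name.

```shell
# Hypothetical helper: pin a Deployment's pods to one node via a nodeSelector.
# $1 = Deployment name, $2 = value of the node's kubernetes.io/hostname label.
pin_to_node() {
  deployment="$1"
  node="$2"
  patch="{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"kubernetes.io/hostname\":\"$node\"}}}}}"
  # Default strategic merge patch; changing the pod template restarts the pods.
  kubectl patch deployment "$deployment" -p "$patch"
}
```

running it once for each of the two Deployments with the same node value, then checking placement with `kubectl get pods -o wide`, puts both pods behind the same node's networking so that cross-node traffic drops out of the picture.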

eskuai commented 4 years ago

Hi @chrisjs, thank you for the info and the plan ...

  1. I'll try to deploy both in the same node
  2. Then, check "kube-proxy Subtleties: Debugging an Intermittent Connection Reset": https://kubernetes.io/blog/2019/03/29/kube-proxy-subtleties-debugging-an-intermittent-connection-reset/

But why did it never happen with scdf 1.7.3? We are deploying the same apps ... aarrggg. Let's go testing.

eskuai commented 4 years ago

About k8s use,

1) No problem running scdf2 and skipper in the same pod
2) Avoid using firewalld on your worker nodes; disable it if you can
3) If you can't disable it, watch out for SNAT via netfilter
4) With the firewall disabled, watch out for the kernel TCP config
4.1) Check with conntrack and verify the time-outs
4.2) Update the sysctl.conf config

sysctl -w net.ipv4.tcp_fin_timeout=20   
sysctl -w net.ipv4.tcp_tw_reuse=1

5) Apply stable/node-problem-detector for a while to collect logs
6) If using the Weave CNI, check for a big MTU size
7) If using EC2, disable source/dest checking on the instances
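
The `sysctl -w` commands above only last until the next reboot. A minimal sketch of persisting them, and of the conntrack check from step 4.1, could look like this; the helper names and the drop-in file name are assumptions, not something from the original setup.

```shell
# Hypothetical helper: write the two TCP settings to a sysctl drop-in file.
# $1 = target file, e.g. /etc/sysctl.d/99-k8s-tcp.conf (the file name is an assumption).
write_tcp_tuning() {
  conf_file="$1"
  cat > "$conf_file" <<'EOF'
# Shorten the FIN-WAIT-2 timeout and allow reuse of TIME-WAIT sockets
# for new outgoing connections (same values as the sysctl -w commands).
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_tw_reuse = 1
EOF
}

# Hypothetical helper: print conntrack state relevant to dropped connections.
show_conntrack_state() {
  # Per-CPU counters; growing insert_failed or drop values hint at conntrack trouble.
  conntrack -S
  # Current and maximum number of tracked connections.
  sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
}
```

Run as root on each worker node, `write_tcp_tuning /etc/sysctl.d/99-k8s-tcp.conf && sysctl --system` reloads all sysctl configuration files, and `show_conntrack_state` needs the conntrack tool and the nf_conntrack module to be present.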

3 days, no fails

Tx

sabbyanandan commented 4 years ago

@eskuai: Thank you for the update. This would help others if they run into similar issues.