spring-cloud / spring-cloud-netflix

Integration with Netflix OSS components
http://cloud.spring.io/spring-cloud-netflix/
Apache License 2.0
4.87k stars 2.44k forks source link

Read Timeout on Request #321

Closed aljacinto closed 9 years ago

aljacinto commented 9 years ago

I'm getting read timeout for a service that is taking a long time (error below). No problem with other service in the same machine, I can see result just the service that takes a long time. I set the ribbon readtimeout, no changes. It seems to be set at 15 seconds. Here's my application.yaml

ribbon:
  ReadTimeout: 30000
  ConnectTimeout: 30000
zuul:
  routes:
    myservice:
      path: /service/**
      url: http://ip_here:8081/service

If I want it per service, do I put it below myservice? I tried that as well, no luck.

java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
        at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
        at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:219)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:712)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:517)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:827)
        at org.springframework.cloud.netflix.zuul.filters.route.SimpleHostRoutingFilter.forwardRequest(SimpleHostRoutingFilter.java:262)
        at org.springframework.cloud.netflix.zuul.filters.route.SitRoutingFilter.forward(SimpleHostRoutingFilter.java:225)
        at org.springframework.cloud.netflix.zuul.filters.route.SimpleHostRoutingFilter.run(SimpleHostRoutingFilter.java:177)
        at com.netflix.zuul.ZuulFilter.runFilter(ZuulFilter.java:112)
        at com.netflix.zuul.FilterProcessor.processZuulFilter(FilterProcessor.java:197)
        at com.netflix.zuul.FilterProcessor.runFilters(FilterProcessor.java:161)
        at com.netflix.zuul.FilterProcessor.route(FilterProcessor.java:120)
        at com.netflix.zuul.ZuulRunner.route(ZuulRunner.java:84)
        at com.netflix.zuul.http.ZuulServlet.route(ZuulServlet.java:111)
        at com.netflix.zuul.http.ZuulServlet.service(ZuulServlet.java:77)

Thanks in advance.

aljacinto commented 9 years ago

Ok, read more, suppose to be like this for it to use ribbon:

 zuul:
  routes:
    users:
      path: /myusers/**
      serviceId: users

 users:
  ribbon:
    listOfServers: example.com,google.com

Thanks, getting Hystrix error however

aljacinto commented 9 years ago

Doesn't look like the ReadTimeout is taking effect, am I doing it right? Again, I can hit the fast service, error just occurs on slow service. Thanks

  routes:
    myservice:
      path: /service/**
      serviceId: myservice
      stripPrefix: false

myservice:
  ribbon:
    ReadTimeout: 60000
    ConnectTimeout: 60000
    listOfServers: localhost:8081

Getting this error:

com.netflix.zuul.exception.ZuulException: Forwarding error
        at org.springframework.cloud.netflix.zuul.filters.route.RibbonRoutingFilter.forward(RibbonRoutingFilter.java:143)
        at org.springframework.cloud.netflix.zuul.filters.route.RibbonRoutingFilter.run(RibbonRoutingFilter.java:107)
        at com.netflix.zuul.ZuulFilter.runFilter(ZuulFilter.java:112)
        at com.netflix.zuul.FilterProcessor.processZuulFilter(FilterProcessor.java:197)
        at com.netflix.zuul.FilterProcessor.runFilters(FilterProcessor.java:161)
        at com.netflix.zuul.FilterProcessor.route(FilterProcessor.java:120)
        at com.netflix.zuul.ZuulRunner.route(ZuulRunner.java:84)
        at com.netflix.zuul.http.ZuulServlet.route(ZuulServlet.java:111)
        at com.netflix.zuul.http.ZuulServlet.service(ZuulServlet.java:77)
        at org.springframework.web.servlet.mvc.ServletWrappingController.handleRequestInternal(ServletWrappingController.java:158)
        at org.springframework.cloud.netflix.zuul.web.ZuulController.handleRequestInternal(ZuulController.java:43)
        at org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:146)
        at org.springframework.web.servlet.mvc.SimpleControllerHandlerAdapter.handle(SimpleControllerHandlerAdapter.java:50)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:959)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:893)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:966)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:857)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:618)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:842)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:725)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.springframework.boot.actuate.autoconfigure.EndpointWebMvcAutoConfiguration$ApplicationContextHeaderFilter.doFilterInternal(EndpointWebMvcAutoConfiguration.java:291)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:77)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.springframework.boot.actuate.trace.WebRequestTraceFilter.doFilterInternal(WebRequestTraceFilter.java:102)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:85)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterava:206)
        at org.springframework.boot.actuate.autoconfigure.MetricFilterAutoConfiguration$MetricsFilter.doFilterInternal(MetricFilterAutoConfiguration.java:90)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:501)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:142)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:516)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1086)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:659)
        at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:223)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1558)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1515)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.netflix.hystrix.exception.HystrixRuntimeException: myserviceRibbonCommand timed-out and no fallback available.
        at com.netflix.hystrix.AbstractCommand$16.call(AbstractCommand82)va:7
        at com.netflix.hystrix.AbstractCommand$16.call(AbstractCommand.java:769)
        at rx.internal.operators.OperatorOnErrorResumeNextViaFunction$1.onError(OperatorOnErrorResumeNextViaFunction.java:76)
        at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:70)
        at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:70)
        at rx.internal.operators.OperatorDoOnEach$1.onError(OperatorDoOnEach.java:70)
        at com.netflix.hystrix.AbstractCommand$DeprecatedOnFallbackHookApplication$1.onError(AbstractCommand.java:1444)
        at com.netflix.hystrix.AbstractCommand$FallbackHookApplication$1.onError(AbstractCommand.java:1334)
        at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:314)
        at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:306)
        at rx.Observable$2.call(Observable.java:173)
        at rx.Observable$2.call(Observable.java:166)
        at rx.Observable$2.call(Observable.java:173)
        at rx.Observable$2.call(Observable.java:166)
        at rx.Observable$2.call(Observable.ja
brenuart commented 9 years ago

Don't forget Zuul is using Hystric underneath - so you may well have a Hystrix timeout instead (1000ms by default). You can either disable the timeout feature of Hystrix or increase its value:

# Disable Hystrix timeout globally (for all services)
hystrix.command.default.execution.timeout.enabled: false

# Disable Hystrix timeout for a single service
hystrix.command.<serviceName>.execution.timeout.enabled: false

# Increase the Hystrix timeout to 60s (globally)
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds: 60000

# Increase the Hystrix timeout to 60s (per service)
hystrix.command.<serviceName>.execution.isolation.thread.timeoutInMilliseconds: 60000

Be careful however with the last two properties as they apply to the THREAD isolation strategy. Although the Netflix documentation says it is the default, I noticed SpringCloud version of Zuul defaults to SEMAPHORE. See https://github.com/Netflix/Hystrix/wiki/Configuration for more information.

aljacinto commented 9 years ago

Works, thanks brenuart, appreciate it. When does that ReadTimeout from Ribbon takes effect? Also, it looks like I can't flop that in the service section similar to the ribbon because the ribbon starts with client.ribbon.... where this one starts with hystrix.command.

Thanks again.

brenuart commented 9 years ago

The ReadTimeout is targeted at the the underlying RestClient that makes the actual call to the service. So, when does it take effect? Don't forget Ribbon is there to perform routing, failover and load balancing - so a single client request may lead to multiple attempts to contact a service. Stated differently, if you have two instances of a service and one is not reachable, you end up with two RestClient calls within a single Hystrix command. A bit like multiple database queries (RestClient calls) within the same transaction (HystricCommand). The Hystrix timeout is like the transaction timeout, the RestClient ReadTimeout is like the query timeout... (hope it is clear)

As far as I can tell, you should be able to override it by adding the following to your Zuul configuration:

# Global - affects every Ribbon instances (not only your services but also Eureka client!)
ribbon.ReadTimeout = 6000

# Per service
<clientName>.ribbon.ReadTimeout = 60000
aljacinto commented 9 years ago

Got it, if my hystrix timeout is less than the ribbon timeout then ribbon timeout does not matter. I can override the ribbon Timeout with no issue. The Hystrix, can that be done under the service?? For example I have

myservice:
  ribbon:
    ReadTimeout: 60000
    ConnectTimeout: 60000
    [Can I put something here for hystrix???]

BTW, I couldn't get the second one to work, I had to set it globally.

This works:

hystrix.command.default.execution.timeout.enabled: false

but this doesn't:

hystrix.command.myservice.execution.timeout.enabled: false
spencergibb commented 9 years ago

It would be hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds or hystrix.command. myservice.execution.isolation.thread.timeoutInMilliseconds

brenuart commented 9 years ago

If your issue is with Hystrix within Zuul - as it seems to be the case - you should be aware that Hystrix execution isolation is set to SEMAPHORE instead of THREAD which is the default everywhere else Hystrix is used.

Consequence is the hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds property proposed by @spencergibb won't do anything since it is targeted to the THREAD isolation.

See https://github.com/Netflix/Hystrix/wiki/Configuration#execution.isolation.strategy for more information.

aljacinto commented 9 years ago

My issue is disabling of timeout for a specific service does not work. I had to go with the global one.

hystrix.command.myservice.execution.timeout.enabled: false
brenuart commented 9 years ago

You are right - the above advices are not 100% correct... In Zuul, the Hystrix property names should be built according to the following pattern:

# For a default value (i.e. applies to every services unless otherwise specified)
hystrix.command.default.<property name> = <value>

# For a per-service configuration
hystrix.command.<service name>RibbonCommand.<property name> = <value>

Note the RibbonCommand after . This is rather confusing I agree, and is not in line with Netflix's documentation...

brenuart commented 9 years ago

See https://github.com/spring-cloud/spring-cloud-netflix/issues/329 for a follow up.

ljramones commented 9 years ago

Interesting, I moved my zuul server application from 1.0.0 to 1.0.1 and I ran into this exact issue. Suddenly requests passing through to other services were timing out with exceptions if the request lasted longer that a few seconds. Luckily a search showed this page so we could do the release after I added the appropriate flags to the application.yml. Just to verify, I moved the parent pom back to 1.0.0 and the timeout exceptions stopped. I guess there was some fairly major surgery underneath

spencergibb commented 9 years ago

@larry13767 we switched to use the ribbon http client which would explain the change in defaults.

ajanitin commented 9 years ago

Set appropriate timeout period of the property zuul.host.socket-timeout-millis = XXXXX

petercartagena commented 9 years ago

Thanks @ajanitin zuul.host.connect-timeout-millis: xxx zuul.host.socket-timeout-millis: xxx

This works for me.

kaszuster commented 9 years ago

The same error happened to me randomly. I had to wait some time and trigger the request again or restart the service completely to reproduce this behavior.

Changing the configuration to:

zuul:
    host:
        connect-timeout-millis: 10000
        socket-timeout-millis: 60000

hystrix:
    command:
        default:
            execution:
                isolation:
                    thread:
                        timeoutInMilliseconds: 60000

Ends up in some calls having a "delay" of about 2 seconds, but no forwarding error is popping up.

codependent commented 8 years ago

@brenuart I have been trying to configure the timeout per service with:

hystrix:
  command:
    random-story-microserviceRibbonCommand:
      execution:
        isolation:
          strategy: THREAD
          thread:
            timeoutInMilliseconds: 20000```

However it doesn't work. Debugging in RestClientRibbonCommand's constructor:

public RestClientRibbonCommand(String commandKey, RestClient restClient, Verb verb, String uri,
                                   Boolean retryable,
                                   MultiValueMap<String, String> headers,
                                   MultiValueMap<String, String> params, InputStream requestEntity)
            throws URISyntaxException {
        super(getSetter(commandKey));

super(getSetter(commandKey)) creates the HystrixCommand with the parameters obtained from getSetter(commandKey)). As you can see below, aparently it never considers the properties configured per service, it will always be a semaphore:

protected static HystrixCommand.Setter getSetter(String commandKey) {
        // we want to default to semaphore-isolation since this wraps
        // 2 others commands that are already thread isolated
        String name = ZuulConstants.ZUUL_EUREKA + commandKey + ".semaphore.maxSemaphores";
        DynamicIntProperty value = DynamicPropertyFactory.getInstance().getIntProperty(
                name, 100);
        HystrixCommandProperties.Setter setter = HystrixCommandProperties.Setter()
                .withExecutionIsolationStrategy(ExecutionIsolationStrategy.SEMAPHORE)
                .withExecutionIsolationSemaphoreMaxConcurrentRequests(value.get());
        return Setter.withGroupKey(HystrixCommandGroupKey.Factory.asKey("RibbonCommand"))
                .andCommandKey(HystrixCommandKey.Factory.asKey(commandKey))
                .andCommandPropertiesDefaults(setter);
    }
brenuart commented 8 years ago

Yes, it will always be semaphore – despites the configuration you use (hardcoded in code I’m afraid). See http://stackoverflow.com/questions/29965817/why-is-zuul-forcing-a-semaphore-isolation-to-execute-its-hystrix-commands for more information and potential workaround.

agupta1989 commented 8 years ago

@spencergibb and @brenuart Thanks to both of you. Clear explaination !

The ReadTimeout is targeted at the the underlying RestClient....

this explaination equally good :+1: :)

ilayze commented 8 years ago

Hi, I tried to use: zuul.host.socket-timeout-millis: 100 zuul.host.connect-timeout-millis: 100

I am expecting that zuul will timeout after 100 miliseconds but this is not happening. Is there extra configuration for it to work?

ryanjbaxter commented 7 years ago

Please open a new issue if you are having a problem. Just because a property does not come up in the auto complete list in STS does not mean it is not valid.

lite2073 commented 7 years ago

I know this is a closed issue, but I'm not sure where this comment should go. It would be nice to document the additional zuul properties such as the timeout settings used by SimpleHostRoutingFilter. It took me a while to figure out why ribbon.ReadTimeout and ribbon.ConnectTimeout had no effect on zuul request timeouts. Without documentation, it's hard to know which native Netflix properties are used to adjust the default behavior of Spring Cloud Netflix components.

ryanjbaxter commented 7 years ago

Like I said above if there is documentation you want improved please open a new issue for that and we can take a look at it. Also pull requests are always welcome.

stiyyagura commented 7 years ago

@spencergibb @ryanjbaxter hystrix.command.default.execution.timeout.enabled=false this configuration is not working as i recently started using Ribbon to leverage RibbonRoutingFilter to avoid the issue i am facing with MIME types #1663 but now i started seeing timeout/forward failed issue. any help appreciated.

stiyyagura commented 7 years ago

@spencergibb @ryanjbaxter Apologies this was an internal issue.

coxjonc commented 7 years ago

I know this issue is closed, but for any people running into this problem for the first time the solutions above didn't work for me, but setting zuul.host.connect-timeout-millis and zuul.host.socket-timeout-millis did the trick. See this issue discussion for details

hodeh commented 7 years ago

the only way I got it working is by disabling hystrix and setting those values, setting those properties from .propterites files did not work for me I had to access the com.netflix.config.ConfigurationManager

    @PostConstruct
    void disableHystrix() {

        ConfigurationManager.getConfigInstance().setProperty("hystrix.command.default.circuitBreaker.enabled", false);
        ConfigurationManager.getConfigInstance().setProperty("hystrix.command.default.execution.timeout.enabled", false);
        ConfigurationManager.getConfigInstance().setProperty("hystrix.command.default.execution.isolation.thread.interruptOnTimeout", false);
        ConfigurationManager.getConfigInstance().setProperty("hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds", apiGatewayConfigs.getGlobalTimeout());
        ConfigurationManager.getConfigInstance().setProperty("ribbon.ReadTimeout", apiGatewayConfigs.getGlobalTimeout());

    }
pm0202 commented 7 years ago

Thank you hodeh! We tried several settings as recommended but this is the only way it worked for me.

Gopoop commented 6 years ago

server: port: 8050

spring: application: name: web-gateway-zuul

eureka: client: serviceUrl: defaultZone: http://discovery:8761/eureka/

defaultZone: http://192.168.10.123:8761/eureka/

instance: hostname: gateway perferIpAddress: true instance-id: ${spring.cloud.client.ipAddress}:${server.port}

zuul: routes: sys: path: /base/** service-id: cof-ms-base

resolve the service time-out exception

hystrix: command: default: execution: isolation: thread: timeoutInMilliseconds: 20000 # Increase the Hystrix timeout to 60s (globally)

ribbon: ReadTimeout: 20000 ConnectTimeout: 20000

leiy88 commented 6 years ago

I make it out. To make zuul & ribbon timeout to work, you shoule set hystrix.command.default.execution.timeout.enable to false. WTF!

hystrix:
  command:
    default:
      execution:
        timeout:
          enabled: false

zuul:
  host:
    socket-timeout-millis: 5000
    connect-timeout-millis: 5000

ribbon:
    ConnectTimeout: 5000
    ReadTimeout: 5000
soyoeml commented 6 years ago

The upstairs doesn't work for me. Which yml file to modify? The microservice yml file or gateway project?

alexis-globant commented 6 years ago

Only adding this configuration works for me:

ribbon:
  ReadTimeout: 20000
  ConnectTimeout: 20000