spring-cloud / spring-cloud-netflix

Integration with Netflix OSS components
http://cloud.spring.io/spring-cloud-netflix/
Apache License 2.0

LoadBalanced with RestTemplate error in concurrent environment #2562

Closed ROBOSI closed 6 years ago

ROBOSI commented 6 years ago

In my app, I use RestTemplate to post requests. The code is as follows:

    @LoadBalanced
    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

The REST client base class is:

public class RestOperater {

    @Autowired
    private RestTemplate restTemplate;

    public PostRes post(PostReq req) {
        return restTemplate.postForObject(req.toRestUrlStr(), req.getT(), PostRes.class);
    }
}

I implemented two classes that extend RestOperater, then used LoadRunner to simulate 10 users executing POST requests. After about 1 minute, some errors occurred:

org.springframework.web.client.HttpClientErrorException: 404 null
    at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:63)
    at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:700)
    at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:653)
    at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:613)
    at org.springframework.web.client.RestTemplate.postForObject(RestTemplate.java:380)
    at org.springframework.web.client.RestTemplate$$FastClassBySpringCGLIB$$aa4e9ed0.invoke(<generated>)
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:738)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
    at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
    at org.springframework.cloud.netflix.metrics.RestTemplateUrlTemplateCapturingAspect.captureUrlTemplate(RestTemplateUrlTemplateCapturingAspect.java:33)
    at sun.reflect.GeneratedMethodAccessor492.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)

I then debugged it and found the cause: a POST request (call it A) that should go to ipA:portA/serviceIdA/.. was instead rewritten by @LoadBalanced to ipB:portB/serviceIdA/... The IP and port are wrong, which causes the 404. I would like to know what goes wrong in Ribbon when it runs in a concurrent environment.

spencergibb commented 6 years ago

Please learn how to properly format code and logs.

Without knowing more about your application and setup it is impossible to help. Perhaps you can provide a sample application that reproduces the behavior.

ROBOSI commented 6 years ago

@spencergibb Thanks. This problem is very similar to https://github.com/spring-cloud/spring-cloud-netflix/issues/2025, or may be the same problem. When I use the default Ribbon configuration, should I use @RibbonClient?

spencergibb commented 6 years ago

Yes, you should.

ROBOSI commented 6 years ago

@spencergibb Thanks again. My development environment is configured as follows:

All services are registered with Eureka, and service A calls services B and C over REST. When A calls B and C concurrently under heavy traffic, 404 null sometimes occurs.

My RestTemplate configuration and base REST client code are as follows:

@Configuration
public class RestTemplateConfiguration {

    @LoadBalanced
    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

}

public class RestOperater {

    @Autowired
    private RestTemplate restTemplate;

    public PostRes post(PostReq req) {
        return restTemplate.postForObject(req.toRestUrlStr(), req.getT(), PostRes.class);
    }

    public GetRes get(GetReq req) {
        return restTemplate.getForObject(req.toRestUrlStr(), GetRes.class);
    }

    public String get(String url) {
        return restTemplate.getForObject(url, String.class);
    }

    public <T> T post(String url, String postReq, Class<T> responseType) {
        HttpHeaders headers = new HttpHeaders();
        MediaType type = MediaType.parseMediaType("application/json;charset=UTF-8");
        headers.setContentType(type);
        HttpEntity<String> requestEntity = new HttpEntity<>(postReq, headers);
        return restTemplate.postForObject(url, requestEntity, responseType);
    }
}

I implemented two REST clients that extend RestOperater. While debugging, I hit a breakpoint in org.springframework.cloud.client.loadbalancer.RetryLoadBalancerInterceptor.intercept(), as shown in the following screenshot:

[image]

The request parameter is correct: it should be posted to 'ledger-services-flowcontrol'.

[image]

The retryPolicy is also correct, but after executing the statement at line 46 in the first screenshot:

this.retryTemplate.setRetryPolicy((RetryPolicy)(this.lbProperties.isEnabled() && retryPolicy != null ? new InterceptorRetryPolicy(request, retryPolicy, this.loadBalancer, serviceName) : new NeverRetryPolicy()));

the RetryContext was changed to the wrong target, and the request was posted to 'ledger-services-remote-pdf':

[image]

I think this is a thread-safety problem: the statement at line 46 in the first screenshot is not thread-safe. Is it a bug?

spencergibb commented 6 years ago

No, it's your failure to configure ribbon correctly as was pointed out in #2025. Show me your ribbon configuration.

ROBOSI commented 6 years ago

My bootstrap.yml:


server:
  port: ${port:8017}
  tomcat:
    accept-count: 200
    max-threads: 200
    max-connections: 200
info:
  build:
    artifact: @project.artifactId@
    name: @project.name@
    description: @project.description@
    version: @project.version@

spring:
   sleuth:
      sampler:
        percentage: 1.0
      web:
        skipPattern: "/api-docs.*|/autoconfig|/configprops|/dump|/health|/info|/metrics.*|/mappings|/trace|/swagger.*|.*\\.png|.*\\.css|.*\\.js|.*\\.html|/favicon.ico|/hystrix.stream|/druid.*"
   application:
      name: @project.name@
   profiles:
      active: ${profile:default}
   cloud:
      config:
        name: ${spring.application.name}
        discovery:
          enabled: true
          serviceId: config-server
        profile: ${profile:default}
        fail-fast: true
        retry:
          initial-interval: 10000
          multiplier: 1
          max-interval: 20000
          max-attempts: 6   
eureka:
  client:
    register-with-eureka: true
    fetch-registry: true
  instance:
    lease-renewal-interval-in-seconds: 5
    lease-expiration-duration-in-seconds: 13
    prefer-ip-address: true
management:
  security:
    enabled: false
endpoints:
  health:
    enabled: true
    sensitive: false

---
spring:
  profiles: default
  rabbitmq:
     host: 111.143.128.111
     port: 5672
     username: test1
     password: test1
     virtualHost: /
eureka:
  client:
    serviceUrl:
      defaultZone: http://admin1:admin1@111.143.128.61:8761/eureka/,http://admin1:admin1@111.143.128.41:8761/eureka/
logging:
  file:  ${spring.application.name}.log

---
spring:
  profiles: dev
  rabbitmq:
     host: 111.143.128.111
     port: 5672
     username: test1
     password: test1
     virtualHost: /
eureka:
  client:
    serviceUrl:
      defaultZone: http://admin1:admin1@111.143.128.61:8761/eureka/,http://admin1:admin1@111.143.128.41:8761/eureka/
logging:
  file:  ${spring.application.name}.log

---
spring:
  profiles: press
  rabbitmq:
     host: 111.143.129.151
     port: 5672
     username: admin
     password: admin
     virtualHost: /
eureka:
  client:
    serviceUrl:
      defaultZone: http://admin1:admin1@111.143.129.152:8761/eureka/,http://admin1:admin1@111.143.129.153:8761/eureka/
logging:
  file:  ${spring.application.name}.log

My service's YAML on the config server:

hystrix:
  command:
    flowControlCommand:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 300000
      circuitBreaker:
        requestVolumeThreshold: 10
        errorThresholdPercentage: 90
    pdfCommand:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 300000
      circuitBreaker:
        requestVolumeThreshold: 10
        errorThresholdPercentage: 90
  threadpool:
    flowControlThreadPool:
      coreSize: 200
    pdfThreadPool:
      coreSize: 200

There is no other configuration.

spencergibb commented 6 years ago

Sorry, I wasn't clear enough: I meant the Java @RibbonClient configuration.

ROBOSI commented 6 years ago

@spencergibb I have not used @RibbonClient and have no related configuration.

Service A uses RestTemplate to call services B and C under heavy traffic, so many threads share this RestTemplate instance (it is a bean). Our understanding of what happens when two threads use this RestTemplate to call B and C is as follows:

  1. A first calls B (URL: http://LEDGER-SERVICES-FLOWCONTROL/flowcontrol/flowAdd), and then C (URL: http://LEDGER-SERVICES-REMOTE-PDF/baseService/pdf/ceCalculatedBorrowingRateReqGbk).

  2. While the call to http://LEDGER-SERVICES-FLOWCONTROL/flowcontrol/flowAdd is in progress, another thread also calls RetryLoadBalancerInterceptor.intercept() and sets the RetryPolicy, overwriting the first thread's policy. The first request then goes to http://LEDGER-SERVICES-REMOTE-PDF/flowcontrol/flowAdd, which returns 404 null.
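The interleaving described in the two steps above can be reproduced without Spring. The following is only a minimal sketch of the suspected pattern, not the actual RetryLoadBalancerInterceptor code: `SharedTemplate`, `setTarget`, and `execute` are hypothetical stand-ins for a singleton that stores per-request state in a field. Latches force the ordering so the result is deterministic.

```java
import java.util.concurrent.CountDownLatch;

class SharedPolicyRace {

    // Hypothetical stand-in for a shared singleton that keeps
    // per-request state (the target service) in a mutable field.
    static class SharedTemplate {
        private String targetService;

        void setTarget(String service) {
            this.targetService = service;
        }

        String execute(String path) {
            return "http://" + targetService + path;
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Returns the URL that thread A ends up calling.
    static String run() {
        SharedTemplate shared = new SharedTemplate();
        CountDownLatch aHasSet = new CountDownLatch(1);
        CountDownLatch bHasSet = new CountDownLatch(1);
        String[] urlSeenByA = new String[1];

        Thread a = new Thread(() -> {
            // Step 1: A sets its target, then is "paused" before executing.
            shared.setTarget("LEDGER-SERVICES-FLOWCONTROL");
            aHasSet.countDown();
            awaitQuietly(bHasSet);
            urlSeenByA[0] = shared.execute("/flowcontrol/flowAdd");
        });

        Thread b = new Thread(() -> {
            // Step 2: B overwrites the shared field while A is in flight.
            awaitQuietly(aHasSet);
            shared.setTarget("LEDGER-SERVICES-REMOTE-PDF");
            bHasSet.countDown();
        });

        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return urlSeenByA[0];
    }

    public static void main(String[] args) {
        // Prints http://LEDGER-SERVICES-REMOTE-PDF/flowcontrol/flowAdd
        System.out.println(run());
    }
}
```

Thread A's path ends up on thread B's host, which matches the wrong URL reported in step 2. Under real load the ordering is not forced, so the failure would only appear intermittently.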

Could you please help analyze our investigation, and give us some guidance to fix this issue?

ROBOSI commented 6 years ago

My previous Spring Cloud version was Dalston SR1. In Dalston SR4, this error no longer occurs. Thanks.
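For readers who hit a similar symptom in their own code, one common way to avoid this kind of race is to build the per-request state inside each call instead of storing it on a shared object. The sketch below illustrates that idea only. It does not claim to show what changed between Dalston SR1 and SR4, and `CallContext`, `call`, and `runConcurrently` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerRequestPolicy {

    // Hypothetical immutable per-call context: created for one request
    // and never shared between threads.
    static final class CallContext {
        private final String targetService;

        CallContext(String targetService) {
            this.targetService = targetService;
        }

        String execute(String path) {
            return "http://" + targetService + path;
        }
    }

    // Stateless entry point, safe to share as a singleton.
    static String call(String service, String path) {
        return new CallContext(service).execute(path);
    }

    // Issues interleaved calls to both services from 10 threads and
    // reports whether every call resolved to its own service.
    static boolean runConcurrently(int callsPerService) {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < callsPerService; i++) {
                Callable<Boolean> flow = () ->
                        call("LEDGER-SERVICES-FLOWCONTROL", "/flowcontrol/flowAdd")
                                .equals("http://LEDGER-SERVICES-FLOWCONTROL/flowcontrol/flowAdd");
                Callable<Boolean> pdf = () ->
                        call("LEDGER-SERVICES-REMOTE-PDF", "/baseService/pdf")
                                .equals("http://LEDGER-SERVICES-REMOTE-PDF/baseService/pdf");
                results.add(pool.submit(flow));
                results.add(pool.submit(pdf));
            }
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(1000));
    }
}
```

Because each call owns its context, no thread can observe another thread's target service, whatever the scheduling.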

spencergibb commented 6 years ago

If the problem is fixed in a later version, why reopen?