Closed tony-clarke-amdocs closed 4 years ago
/cc @violetagg
@tony-clarke-amdocs Any chance for a repro?
@violetagg I will work on one. It's wrapped up in some company specific infrastructure that I need to decouple from to make it standalone.
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.
Will try to get repot once I free up.
Thanks @tony-clarke-amdocs we know you're good for it :-)
Hi @OlgaMaciaszek @violetagg Please see this repot for a way to reproduce the problem. It is tricky to get the test to fail reliably across all hardware. But so far I have been able to get it to fail on Linux, Windows and Macbook...but each platform requires adjusting the load parameters. I have ran it 6 times on my MacBook. 3 times with 0.9.7 and 3 times with 0.9.10. All runs with 0.9.7 are fine but all runs with 0.9.10 caused a timeout on the client side to occur.
@tony-clarke-amdocs Thanks! I'm looking at it.
@tony-clarke-amdocs Will you be able to test the current 0.9.13.BUILD-SNAPSHOT?
@violetagg is there something you see that suggests 0.9.13.BUILD-SNAPSHOT has fixed the bug? BTW you can switch the version yourself, just set the version in gateway/pom.xml
@violetagg Tested with 0.9.13.BUILD-SNAPSHOT and the issue is still there.
The issue in Reactor Netty is https://github.com/reactor/reactor-netty/pull/1370
@tony-clarke-amdocs Do you think you can verify this PR https://github.com/reactor/reactor-netty/pull/1371 ? I do not see anymore issues with that fix.
Tried the same tests as I did in https://github.com/spring-cloud/spring-cloud-gateway/issues/2020 with reactor-netty 0.9.7.RELEASE and 0.9.9.RELEASE the results are merely the same. Hoxton.SR8 still has poor performance compared with Greenwich.SR6
@buhtr
Tried the same tests as I did in #2020 with reactor-netty 0.9.7.RELEASE and 0.9.9.RELEASE the results are merely the same.
So the PR above makes the results comparable with what you have with Reactor Netty 0.9.7/0.9.9?
@violetagg
The load test gives the following benchmark:
So there are no any performance improvements using reactor-netty 0.9.7
The dependencies with reactor-netty 0.9.7
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------< com.example:gateway >-------------------------
[INFO] Building gateway 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-dependency-plugin:3.1.2:tree (default-cli) @ gateway ---
[INFO] com.example:gateway:jar:0.0.1-SNAPSHOT
[INFO] +- org.springframework.cloud:spring-cloud-starter-gateway:jar:2.2.5.RELEASE:compile
[INFO] | +- org.springframework.cloud:spring-cloud-starter:jar:2.2.5.RELEASE:compile
[INFO] | | +- org.springframework.cloud:spring-cloud-context:jar:2.2.5.RELEASE:compile
[INFO] | | | \- org.springframework.security:spring-security-crypto:jar:5.3.5.RELEASE:compile
[INFO] | | +- org.springframework.cloud:spring-cloud-commons:jar:2.2.5.RELEASE:compile
[INFO] | | \- org.springframework.security:spring-security-rsa:jar:1.0.9.RELEASE:compile
[INFO] | | \- org.bouncycastle:bcpkix-jdk15on:jar:1.64:compile
[INFO] | | \- org.bouncycastle:bcprov-jdk15on:jar:1.64:compile
[INFO] | \- org.springframework.cloud:spring-cloud-gateway-core:jar:2.2.5.RELEASE:compile
[INFO] | +- org.springframework.boot:spring-boot-starter-validation:jar:2.3.5.RELEASE:compile
[INFO] | | +- org.glassfish:jakarta.el:jar:3.0.3:compile
[INFO] | | \- org.hibernate.validator:hibernate-validator:jar:6.1.6.Final:compile
[INFO] | | +- jakarta.validation:jakarta.validation-api:jar:2.0.2:compile
[INFO] | | +- org.jboss.logging:jboss-logging:jar:3.4.1.Final:compile
[INFO] | | \- com.fasterxml:classmate:jar:1.5.1:compile
[INFO] | \- io.projectreactor.addons:reactor-extra:jar:3.3.4.RELEASE:compile
[INFO] +- org.springframework.boot:spring-boot-starter-webflux:jar:2.3.5.RELEASE:compile
[INFO] | +- org.springframework.boot:spring-boot-starter:jar:2.3.5.RELEASE:compile
[INFO] | | +- org.springframework.boot:spring-boot:jar:2.3.5.RELEASE:compile
[INFO] | | | \- org.springframework:spring-context:jar:5.2.10.RELEASE:compile
[INFO] | | | +- org.springframework:spring-aop:jar:5.2.10.RELEASE:compile
[INFO] | | | \- org.springframework:spring-expression:jar:5.2.10.RELEASE:compile
[INFO] | | +- org.springframework.boot:spring-boot-autoconfigure:jar:2.3.5.RELEASE:compile
[INFO] | | +- org.springframework.boot:spring-boot-starter-logging:jar:2.3.5.RELEASE:compile
[INFO] | | | +- ch.qos.logback:logback-classic:jar:1.2.3:compile
[INFO] | | | | \- ch.qos.logback:logback-core:jar:1.2.3:compile
[INFO] | | | +- org.apache.logging.log4j:log4j-to-slf4j:jar:2.13.3:compile
[INFO] | | | | \- org.apache.logging.log4j:log4j-api:jar:2.13.3:compile
[INFO] | | | \- org.slf4j:jul-to-slf4j:jar:1.7.30:compile
[INFO] | | +- jakarta.annotation:jakarta.annotation-api:jar:1.3.5:compile
[INFO] | | \- org.yaml:snakeyaml:jar:1.26:compile
[INFO] | +- org.springframework.boot:spring-boot-starter-json:jar:2.3.5.RELEASE:compile
[INFO] | | +- com.fasterxml.jackson.core:jackson-databind:jar:2.11.3:compile
[INFO] | | | +- com.fasterxml.jackson.core:jackson-annotations:jar:2.11.3:compile
[INFO] | | | \- com.fasterxml.jackson.core:jackson-core:jar:2.11.3:compile
[INFO] | | +- com.fasterxml.jackson.datatype:jackson-datatype-jdk8:jar:2.11.3:compile
[INFO] | | +- com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.11.3:compile
[INFO] | | \- com.fasterxml.jackson.module:jackson-module-parameter-names:jar:2.11.3:compile
[INFO] | +- org.springframework.boot:spring-boot-starter-reactor-netty:jar:2.3.5.RELEASE:compile
[INFO] | +- org.springframework:spring-web:jar:5.2.10.RELEASE:compile
[INFO] | | \- org.springframework:spring-beans:jar:5.2.10.RELEASE:compile
[INFO] | +- org.springframework:spring-webflux:jar:5.2.10.RELEASE:compile
[INFO] | \- org.synchronoss.cloud:nio-multipart-parser:jar:1.1.0:compile
[INFO] | +- org.slf4j:slf4j-api:jar:1.7.30:compile
[INFO] | \- org.synchronoss.cloud:nio-stream-storage:jar:1.1.3:compile
[INFO] +- io.projectreactor.netty:reactor-netty:jar:0.9.7.RELEASE:compile
[INFO] | +- io.netty:netty-codec-http:jar:4.1.53.Final:compile
[INFO] | | +- io.netty:netty-common:jar:4.1.53.Final:compile
[INFO] | | +- io.netty:netty-buffer:jar:4.1.53.Final:compile
[INFO] | | +- io.netty:netty-transport:jar:4.1.53.Final:compile
[INFO] | | \- io.netty:netty-codec:jar:4.1.53.Final:compile
[INFO] | +- io.netty:netty-codec-http2:jar:4.1.53.Final:compile
[INFO] | +- io.netty:netty-handler:jar:4.1.53.Final:compile
[INFO] | | \- io.netty:netty-resolver:jar:4.1.53.Final:compile
[INFO] | +- io.netty:netty-handler-proxy:jar:4.1.53.Final:compile
[INFO] | | \- io.netty:netty-codec-socks:jar:4.1.53.Final:compile
[INFO] | +- io.netty:netty-transport-native-epoll:jar:linux-x86_64:4.1.53.Final:compile
[INFO] | | \- io.netty:netty-transport-native-unix-common:jar:4.1.53.Final:compile
[INFO] | \- io.projectreactor:reactor-core:jar:3.3.11.RELEASE:compile
[INFO] | \- org.reactivestreams:reactive-streams:jar:1.0.3:compile
[INFO] +- org.springframework.boot:spring-boot-starter-test:jar:2.3.5.RELEASE:test
[INFO] | +- org.springframework.boot:spring-boot-test:jar:2.3.5.RELEASE:test
[INFO] | +- org.springframework.boot:spring-boot-test-autoconfigure:jar:2.3.5.RELEASE:test
[INFO] | +- com.jayway.jsonpath:json-path:jar:2.4.0:test
[INFO] | | \- net.minidev:json-smart:jar:2.3:test
[INFO] | | \- net.minidev:accessors-smart:jar:1.2:test
[INFO] | | \- org.ow2.asm:asm:jar:5.0.4:test
[INFO] | +- jakarta.xml.bind:jakarta.xml.bind-api:jar:2.3.3:test
[INFO] | | \- jakarta.activation:jakarta.activation-api:jar:1.2.2:test
[INFO] | +- org.assertj:assertj-core:jar:3.16.1:test
[INFO] | +- org.hamcrest:hamcrest:jar:2.2:test
[INFO] | +- org.junit.jupiter:junit-jupiter:jar:5.6.3:test
[INFO] | | +- org.junit.jupiter:junit-jupiter-api:jar:5.6.3:test
[INFO] | | | +- org.opentest4j:opentest4j:jar:1.2.0:test
[INFO] | | | \- org.junit.platform:junit-platform-commons:jar:1.6.3:test
[INFO] | | +- org.junit.jupiter:junit-jupiter-params:jar:5.6.3:test
[INFO] | | \- org.junit.jupiter:junit-jupiter-engine:jar:5.6.3:test
[INFO] | +- org.junit.vintage:junit-vintage-engine:jar:5.6.3:test
[INFO] | | +- org.apiguardian:apiguardian-api:jar:1.1.0:test
[INFO] | | +- org.junit.platform:junit-platform-engine:jar:1.6.3:test
[INFO] | | \- junit:junit:jar:4.13.1:test
[INFO] | +- org.mockito:mockito-core:jar:3.3.3:test
[INFO] | | +- net.bytebuddy:byte-buddy:jar:1.10.17:test
[INFO] | | +- net.bytebuddy:byte-buddy-agent:jar:1.10.17:test
[INFO] | | \- org.objenesis:objenesis:jar:2.6:test
[INFO] | +- org.mockito:mockito-junit-jupiter:jar:3.3.3:test
[INFO] | +- org.skyscreamer:jsonassert:jar:1.5.0:test
[INFO] | | \- com.vaadin.external.google:android-json:jar:0.0.20131108.vaadin1:test
[INFO] | +- org.springframework:spring-core:jar:5.2.10.RELEASE:compile
[INFO] | | \- org.springframework:spring-jcl:jar:5.2.10.RELEASE:compile
[INFO] | +- org.springframework:spring-test:jar:5.2.10.RELEASE:test
[INFO] | \- org.xmlunit:xmlunit-core:jar:2.7.0:test
[INFO] \- io.projectreactor:reactor-test:jar:3.3.11.RELEASE:test
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.620 s
[INFO] Finished at: 2020-11-10T12:55:22+03:00
[INFO] ------------------------------------------------------------------------
@buhtr I think there is some misunderstanding?
This issue is related to a performance regression in Reactor Netty 0.9.10 and the PR above fixes this regression.
@violetagg hard to say
I've done load testing for a different versions of scg in order to understand its performance. The results show that Greenwich.SR6 with default dependencies is 30% quicker then the latest version of scg. It seems odd to me. As a result i've created https://github.com/spring-cloud/spring-cloud-gateway/issues/2020 but @spencergibb closed it and marked it as duplicated by this task.
According to your answer it seems that these issues are not related. So what about 30% performance degradation is it appropriate for the latest version or is it some kind of a bug ?
@spencergibb Is it possible that the issue that is reported by @buhtr is a different one?
I think it could be a different issue. The problem reported in this issue is that sometimes the response is never received. I will check to see if the PR fixes it soon.
@tony-clarke-amdocs It will be really helpful if you can test that as Spring Boot releases are scheduled for this Thursday and I still have time to include that fix. If you think that a SNAPSHOT will be easier for testing I can commit that PR and thus make the fix available in the SNAPSHOT.
@violetagg Yes, adding this PR to SNAPSHOT will help. Thanks.
@tony-clarke-amdocs 0.9.14.BUILD-SNAPSHOT is available from the spring repo https://repo.spring.io/snapshot
Unfortunately we are still seeing timeouts in our NFT test with 0.9.14-BUILD-SNAPSHOT. Three runs and all had timeouts. Reverted back to SCG 2.3.2 and it passes.
@tony-clarke-amdocs can you check the jar that the fix is in, just to be sure that it is not a cached snapshot?
I am not testing the fix myself, but I agree, that is something I will forward on.
You can try directly with 0.9.14.RELEASE
You were correct about a cached SNAPSHOT. We now are getting clear runs without timeouts : )
@tony-clarke-amdocs Thanks a lot for testing it and providing the test case - it helped a lot.
@spencergibb Any idea when this fix will be made available in SCG? It's blocking us until then.
@tony-clarke-amdocs I don't understand. Spring Cloud doesn't manage the version of reactor or reactor netty, that is spring boot. Regardless, you can always adjust the version.
Closing as fixed in reactor-netty
@spencergibb Any idea when this fix will be made available in SCG? It's blocking us until then.
According to the schedule below Spring Boot 2.3.6 should happen tomorrow and Reactor Dysprosium-SR14 (Reactor Netty 0.9.14.RELEASE) is in. https://github.com/spring-projects/spring-boot/milestone/191
Before adopting Spring Cloud Gateway we run a load test to make sure all is good. Recently, when trying to adopt SCG 2.2.4 (and also SCG 2.2.5), this test began to fail. The load test uses Gatling to drive the load/requests.
20 minute duration 72 users/clients Ramp 5 users/sec Pace is 100 ms
The load test runs successfully with Spring Cloud Gateway 2.2.3. Starting with Spring Cloud Gateway 2.2.4, the load test fails with timeouts on the client like the following:
io.gatling.http.client.impl.RequestTimeoutException: Request timeout to apigw-nft.ilfnm108.eaas.amdocs.com/10.238.136.244:443 after 60000 ms
There are no errors in the gateway when the timeouts occur.
We compared the dependent component versions between Spring Cloud Gateway 2.2.3 and 2.2.4. To pinpoint the cause of the regression, we used Spring Cloud Gateway 2.2.3 and one by one upgraded the dependent components until the timeouts occurred during the load test. We found that the regression was introduced in io.projectreactor Dysprosium-SR10, specifically reactor-netty 0.9.10. We concluded this by using Spring Cloud Gateway 2.2.4 and the following versions of reactor-netty:
0.9.8: Exceptions occurred due to the following regression introduced in this version: This release contains a regression related to opened file descriptors, v0.9.9.RELEASE was released immediately after. 0.9.9: successfully ran NFT test with no timeouts 0.9.10: timeouts occur during the load test