quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0

RestClient - DNS Round Robin not possible #26409

Closed Chexpir closed 2 years ago

Chexpir commented 2 years ago

Describe the bug

We are running dozens of services in Quarkus. We have recently migrated a service from Micronaut to Quarkus, and we have realised that we cannot apply DNS Load Balancing using Quarkus RestClient.

It appears that the Quarkus REST Client uses "DNS pinning", whereas Micronaut and Grails do not.

It happens with both RESTEasy Classic and RESTEasy Reactive.

We have tried to solve it:

This effectively means that we cannot use autoscaling.

We are going to try to use Stork, but that's not a solution, and there are chances that it suffers from the same issue.

Expected behavior

Rotation across the DNS IPs when doing requests.

Actual behavior

Same IP (machine) is hit continuously, unless the service is restarted.

Output of java -version

Java 17

GraalVM version (if different from Java)

No need to.

Quarkus version or git rev

2.10.0.Final

Build tool (ie. output of mvnw --version or gradlew --version)

Gradle.

geoand commented 2 years ago

@cescoffier does Vert.x address this somehow?

quarkus-bot[bot] commented 2 years ago

/cc @michalszynkiewicz

cescoffier commented 2 years ago

First guess: disable keep-alive

michalszynkiewicz commented 2 years ago

I'll take a look at it

cescoffier commented 2 years ago

It might not be keep-alive, as the issue mentioned having the problem with both reactive and classic. It might really be a DNS caching issue.

cescoffier commented 2 years ago

Stork would help if you provide your own Stork discovery which would resolve the IPs.

Chexpir commented 2 years ago

Thanks everybody, I am working through the keep-alive option. @cescoffier, does any Quarkus/MicroProfile/Vert.x component have DNS responsibilities? I guess the only one is Vert.x, so I'll also test quarkus.vertx.use-async-dns=true and let you know the results.

We want to exhaust the other options before moving to Stork/ConfigMap.

I confirm that @RequestScoped on the RestClient interface did not help, which was surprising.

cescoffier commented 2 years ago

No, that resolution is managed by the JVM. Nothing Quarkus specific.

michalszynkiewicz commented 2 years ago

@Chexpir to make sure I understand, your DNS server returns different IP addresses on different calls and you'd like the client to query DNS on each call to leverage that?

Or does it return a list of IPs and you'd like the client to round-robin between them?

Chexpir commented 2 years ago

We have several A records in our DNS server with different IP addresses (e.g. when an instance is created or destroyed in our AWS ECS, our DNS records get updated), same as httpbin (see image below). So our DNS returns multiple IPs (in randomized order), and yes, we would like the client to round-robin (or do something less sophisticated) among them.

Currently, it's not only that it does not round-robin. A new entry that appears in the DNS is never picked up. If an entry disappears, the service fails the request (IP and service not reachable) and only then updates the IP that it is using.

Micronaut (Java 11) and Grails (Java 8) do rotate (sorry for not listing the underlying technologies or versions; if they become relevant, I can).

image
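For reference, the rotation asked for here can be sketched with plain JDK calls. This is only a minimal sketch under stated assumptions, not how the Quarkus REST Client resolves hosts: `RoundRobinDns` and its methods are hypothetical names, and the JVM may still cache lookups according to its own TTL settings, so new records may not show up immediately.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not part of Quarkus or the REST Client.
class RoundRobinDns {
    private final AtomicInteger counter = new AtomicInteger();

    // Returns the next candidate in rotation, wrapping around at the end of the list.
    <T> T pick(List<T> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates to pick from");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        return candidates.get(Math.floorMod(counter.getAndIncrement(), candidates.size()));
    }

    // Resolves every address known for the host and returns the next one in rotation.
    String next(String host) throws UnknownHostException {
        List<String> ips = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            ips.add(address.getHostAddress());
        }
        // The DNS server may shuffle the records, so sort them for a stable rotation.
        Collections.sort(ips);
        return pick(ips);
    }

    public static void main(String[] args) throws UnknownHostException {
        RoundRobinDns dns = new RoundRobinDns();
        String host = args.length > 0 ? args[0] : "localhost";
        for (int i = 0; i < 4; i++) {
            System.out.println(dns.next(host));
        }
    }
}
```

The chosen IP would still have to be handed to the HTTP client per request, which is the part a service-discovery layer normally takes care of.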

Chexpir commented 2 years ago

Nothing worked, so we set up Stork with static discovery, which didn't work either. Stork + Cloud Map, however, worked perfectly and reacted quickly to autoscaling. Thanks everybody, we'll go with this workaround, since there are no other options without Stork.

michalszynkiewicz commented 2 years ago

WDYM by stork + config map? Do you feed the IPs directly?

Chexpir commented 2 years ago

We created a custom service discovery in Stork that calls AWS Cloud Map. These calls are HTTP based but could also be DNS queries. We get a list of valid/healthy IP+port pairs from there. It's just a PoC at the moment.
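The general shape of such a setup (a discovery source that is re-queried periodically, plus round-robin selection over its latest result) can be sketched without any Stork or AWS dependency. Everything below is an assumption for illustration: `RefreshingRoundRobin` is a hypothetical class, the `Supplier` stands in for whatever call returns the healthy `ip:port` entries, and none of this is the Stork API or the PoC described above.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; the Supplier stands in for a real discovery call.
class RefreshingRoundRobin {
    private final Supplier<List<String>> discovery;
    private final long refreshNanos;
    private List<String> instances = List.of();
    private long lastRefresh;
    private boolean loaded;
    private int counter;

    RefreshingRoundRobin(Supplier<List<String>> discovery, long refreshMillis) {
        this.discovery = discovery;
        this.refreshNanos = refreshMillis * 1_000_000L;
    }

    // Returns the next "ip:port" entry, re-querying discovery once the refresh period has elapsed.
    synchronized String next() {
        long now = System.nanoTime();
        if (!loaded || now - lastRefresh >= refreshNanos) {
            // Copy the result so later changes on the discovery side cannot affect this snapshot.
            instances = List.copyOf(discovery.get());
            lastRefresh = now;
            loaded = true;
        }
        if (instances.isEmpty()) {
            throw new IllegalStateException("no instances available");
        }
        // floorMod keeps the index valid even if the counter overflows.
        return instances.get(Math.floorMod(counter++, instances.size()));
    }
}
```

A short refresh period lets new instances appear quickly after a scale-out, at the cost of more discovery calls.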

michalszynkiewicz commented 2 years ago

I did some experiments with DNS, and -Dnetworkaddress.cache.ttl=0 sort of works with REST Client Reactive, but it still doesn't keep the resolution mechanism from caching for a short amount of time. So if you want real round-robin, I think Stork is the way to go.

Would you be open to contributing the AWS Cloud Map service discovery to Stork? Or to publishing it to Maven Central and sharing the coordinates with us, so that we can let Stork users know about it?
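For anyone repeating this experiment: the JDK documents `networkaddress.cache.ttl` as a security property, so it is usually set through `java.security.Security` (or the `java.security` file) rather than with a `-D` flag. A minimal sketch follows, assuming it runs before the first lookup; `DnsCacheTtl` is a hypothetical helper, and, as noted above, other layers may still cache for a short time.

```java
import java.net.InetAddress;
import java.security.Security;

// Hypothetical helper for illustration.
class DnsCacheTtl {
    // Sets the JVM-wide TTL, in seconds, for successful lookups (0 disables caching).
    static void setPositiveTtl(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    public static void main(String[] args) throws Exception {
        // Set this as early as possible: the JDK may read the policy only once.
        setPositiveTtl(0);
        String host = args.length > 0 ? args[0] : "localhost";
        for (InetAddress address : InetAddress.getAllByName(host)) {
            System.out.println(address.getHostAddress());
        }
    }
}
```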

michalszynkiewicz commented 2 years ago

I created a PR to add DNS service discovery to Stork. I would greatly appreciate it if you could check whether it also works for your setup.

michalszynkiewicz commented 2 years ago

The new release of Stork, 1.2.0, has a DNS service discovery. Once Quarkus is updated to it, you should be able to do DNS + round robin.

michalszynkiewicz commented 2 years ago

CC @aureamunoz