quarkus-qe / quarkus-test-suite


Disable ReactiveRestClientProxyIT on aarch64 #2112

Closed mocenas closed 1 month ago

mocenas commented 1 month ago

Summary

Disable ReactiveRestClientProxyIT on aarch64. The test tries to connect to example.com, and on aarch64 this domain returns 499 (token required). We are probably hitting some threshold.

IMHO it is generally a bad idea to use a domain we don't own or control in our TS. But it works fine on the other nodes; AFAIK it fails only on aarch64, so I'm disabling it only there.
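For illustration only, here is a minimal sketch of the architecture check that such a conditional disable relies on. It is not the code from this PR: the class name ArchCheck and the method isAarch64 are hypothetical, and accepting "arm64" as an alias is a precaution rather than something observed on our nodes.

```java
import java.util.Locale;

// Hypothetical helper, not part of the test suite.
final class ArchCheck {

    // Returns true when the given os.arch value denotes 64-bit ARM.
    // "arm64" is accepted as an alias (assumption; "aarch64" is the usual JVM value).
    static boolean isAarch64(String osArch) {
        if (osArch == null) {
            return false;
        }
        String arch = osArch.toLowerCase(Locale.ROOT);
        return arch.equals("aarch64") || arch.equals("arm64");
    }

    public static void main(String[] args) {
        // os.arch is a standard JVM system property.
        String osArch = System.getProperty("os.arch");
        System.out.println(osArch + " -> skip proxy test: " + isAarch64(osArch));
    }
}
```

A test could call this from an assumption or a condition so that it is skipped, not failed, on the affected nodes.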

Filed an issue for this - https://github.com/quarkus-qe/quarkus-test-suite/issues/2111

mocenas commented 1 month ago

cc @fedinskiy

mjurc commented 1 month ago

I don't really want to disable the test without any RCA/improvement - why would this fail only on aarch64 machines, when they're in the same network as the other bare metal ones?

I don't see the example domain anywhere in test, but even so, cannot we use anything else?

mocenas commented 1 month ago

> I don't really want to disable the test without any RCA/improvement - why would this fail only on aarch64 machines, when they're in the same network as the other bare metal ones?

No idea why this affects only aarch64. Also, a plain ping example.com from an aarch64 node works fine. I can only say that it fails consistently in tests.

> I don't see the example domain anywhere in test, but even so, cannot we use anything else?

It is configured here: https://github.com/quarkus-qe/quarkus-test-suite/blob/main/http/rest-client-reactive/src/main/resources/proxy.properties#L1 We could use some other domain, but would that actually help? We could hit the same problem the next day. A proper solution would be to create our own service and test against that, but I didn't want to get stuck on this right now.

rsvoboda commented 1 month ago

Disabling should happen when there is no other way or alternative.

Please try the options mentioned in your comment, @mocenas.

mocenas commented 1 month ago

First off, I tried to change this test to use a local service instead of the public example.com and failed. In this test we use an nginx proxy to forward HTTP traffic to an HTTP server. The proxy runs in Docker and has a big problem connecting to localhost services (127.0.0.1 inside a Docker container is not the same as 127.0.0.1 on the host machine).

Main thing: I was wrong about the root cause. After more debugging, it turned out that the problem is DNS. Nginx needs an explicitly configured DNS server, and we have "8.8.8.8" in its config. It turns out that our infra is blocking outgoing DNS queries. Even manually, I was unable to resolve anything against any public DNS server from the aarch64 machine.

Unfortunately, nginx in Docker doesn't have a default DNS resolver and must have one configured. Accessing the local resolver is also problematic. Docker should have a resolver accessible on 127.0.0.11, but using it ends with: send() failed (111: Connection refused) while resolving, resolver: 127.0.0.11:53. From what I found online, you need to set up an additional Docker network and configure it (I didn't analyze that very far).

IMHO the best solution would be to allow outgoing DNS queries from the aarch64 machines. WDYT @mjurc?
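To reproduce the manual DNS check described above without nginx, a small probe can query a specific resolver directly. This is a rough sketch using the JDK's built-in JNDI DNS provider. The class name DnsProbe, its method names, and the timeout values are my own choices for illustration. Note that running it on the host is not equivalent to running it inside the nginx container, where 127.0.0.11 refers to Docker's embedded resolver.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical diagnostic helper, not part of the test suite.
final class DnsProbe {

    // Builds the JNDI provider URL for a given DNS server address.
    static String providerUrl(String dnsServer) {
        return "dns://" + dnsServer;
    }

    // Returns true if the given DNS server answers an A-record query for hostName.
    static boolean canResolve(String dnsServer, String hostName) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.dns.DnsContextFactory");
        env.put(Context.PROVIDER_URL, providerUrl(dnsServer));
        // Keep the probe short: 2 s initial timeout, a single attempt (illustrative values).
        env.put("com.sun.jndi.dns.timeout.initial", "2000");
        env.put("com.sun.jndi.dns.timeout.retries", "1");

        DirContext ctx = null;
        try {
            ctx = new InitialDirContext(env);
            Attributes attrs = ctx.getAttributes(hostName, new String[] {"A"});
            return attrs.get("A") != null;
        } catch (NamingException e) {
            // Timeouts, refused connections and unknown names all end up here.
            return false;
        } finally {
            if (ctx != null) {
                try {
                    ctx.close();
                } catch (NamingException ignored) {
                    // Nothing useful to do if closing fails.
                }
            }
        }
    }

    public static void main(String[] args) {
        // The two resolvers mentioned in this thread.
        for (String server : new String[] {"8.8.8.8", "127.0.0.11"}) {
            System.out.println(server + " resolves example.com: "
                    + canResolve(server, "example.com"));
        }
    }
}
```

If the public resolver reports false on an aarch64 node but true elsewhere, that would be consistent with outgoing DNS queries being blocked there.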

mocenas commented 1 month ago

I'm closing this one, since we had a successful run of this test.