smallrye / smallrye-graphql

Implementation for MicroProfile GraphQL
Apache License 2.0

Failed to create a new resolver - Maximum number of datagram sockets reached #2121

Open adarsh0048 opened 1 month ago

adarsh0048 commented 1 month ago

This is an intermittent issue that occurs with the SmallRye library when it is used with GraphQL.

Do you have any suggestions? Some error logs are below:

Caused by: java.lang.IllegalStateException: failed to create a new resolver
    at io.netty.resolver.AddressResolverGroup.getResolver(AddressResolverGroup.java:72)
    at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:208)
    at io.netty.bootstrap.Bootstrap.doResolveAndConnect(Bootstrap.java:171)
    at io.netty.bootstrap.Bootstrap.connect(Bootstrap.java:148)
    at io.vertx.core.net.impl.ChannelProvider.handleConnect(ChannelProvider.java:152)
    at io.vertx.core.net.impl.ChannelProvider.connect(ChannelProvider.java:103)
    at io.vertx.core.net.impl.ChannelProvider.connect(ChannelProvider.java:89)
    at io.vertx.core.net.impl.NetClientImpl.connectInternal2(NetClientImpl.java:309)
    at io.vertx.core.net.impl.NetClientImpl.lambda$connectInternal2$7(NetClientImpl.java:329)
    at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    ... 1 more
Caused by: io.netty.channel.ChannelException: Failed to open a socket.
    at io.netty.channel.socket.nio.NioDatagramChannel.newSocket(NioDatagramChannel.java:91)
    at io.netty.channel.socket.nio.NioDatagramChannel.<init>(NioDatagramChannel.java:120)
    at io.vertx.core.impl.transports.JDKTransport.datagramChannel(JDKTransport.java:40)
    at io.vertx.core.impl.resolver.DnsResolverProvider.lambda$new$1(DnsResolverProvider.java:107)
    at io.netty.bootstrap.AbstractBootstrap.initAndRegister(AbstractBootstrap.java:326)
    at io.netty.bootstrap.AbstractBootstrap.doBind(AbstractBootstrap.java:288)
    at io.netty.bootstrap.AbstractBootstrap.bind(AbstractBootstrap.java:284)
    at io.netty.resolver.dns.DnsNameResolver.<init>(DnsNameResolver.java:513)
    at io.netty.resolver.dns.DnsNameResolverBuilder.build(DnsNameResolverBuilder.java:570)
    at io.netty.resolver.dns.DnsAddressResolverGroup.newNameResolver(DnsAddressResolverGroup.java:114)
    at io.netty.resolver.dns.DnsAddressResolverGroup.newResolver(DnsAddressResolverGroup.java:92)
    at io.netty.resolver.dns.DnsAddressResolverGroup.newResolver(DnsAddressResolverGroup.java:77)
    at io.netty.resolver.AddressResolverGroup.getResolver(AddressResolverGroup.java:70)
    ... 16 more
Caused by: java.net.SocketException: maximum number of DatagramSockets reached
    at java.base/sun.net.ResourceManager.beforeUdpCreate(ResourceManager.java:72)
    at java.base/sun.nio.ch.DatagramChannelImpl.<init>(DatagramChannelImpl.java:131)
    at java.base/sun.nio.ch.SelectorProviderImpl.openDatagramChannel(SelectorProviderImpl.java:42)
    at io.netty.channel.socket.nio.NioDatagramChannel.newSocket(NioDatagramChannel.java:89)
    ... 28 more
Reason: failed to create a new resolver

jmartisk commented 1 month ago

This doesn't look like something that SmallRye GraphQL can do anything about (unless you have some kind of reproducer that can prove me wrong). Sounds like you're hitting some limitations set by your operating system.

adarsh0048 commented 1 month ago

@jmartisk This isn't a one-time issue. It is intermittent, and it occurs on different operating systems as well.

The scenario is this: the GraphQL queries or mutations work fine for the first 'n' runs, but on run n+1 this issue occurs, and it then keeps happening on runs n+2, n+3, and so on. Here, n could be anywhere from 10 to 50. The only workaround is to restart the system, after which it works fine again.

Could there be some connections that are not closed on the SmallRye GraphQL side (that got missed)? What I understand from the error is that it failed to create a resolver because a socket could not be opened. Just looking for some suggestions, thanks!

jmartisk commented 1 month ago

I assume this is using the GraphQL client, not the server, correct? From the stack trace, it looks like it's trying to connect to something. The server side just creates one server socket, so there's little reason for this to happen.

jmartisk commented 1 month ago

If that's the case, I would suspect that you're creating a lot of GraphQL clients without properly closing them. What is your application using? Is it based on Quarkus? Does it use clients managed by Quarkus and CDI, or do you create them using a builder?

adarsh0048 commented 1 month ago

I am using io.smallrye.graphql.client.dynamic.api.DynamicGraphQLClientBuilder and io.smallrye.graphql.client.vertx.dynamic.VertxDynamicGraphQLClientBuilder to create clients.

I am using the close() method from the AutoCloseable interface to close the clients. The DynamicGraphQLClient interface extends AutoCloseable, so it should close the clients correctly, right?

Do you have any suggestions on what I should check next, or are there any methods in SmallRye that help in closing the clients correctly?

jmartisk commented 1 month ago

You need to make sure you either call close() on all of them, or always use a try-with-resources block for creating the client.
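
To illustrate the try-with-resources option, here is a minimal sketch that uses only the JDK. `CountingClient` is a hypothetical stand-in for any AutoCloseable client (it is not part of SmallRye GraphQL); it just counts open instances so you can see that the client is closed on both the normal and the failing path.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a closeable client such as DynamicGraphQLClient.
// It only counts open instances so the effect of closing is observable.
final class CountingClient implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();

    CountingClient() {
        OPEN.incrementAndGet();
    }

    // Pretends to run a query; an empty query simulates a failing call.
    String execute(String query) {
        if (query.isEmpty()) {
            throw new IllegalArgumentException("empty query");
        }
        return "result for " + query;
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();
    }
}

final class TryWithResourcesDemo {
    // Runs one query. The client is closed when the try block exits,
    // whether it returns normally or throws.
    static String runOnce(String query) {
        try (CountingClient client = new CountingClient()) {
            return client.execute(query);
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce("{ hello }"));
        try {
            runOnce("");
        } catch (IllegalArgumentException expected) {
            // The failure path still closes the client.
        }
        System.out.println("open clients: " + CountingClient.OPEN.get());
    }
}
```

With a manual close() call, an exception thrown before that call would skip it and leak the client, which is the case try-with-resources guards against.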

But maybe also consider simply not creating so many of them - I don't know how many different remote GraphQL endpoints your application calls, but if you only call one endpoint, you should probably just reuse a single client instance. Then you won't need to care about closing it.

If using Quarkus, I'd suggest using CDI and injecting an instance that is configured by the quarkus.smallrye-graphql-client.url property - then Quarkus will create a single client instance pointing at that URL and manage it for you, so again, no need to care about closing it.

adarsh0048 commented 1 month ago

Yes, thanks. I did check that I am calling close() on all of them.

Does Quarkus replace Vert.x in terms of code, or is it an addition?

Currently, I am doing something like this:

vertx = Vertx.vertx();
DynamicGraphQLClientBuilder builder = new VertxDynamicGraphQLClientBuilder()
        .url(endpoint)
        .options(toWebClientOptions(graphQLOptions))
        .vertx(vertx);
client = builder.build();

Then, later on, I am calling :

client.close();
vertx.close();

jmartisk commented 1 month ago

Ah, so I assume you're not using Quarkus (the runtime framework that integrates smallrye-graphql-client). Vert.x is the HTTP client toolkit that is used by Quarkus, but it can also be used standalone, which is probably your case.

One problem may be that Vertx.vertx() always creates a new Vert.x instance, which is quite resource-intensive, so even though you close them, this may still contribute to the problem. Try calling Vertx.vertx() just once at application start, store the result somewhere accessible, pass that single instance to all your usages of VertxDynamicGraphQLClientBuilder, and see if that helps.
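
A rough sketch of that "create once, share everywhere" idea, written against the JDK only so that it runs without extra dependencies. `ExpensiveRuntime`, `SharedRuntime`, and `BorrowingClient` are hypothetical names for illustration; in your code the shared object would be the result of Vertx.vertx(), and the borrowing step would be the .vertx(...) call on the builder.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive runtime, such as the object
// returned by Vertx.vertx(). It counts how many instances get created.
final class ExpensiveRuntime {
    static final AtomicInteger CREATED = new AtomicInteger();

    ExpensiveRuntime() {
        CREATED.incrementAndGet();
    }
}

// Holds one runtime for the whole application. The nested holder class
// makes initialization lazy and thread-safe.
final class SharedRuntime {
    private static final class Holder {
        static final ExpensiveRuntime INSTANCE = new ExpensiveRuntime();
    }

    static ExpensiveRuntime get() {
        return Holder.INSTANCE;
    }
}

// Hypothetical client that borrows the runtime instead of owning it,
// so closing the client leaves the shared runtime alone.
final class BorrowingClient implements AutoCloseable {
    private final ExpensiveRuntime runtime;

    BorrowingClient(ExpensiveRuntime runtime) {
        this.runtime = runtime;
    }

    ExpensiveRuntime runtime() {
        return runtime;
    }

    @Override
    public void close() {
        // Releases client-level state only.
    }
}

final class SharedRuntimeDemo {
    // Creates and closes n clients, then reports how many runtimes exist.
    static int createClients(int n) {
        for (int i = 0; i < n; i++) {
            try (BorrowingClient client = new BorrowingClient(SharedRuntime.get())) {
                if (client.runtime() != SharedRuntime.get()) {
                    throw new IllegalStateException("unexpected second runtime");
                }
            }
        }
        return ExpensiveRuntime.CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println("runtimes created: " + createClients(50));
    }
}
```

The point of the sketch is only that the number of runtimes stays at one no matter how many clients are created and closed; the shared runtime would then be closed once, at application shutdown.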

adarsh0048 commented 1 month ago

Ok, thanks, let me try that.

jmartisk commented 2 weeks ago

@adarsh0048 did it help?

adarsh0048 commented 2 weeks ago

@jmartisk I haven't fully tested it yet. Let me check some more and get back to you, thanks!