web-push-libs / webpush-java

Web Push library for Java
MIT License

Thread leak - too many open files #196

Open luccotta opened 2 years ago

luccotta commented 2 years ago

Hello, I schedule a job that runs every 15 minutes in a new thread and sends 1 to 10 notifications on average. For each notification, I create a new PushService and call pushService.send(notification). Is that OK, or should I be doing something differently, such as reusing the same PushService?

The threads are never terminated. Eventually my server exceeds the maximum number of open files allowed in Linux, and I start getting SocketException: Too many open files everywhere. All of the leaked threads are stuck with the following stack trace:

"I/O dispatcher 37368" #48812 prio=5 os_prio=0 tid=0x00007f6898760800 nid=0xe3a4 runnable [0x00007f64adbe4000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00007f82867201f0> (a sun.nio.ch.Util$3)
    - locked <0x00007f82867201d8> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00007f8286720010> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    at java.lang.Thread.run(Thread.java:750)

I know this comes from webpush-java because it is the only library in my project that uses httpcore-nio. I also used this tool to investigate what was opening so many files, and it showed this stack from webpush-java:

Opened selector by thread:pool-6-thread-1 on Mon Jul 18 23:13:23 BRT 2022
        at java.nio.channels.spi.AbstractSelector.<init>(AbstractSelector.java:86)
        at sun.nio.ch.SelectorImpl.<init>(SelectorImpl.java:54)
        at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:64)
        at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
        at java.nio.channels.Selector.open(Selector.java:227)
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.<init>(AbstractMultiworkerIOReactor.java:142)
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.<init>(DefaultConnectingIOReactor.java:82)
        at org.apache.http.impl.nio.client.IOReactorUtils.create(IOReactorUtils.java:43)
        at org.apache.http.impl.nio.client.HttpAsyncClientBuilder.build(HttpAsyncClientBuilder.java:667)
        at org.apache.http.impl.nio.client.HttpAsyncClients.createSystem(HttpAsyncClients.java:70)
        at nl.martijndwars.webpush.PushService.sendAsync(PushService.java:162)
        at nl.martijndwars.webpush.PushService.send(PushService.java:142)
        at nl.martijndwars.webpush.PushService.send(PushService.java:146)

Thanks
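
One way to check whether the "I/O dispatcher" threads from the dump above keep accumulating is to count live threads by name prefix from inside the JVM. The following is a minimal sketch using only the standard library; the class and method names (DispatcherThreadCount, countThreads) are hypothetical, and the simulated thread merely borrows the name seen in the dump.

```java
public class DispatcherThreadCount {

    // Counts live threads whose name starts with the given prefix.
    static long countThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    public static void main(String[] args) {
        // Simulated stuck worker (assumption: only the name mimics the real dump).
        Thread stuck = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // interrupted: let the thread exit
            }
        }, "I/O dispatcher 1");
        stuck.setDaemon(true);
        stuck.start();

        System.out.println("dispatcher threads: " + countThreads("I/O dispatcher"));
        stuck.interrupt();
    }
}
```

Logging this count before and after each scheduled run would show whether the number grows with every batch of notifications.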

luccotta commented 1 year ago

I found out that this happens when the exception below occurs. Every time it does, around 50 file descriptors are opened and never closed. I tested this solution, which makes a lot of sense, but apparently it did not solve the issue.

2022-07-27 10:36:53 ERROR SendPushNotification:
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection timed out
        at org.apache.http.concurrent.BasicFuture.getResult(BasicFuture.java:71)
        at org.apache.http.concurrent.BasicFuture.get(BasicFuture.java:84)
        at org.apache.http.impl.nio.client.FutureWrapper.get(FutureWrapper.java:70)
        at nl.martijndwars.webpush.PushService.send(PushService.java:64)
        at nl.martijndwars.webpush.PushService.send(PushService.java:68)
        at ....
Caused by: java.net.ConnectException: Connection timed out
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:174)
        at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:148)
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351)
        at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221)
        at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
        ... 1 more
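
The failure path described above can be illustrated with a standard-library sketch: a client that owns a selector and a worker thread has to be closed even when Future.get() throws ExecutionException, otherwise both are leaked. FakeAsyncClient below is a hypothetical stand-in, not the webpush-java or Apache implementation, and the lastClient field exists only so the result can be inspected. Apache's CloseableHttpAsyncClient implements Closeable, so the same try-with-resources shape should apply to it, but whether that alone resolves this issue is not confirmed here.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.Selector;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class CloseOnFailureSketch {

    /** Hypothetical async client: owns one selector and one worker thread. */
    static final class FakeAsyncClient implements AutoCloseable {
        final Selector selector;
        final Thread worker;

        FakeAsyncClient() throws IOException {
            Selector sel = Selector.open(); // holds file descriptors until closed
            this.selector = sel;
            this.worker = new Thread(() -> {
                try {
                    while (sel.isOpen()) {
                        sel.select(); // blocks, like the dispatcher threads in the dump
                    }
                } catch (IOException | ClosedSelectorException e) {
                    // selector was closed: let the thread exit
                }
            }, "fake I/O dispatcher");
            this.worker.start();
        }

        /** Always fails, mimicking the timed-out connection in the log. */
        Future<String> send() {
            CompletableFuture<String> f = new CompletableFuture<>();
            f.completeExceptionally(new ConnectException("Connection timed out"));
            return f;
        }

        @Override
        public void close() throws IOException {
            selector.close(); // wakes the worker and releases the descriptors
        }
    }

    // Kept only so callers can inspect the client after the send attempt.
    static FakeAsyncClient lastClient;

    /** Returns true on success, false if the send failed; the client is closed either way. */
    static boolean sendOnce() throws Exception {
        try (FakeAsyncClient client = new FakeAsyncClient()) {
            lastClient = client;
            client.send().get();
            return true;
        } catch (ExecutionException e) {
            return false; // close() has already run by the time this block executes
        }
    }

    public static void main(String[] args) throws Exception {
        boolean sent = sendOnce();
        lastClient.worker.join(2000);
        System.out.println("sent=" + sent
                + " selectorOpen=" + lastClient.selector.isOpen()
                + " workerAlive=" + lastClient.worker.isAlive());
    }
}
```

Without the try-with-resources block, each failed send in this sketch would leave one open selector and one blocked thread behind, which matches the shape of the leak reported here.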