Open waynexin opened 4 years ago
For some reason, the formatting didn't turn out well. Basically, I used webClient.getPage() to crawl and called webClient.close() in the finally block.
Can you please check with version 2.37.0? I made some changes to make the closing of WebSockets more robust.
Sure. I'll give it a try.
-Wayne
(Sent as an email reply to RBRi's notification of Monday, March 2, 2020, 6:03 PM, for issue #120, "htmlunit 2.35 and 2.36 OSGi release (htmlunit-2.36.0-OSGi.jar) leaks threads and hangs the applications".)
I still see the same with 2.44. Not sure if this is related: https://stackoverflow.com/questions/46450721/how-do-you-close-websocketcontainer-websocketclient-jetty-client-in-java
I recently upgraded to 2.35 and 2.36, using HtmlUnit for crawling. This did not happen with 2.33. After crawling a lot of pages, I started to see tons of the following threads in the thread dump; eventually they eat up system resources and hang the container (in Docker).
"WebSocketClient@1404341633-126276" #126276 daemon prio=5 os_prio=0 tid=0x00007faba4228800 nid=0xbe7 runnable [0x00007fa1ec102000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)

"WebSocketClient@1404341633-126275" #126275 daemon prio=5 os_prio=0 tid=0x00007faba403b000 nid=0xbdb waiting on condition [0x00007fa28901c000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
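To watch for this kind of leak without taking a full thread dump, one option is to count live threads by name prefix from inside the JVM. The following is only a rough sketch using the standard `Thread.getAllStackTraces()` call; the class name `ThreadLeakCheck` and the helper `countThreadsWithPrefix` are made up for illustration, and the `WebSocketClient@` prefix is taken from the dump above.

```java
// Hypothetical helper (not part of HtmlUnit): counts live threads by name prefix.
class ThreadLeakCheck {

    // Returns the number of live threads whose name starts with the given prefix.
    static long countThreadsWithPrefix(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    public static void main(String[] args) {
        // The prefix matches the thread names seen in the dump above.
        long count = countThreadsWithPrefix("WebSocketClient@");
        System.out.println("Live WebSocketClient threads: " + count);
    }
}
```

Logging this number after each crawl should show whether the count keeps growing or drops back once the client is closed.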
The crawling code snippet looks like the following. Because certain web accesses can get very slow, I create a future task for the crawling and "cancel" it.
.....
ExecutorService executor = Executors.newSingleThreadExecutor();
Future future = executor.submit(new HtmlunitCrawl(urlWithProto, timeout, useProxy));
.....
} finally {
future.cancel(true);
executor.shutdownNow();
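For reference, the submit-then-cancel pattern described above can be sketched with only the standard `java.util.concurrent` classes. This is not the original crawler: `TimedTask`, `runWithTimeout`, and the `fallback` parameter are hypothetical names, and the task is a plain `Callable` standing in for `HtmlunitCrawl`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical wrapper: runs a task on its own thread and gives up after a timeout.
class TimedTask {

    // Returns the task's result, or the fallback if it fails or times out.
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return fallback;
        } finally {
            future.cancel(true);    // interrupts the worker thread if it is still running
            executor.shutdownNow(); // stops the executor's own worker thread
        }
    }

    public static void main(String[] args) {
        String result = runWithTimeout(() -> {
            Thread.sleep(5000); // stands in for a slow page load
            return "page";
        }, 100, "timed out");
        System.out.println(result);
    }
}
```

Note that `future.cancel(true)` and `shutdownNow()` only interrupt the executor's worker thread. Any threads the task itself started (such as the WebSocketClient threads in the dump) are not stopped by this, so the task still has to release its own resources, for example by closing the client in its own finally block.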
The crawling code:
.....