Turasa / libsignal-service-java

GNU General Public License v3.0
37 stars 21 forks source link

SignalServiceMessagePipe shutdown() does not clean up threads #25

Closed hoehermann closed 3 years ago

hoehermann commented 3 years ago

Hi

At my project purple-signal, I am trying to wrap libsignal-service-java into a C plug-in for use in Pidgin. Most of the Java code is taken from signal-cli. It is working quite okay so far, but upon the main application (Pidgin) termination, it takes two minutes until the JVM is unloaded and the process can finish.

This could be related to square/okhttp/issues/5542, but that issue has been marked as "resolved". I am using okhttp-4.6.0. This is what I do (outline):

WebSocketConnection webSocketConnection = new WebSocketConnection(…); // creates and wraps okHttpClient and WebSocket objects
SignalServiceMessagePipe messagePipe = new SignalServiceMessagePipe(webSocketConnection, …); // wraps connection
// do some messaging
messagePipe.shutdown(); // calls WebSocket.close(…);

This is how the logs look like:

[Shutdown requested]
(20:58:39) Waiting for all remaining Java threads…
(20:58:39) Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog' 'OkHttp https://textsecure-service.whispersystems.org/...' 'Thread-1'
(20:58:40) Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog' 'Thread-1'
[47 seconds pass…]
(20:59:27) Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog' 'Thread-1'
(20:59:28) Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog'
[a few seconds pass…]
(20:59:31) Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog'
(20:59:32) Java threads: 'main' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog'
[a few seconds pass…]
(20:59:38) Java threads: 'main' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog'
(20:59:39) Java threads: 'main' 'OkHttp TaskRunner' 'OkHttp TaskRunner'
[52 seconds pass…]
(21:00:31) Java threads: 'main' 'OkHttp TaskRunner' 'OkHttp TaskRunner'
(21:00:32) Java threads: 'main' 'OkHttp TaskRunner'
[a few seconds pass…]
(21:04:31) Java threads: 'main' 'OkHttp TaskRunner'
(21:04:32) Java threads: 'main'
[application finally terminates]

I do not see which of these is true:

Any ideas on this are appreciated. Having a two minute delay after program quit is really annoying and an awful user experience.

Regards

AsamK commented 3 years ago

I had already fixed an issue that sounds similar, some time ago: AsamK/signal-cli#263 Not sure if you're using it wrong or if it's a regression ... You could try to remove the System.exit call in signal-cli to check if the current signal-cli is affected by the same issue.

hoehermann commented 3 years ago

The behaviour looks the same. Upon investigating this issue, I re-read the documentation. It explicitly states that the VM will wait for all threads to terminate. However, this applies only to non-daemon threads and the lingering 'OkHttp TaskRunner' threads are daemon threads. Apparently, I am doing something wrong somewhere else. I am closing this issue now until I feel confident enough to open it up again.

Just for reference, I added these lines before System.exit():

Thread[] list = null;
while (list == null || list.length > 1) {
  ThreadGroup tg = Thread.currentThread().getThreadGroup();
  list = new Thread[tg.activeCount()];
  tg.enumerate(list); 
  StringBuilder sb = new StringBuilder("Java threads:");
  for (Thread t : list) {
    sb.append(" '");
    sb.append(t.getName());
    sb.append("'");
  }
  System.err.println(sb.toString());
  Thread.sleep(1000);
}

The output reads:

$ ./gradlew run --args='-u [REDACTED] receive'

> Task :run
Envelope from: “null” [REDACTED] (device: 1)
Timestamp: 1594126824179 (2020-07-07T13:00:24.179Z)
Sender: “null” [REDACTED] (device: 1)
Message timestamp: 1594126824179 (2020-07-07T13:00:24.179Z)
Body: wie ist das mit chats?
Group info:
  Id: [REDACTED]
  Name: Test Group
  Type: DELIVER
Profile key update, key length:32

Envelope from: “null” [REDACTED] (device: 1)
Timestamp: 1594126859263 (2020-07-07T13:00:59.263Z)
Sender: “null” [REDACTED] (device: 1)
Message timestamp: 1594126859263 (2020-07-07T13:00:59.263Z)
Group info:
  Id: [REDACTED]
  Name: Test Group
  Type: UPDATE
  Member: [REDACTED]
  Member: [REDACTED]
  Member: [REDACTED]

Envelope from: “null” [REDACTED] (device: 1)
Timestamp: 1603644439944 (2020-10-25T16:47:19.944Z)
Sender: “null” [REDACTED] (device: 1)
Message timestamp: 1603644439944 (2020-10-25T16:47:19.944Z)
Body: Vodafone ist jetzt auch dabei.
Group info:
  Id: [REDACTED]
  Name: Test Group
  Type: DELIVER
Profile key update, key length:32

Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp WebSocket https://textsecure-service.whispersystems.org/... writer' 'Okio Watchdog' 'OkHttp https://textsecure-service.whispersystems.org/...' 'Thread-0'
Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog' 'Thread-0'
[35 seconds pass]
Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog' 'Thread-0'
Java threads: 'main' 'OkHttp textsecure-service.whispersystems.org' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog'
Java threads: 'main' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog'
Java threads: 'main' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'OkHttp TaskRunner' 'Okio Watchdog'
Java threads: 'main' 'OkHttp TaskRunner' 'OkHttp TaskRunner'
[50 seconds pass]
Java threads: 'main' 'OkHttp TaskRunner' 'OkHttp TaskRunner'
Java threads: 'main' 'OkHttp TaskRunner'
[a few seconds pass]
Java threads: 'main' 'OkHttp TaskRunner'
Java threads: 'main'
[application terminates]
hoehermann commented 3 years ago

Just a heads-up in case anyone else ever stumbles across this: I am not perfectly confident, but I think I have found a source of the problem. I was sending messages from native code. In the native code, I forgot to call DeleteLocalRef on one of the strings (e.g. message text). It looks like threads in the Java VM wait for all references to be released before DestroyJavaVM can succeed.