bbernhard / signal-cli-rest-api

Dockerized Signal Messenger REST API
https://bbernhard.github.io/signal-cli-rest-api/
MIT License
1.29k stars 154 forks source link

/v2/send json-rpc takes minutes after upgrade to v0.84 #545

Closed uriinf closed 3 months ago

uriinf commented 3 months ago

The problem

After upgrade v0.83 -> v0.84 /v2/send with MODE=json-rpc started working too slow. It takes minutes (up to 30m). In same time with MODE=native it takes seconds. Here are 2 examples:

Sending messages, MODE=json-rpc, v0.84

$ time curl -H "Content-Type: application/json" -d '{"message": "Test locally, mode: json-rpc, v0.84", "number": "+XXX", "recipients": ["+YYY"]}' 'http://172.17.0.1:8081/v2/send'
{"timestamp":"1718267865850"}
real    10m30.165s

$ docker logs signal-cli-rest -n10
[GIN] 2024/06/13 - 11:39:14 | 201 |         6m17s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:43:08 | 201 |        10m18s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:48:08 | 201 |        27m11s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:48:12 | 201 |        25m45s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:48:15 | 201 |        10m29s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:48:15 | 201 |        24m18s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:48:41 | 201 |        29m14s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:52:41 | 201 |        15m43s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:54:24 | 201 |        24m19s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 11:58:23 | 201 |        20m34s |      172.17.0.1 | POST     "/v2/send"

Sending messages, MODE=native, v0.84

$ time curl -H "Content-Type: application/json" -d '{"message": "Test locally, mode: native, v0.84", "number": "+XXX", "recipients": ["+YYY"]}' 'http://172.17.0.1:8081/v2/send'
{"timestamp":"1718269475657"}
real    0m5.377s

$ docker logs signal-cli-rest -n10
[GIN] 2024/06/13 - 12:49:15 | 201 |  29.64228493s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 12:49:26 | 201 |  4.940284465s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 12:49:35 | 201 |  4.814757048s |      172.17.0.1 | POST     "/v2/send"
[GIN] 2024/06/13 - 12:49:59 | 400 | 28.723654262s |      172.17.0.1 | POST     "/v2/send"
time="2024-06-13T12:49:59+03:00" level=info msg="INFO  SignalAccount - Config file is in use by another instance, waiting…"
time="2024-06-13T12:49:59+03:00" level=info msg="INFO  SignalAccount - Config file lock acquired."
[GIN] 2024/06/13 - 12:50:05 | 201 | 26.047650006s |      172.17.0.1 | POST     "/v2/send"
time="2024-06-13T12:50:05+03:00" level=info msg="INFO  SignalAccount - Config file is in use by another instance, waiting…"
time="2024-06-13T12:50:05+03:00" level=info msg="INFO  SignalAccount - Config file lock acquired."
[GIN] 2024/06/13 - 12:50:21 | 201 |  5.005876921s |      172.17.0.1 | POST     "/v2/send"

P.S.: I've tried to downgrade to v0.83 to make sure that it is started after upgrade - but internal database isn't compatible after migration to v0.84

I've tried to use debug - but found nothing useful. Also, I have another setup with different account and v0.83 MODE=json-rpc, send requests processed in seconds. Please suggest how to troubleshoot the problem. Thanks

Are you using the latest released version?

Have you read the troubleshooting page?

What type of installation are you running?

signal-cli-rest-api Docker Container

In which mode are you using the docker container?

JSON-RPC Mode

What's the architecture of your host system?

x86-64

Additional information

No response

bbernhard commented 3 months ago

Please switch to native mode and enable the debug mode (https://github.com/bbernhard/signal-cli-rest-api/blob/master/doc/DEBUG.md). It outputs you the signal-cli commands and instructions on how to call them directly (without the REST API wrapper involved). That way we can maybe see where the additional delay comes from.

uriinf commented 3 months ago

Please switch to native mode and enable the debug mode (https://github.com/bbernhard/signal-cli-rest-api/blob/master/doc/DEBUG.md). It outputs you the signal-cli commands and instructions on how to call them directly (without the REST API wrapper involved). That way we can maybe see where the additional delay comes from.

Thanks, just tried this. But I have delay issues with json-rpc only. Anyway, here is example for native and signal-cli

signal-api@51a145ebc5e3:/$ time echo 'CURL vs mode native v0.84' | signal-cli-native --config /home/.local/share/signal-cli -a +XXX send --message-from-stdin +YYY --notify-self
1718274170438

real    0m4.740s
user    0m0.795s
sys     0m0.241s
bbernhard commented 3 months ago

Ah sorry, misread that. Unfortunately there isn't an easy way yet to debug json-rpc commands. What you could do (if you are familiar with that) is, connect to the docker container and do a strace on the signal-cli process to see where the time is lost.

Please also switch back to json-rpc mode and enable debug logging again and post the (obfuscated) output here. Although debugging json-rpc calls isn't implemented yet, it still outputs some debug logging.

uriinf commented 3 months ago

Found nothing interesting listening main container process:

# strace -p 1
strace: Process 1 attached
futex(0x1a0cf60, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xc0000aa148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0000aa148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a0e998, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x1a0e898, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a0e998, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a0e898, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a0e998, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x1a0e898, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a0e970, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a0e898, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a0cf60, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x1a0e998, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x1a0e898, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x1a0cf60, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x1a0cf60, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
epoll_pwait(4, [], 128, 0, NULL, 0)     = 0
futex(0x1a0cf60, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
epoll_pwait(4, [], 128, 0, NULL, 0)     = 0
futex(0x1a0cf60, FUTEX_WAIT_PRIVATE, 0, NULL

But what is interesting - java process is constantly consuming whole CPU (99.6% here):

# ps aux|grep java
1000       1913 99.6  2.2 7667936 224616 ?      Sl   16:11  20:38 java -classpath /opt/signal-cli-0.13.4/lib/signal-cli-0.13.4.jar:/opt/signal-cli-0.13.4/lib/lib.jar:/opt/signal-cli-0.13.4/lib/bcprov-jdk18on-1.78.1.jar:/opt/signal-cli-0.13.4/lib/signal-service-java-2.15.3_unofficial_103.jar:/opt/signal-cli-0.13.4/lib/jackson-module-kotlin-2.17.1.jar:/opt/signal-cli-0.13.4/lib/jackson-annotations-2.17.1.jar:/opt/signal-cli-0.13.4/lib/jackson-core-2.17.1.jar:/opt/signal-cli-0.13.4/lib/jackson-databind-2.17.1.jar:/opt/signal-cli-0.13.4/lib/argparse4j-0.9.0.jar:/opt/signal-cli-0.13.4/lib/dbus-java-transport-native-unixsocket-5.0.0.jar:/opt/signal-cli-0.13.4/lib/jul-to-slf4j-2.0.13.jar:/opt/signal-cli-0.13.4/lib/logback-classic-1.5.6.jar:/opt/signal-cli-0.13.4/lib/dbus-java-core-5.0.0.jar:/opt/signal-cli-0.13.4/lib/sqlite-jdbc-3.46.0.0.jar:/opt/signal-cli-0.13.4/lib/HikariCP-5.1.0.jar:/opt/signal-cli-0.13.4/lib/slf4j-api-2.0.13.jar:/opt/signal-cli-0.13.4/lib/logback-core-1.5.6.jar:/opt/signal-cli-0.13.4/lib/jsr305-3.0.2.jar:/opt/signal-cli-0.13.4/lib/okhttp-4.12.0.jar:/opt/signal-cli-0.13.4/lib/wire-runtime-jvm-4.4.3.jar:/opt/signal-cli-0.13.4/lib/core-util-jvm-2.15.3_unofficial_103.jar:/opt/signal-cli-0.13.4/lib/kotlin-stdlib-jdk7-1.9.20.jar:/opt/signal-cli-0.13.4/lib/okio-jvm-3.9.0.jar:/opt/signal-cli-0.13.4/lib/kotlin-reflect-1.9.20.jar:/opt/signal-cli-0.13.4/lib/kotlin-stdlib-1.9.21.jar:/opt/signal-cli-0.13.4/lib/kotlin-stdlib-jdk8-1.9.20.jar:/opt/signal-cli-0.13.4/lib/libphonenumber-8.13.23.jar:/opt/signal-cli-0.13.4/lib/libsignal-client-0.47.0.jar:/opt/signal-cli-0.13.4/lib/rxjava-3.0.13.jar:/opt/signal-cli-0.13.4/lib/reactive-streams-1.0.3.jar:/opt/signal-cli-0.13.4/lib/annotations-13.0.jar org.asamk.signal.Main --output=json --config /home/.local/share/signal-cli/ jsonRpc
uriinf commented 3 months ago

Please also switch back to json-rpc mode and enable debug logging again and post the (obfuscated) output here.

time="2024-06-13T16:38:49+03:00" level=debug msg="json-rpc command: {\"jsonrpc\":\"2.0\",\"method\":\"send\",\"id\":\"8bb5d0ef-891e-44b8-adce-2d98671e93b1\",\"params\":{\"recipient\":[\"+XXX\"],\"message\":\"CURL vs mode json-rpc v0.84\",\"notify-self\":true,\"account\":\"+YYY\"}}"
time="2024-06-13T16:41:58+03:00" level=debug msg="json-rpc received data: {\"jsonrpc\":\"2.0\",\"result\":{\"timestamp\":1718285929392,\"results\":[{\"recipientAddress\":{\"uuid\":\"005ba546-2683-4b1f-aae6-50c20ca07dde\",\"number\":\"+XXX\"},\"type\":\"SUCCESS\"}]},\"id\":\"8bb5d0ef-891e-44b8-adce-2d98671e93b1\"}\n"
[GIN] 2024/06/13 - 16:41:58 | 201 |          3m8s |     10.10.10.65 | POST     "/v2/send"
bbernhard commented 3 months ago

Okay, it is a bit cumbersome, but could you try the following:

Inside the docker container, run:

Replace XXX and YYY accordingly. This removes the REST API layer completely and sends the message via signal-cli directly.

uriinf commented 3 months ago

I have to start container with /bin/bash as entrypoint because killing of master process is causing container restart. Anyway, behavior is similar - sending takes minutes and java process consuming 100% CPU. Here is log:

$ docker exec -it signal-cli-rest /bin/bash
# su - signal-api
$ signal-cli --output=json --config /home/.local/share/signal-cli jsonRpc
{"jsonrpc":"2.0","method":"send","id":"8bb5d0ef-891e-44b8-adce-2d98671e93b1","params":{"recipient":["+RECIPIENT"],"message":"CURL vs mode json-rpc v0.84","notify-self":true,"account":"+ACCOUNT"}}
{"jsonrpc":"2.0","method":"receive","params":{"exception":{"message":"org.signal.libsignal.protocol.InvalidMessageException: invalid Whisper message: decryption failed","type":"ProtocolInvalidMessageException"},"envelope":{"source":"ANOTHER_NUMBER","sourceNumber":"ANOTHER_NUMBER","sourceUuid":"79f4d969-c699-45ca-9573-172af79d96b1","sourceName":"????? ?.","sourceDevice":1,"timestamp":1715959293759},"account":"+ACCOUNT"}}
{"jsonrpc":"2.0","result":{"timestamp":1718700182064,"results":[{"recipientAddress":{"uuid":"005ba546-2683-4b1f-aae6-50c20ca07dde","number":"+RECIPIENT"},"type":"SUCCESS"}]},"id":"8bb5d0ef-891e-44b8-adce-2d98671e93b1"}
{"jsonrpc":"2.0","result":{"timestamp":1718700247577,"results":[{"recipientAddress":{"uuid":"005ba546-2683-4b1f-aae6-50c20ca07dde","number":"+RECIPIENT"},"type":"SUCCESS"}]},"id":"8bb5d0ef-891e-44b8-adce-2d98671e93b1"}
{"jsonrpc":"2.0","method":"receive","params":{"exception":{"message":"org.signal.libsignal.protocol.InvalidMessageException: invalid Whisper message: decryption failed","type":"ProtocolInvalidMessageException"},"envelope":{"source":"ANOTHER_NUMBER","sourceNumber":"ANOTHER_NUMBER","sourceUuid":"79f4d969-c699-45ca-9573-172af79d96b1","sourceName":"????? ?.","sourceDevice":1,"timestamp":1715959293793},"account":"+ACCOUNT"}}

I suspect that something wrong either with local database or Signal account, maybe some huge history, etc.. Will try to remove local data and link signal-api from scratch.

bbernhard commented 3 months ago

yeah, that sounds likely. Yes, please try to start from scratch to see if that issue is still there. If so, then I think it would be best to open an issue in the signal-cli repository.

uriinf commented 3 months ago

Will try to remove local data and link signal-api from scratch.

This is worked. Sending in json-rpc mode takes seconds again. Will open an issue in signal-cli if the problem appears again.