openimsdk / open-im-server

IM Chat
https://openim.io
Apache License 2.0
13.94k stars 2.45k forks

[BUG] Intermittent Errors and Message Delivery Issues #2710

Closed zhaolibo1989 closed 24 minutes ago

zhaolibo1989 commented 2 days ago

OpenIM Server Version

3.8.0

Operating System and CPU Architecture

Linux (AMD)

Deployment Method

Source Code Deployment

Bug Description and Steps to Reproduce

Subject: [BUG] Intermittent Error Log and Online Message Delivery Issues with Additional Logs

Hello Team,

I hope this message finds you well. I'm writing to report a bug that I've encountered with the Open-IM-Server project. This issue is intermittent and seems to resolve itself after multiple restarts and tests. However, I haven't been able to identify the specific conditions that trigger it. I've included additional logs that may help in diagnosing the problem.

Error Log Details:

2024-10-12 15:40:36.538 ERROR   [PID:2740202]   openim-msggateway               [version:3.8.0]         [msggateway/online.go:91]                               update user online status                               {"operationID": "p_2740202_64", "error": "14 last resolver error: produced zero addresses"}

Additional Logs:

2024-10-12 15:40:36.538 DEBUG   [PID:2740202]   openim-msggateway               [version:3.8.0]         [mw/rpc_client_interceptor.go:44]                       RPC Client Request - setUserOnlineStatus                {"operationID": "p_2740202_64", "funcName": "/openim.user.user/setUserOnlineStatus", "req": "status:{userID:\"10004\" offline:2}", "conn target": "openim:///user"}
2024-10-12 15:40:36.538 ERROR   [PID:2740202]   openim-msggateway               [version:3.8.0]         [mw/rpc_client_interceptor.go:50]                       RPC Client Response Error - setUserOnlineStatus         {"operationID": "p_2740202_64", "funcName": "/openim.user.user/setUserOnlineStatus", "error": "rpc error: code = Unavailable desc = last resolver error: produced zero addresses"}

Additional Observed Behavior:

Environment:

Configuration File (config/discovery.yml):

enable: "zookeeper"
etcd:
  rootDirectory: openim
  address: [ localhost:12379 ]
  username: ''
  password: ''

zookeeper:
  schema: openim
  address: [ localhost:12181 ]
  username: ''
  password: ''

Observations:

Expected Behavior:

Additional Information:

Additional Information from ZooKeeper: Upon inspecting the ZooKeeper instances for user information when the issue occurred, everything appeared to be in order. Here are the details from the ZooKeeper inspection:

[zk: localhost:2181(CONNECTED) 1] ls /
[openim, zookeeper]

[zk: localhost:2181(CONNECTED) 2] ls /openim
[auth, conversation, friend, group, messageGateway, msg, push, third, user]

[zk: localhost:2181(CONNECTED) 3] ls /openim/user
[_c_094db5b509635cc5977b4fd3a19b7d77-172.16.128.207:10110_0000000034]

[zk: localhost:2181(CONNECTED) 4] ls /openim/third
[_c_3d77825639b1da08a653af8d46359bb1-172.16.128.207:10190_0000000034]

[zk: localhost:2181(CONNECTED) 5] ls /openim/push
[_c_dc5344398ff94dac1eb555ba8f646aea-172.16.128.207:10170_0000000034]

[zk: localhost:2181(CONNECTED) 6] ls /openim/msg
[_c_9ad223c13cc9ccb4c48949234ad46bcd-172.16.128.207:10130_0000000034]

[zk: localhost:2181(CONNECTED) 7] ls /openim/auth
[_c_30a13216ec492924de1abc13d8f6d372-172.16.128.207:10160_0000000034]

[zk: localhost:2181(CONNECTED) 9] get /openim/user/_c_094db5b509635cc5977b4fd3a19b7d77-172.16.128.207:10110_0000000034
172.16.128.207:10110

The ZooKeeper nodes and their corresponding values were retrieved successfully, and no anomalies were detected in the user-related data. However, the presence of the error log and the inability of users to receive online messages suggest that there might be an issue with the service's interaction with ZooKeeper or other underlying components.
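
The manual check above (comparing the address embedded in each node name with the value stored in the node) can be scripted. The sketch below assumes the node name layout `_c_<32 hex chars>-<host:port>_<10-digit sequence>`, which is inferred only from the listing in this report and not from ZooKeeper or OpenIM documentation. `addrFromNode` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
)

// nodePattern matches child names shaped like the ones listed in the report.
// The layout is an assumption inferred from that listing.
var nodePattern = regexp.MustCompile(`^_c_[0-9a-f]{32}-(.+)_(\d{10})$`)

// addrFromNode extracts the host:port embedded in a registration node name
// and checks that it is syntactically a valid host:port pair.
func addrFromNode(name string) (string, error) {
	m := nodePattern.FindStringSubmatch(name)
	if m == nil {
		return "", fmt.Errorf("unexpected node name %q", name)
	}
	if _, _, err := net.SplitHostPort(m[1]); err != nil {
		return "", fmt.Errorf("node %q: %w", name, err)
	}
	return m[1], nil
}

func main() {
	name := "_c_094db5b509635cc5977b4fd3a19b7d77-172.16.128.207:10110_0000000034"
	data := "172.16.128.207:10110" // value returned by `get` on that node

	addr, err := addrFromNode(name)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Compare the address in the node name with the node's stored value.
	fmt.Println(addr, addr == data) // prints "172.16.128.207:10110 true"
}
```

A match here only confirms that the registration data is consistent; it does not show whether the gateway's resolver actually received that address.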

I would appreciate any guidance on how to proceed or what additional information you need from me to help resolve this issue.

Thank you for your attention to this matter. I look forward to your response.

Best regards, Libo

Screenshots Link

No response

zhaolibo1989 commented 2 days ago

As the mage check shows, everything is OK:

root@ubuntu:~/openim/open-im-server# mage check
[2024-10-12 16:54:33 CST] All services are running normally.
[2024-10-12 16:54:33 CST] Display details of the ports listened to by the service:
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769072 is listening on ports: 20108
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 1 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769083 is listening on ports: 20109
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 2 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769090 is listening on ports: 20110
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 3 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769098 is listening on ports: 20111
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-crontask -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769054 is not listening on any ports.
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-conversation -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769105 is listening on ports: 10180, 20105
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-group -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769123 is listening on ports: 10150, 20103
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-push -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769061 is listening on ports: 20107, 10170
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-third -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769127 is listening on ports: 10190, 20101
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-user -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769057 is listening on ports: 10110, 20100
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-auth -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769116 is listening on ports: 10160, 20106
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-friend -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769144 is listening on ports: 20104, 10120
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-api -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769113 is listening on ports: 10002, 20113
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-msggateway -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769055 is listening on ports: 20112, 10001, 10140
[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-msg -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769056 is listening on ports: 10130, 20102
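
The output above lists processes and their listening ports. To compare those ports with the addresses registered in ZooKeeper, the lines can be parsed as in this rough sketch. The line format is taken from the output in this comment only, and `listeningPorts` is a hypothetical helper, not part of `mage` or OpenIM.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// portsPattern captures the binary name and the comma-separated port list
// from a line of the check output shown in this comment.
var portsPattern = regexp.MustCompile(`/(openim-[a-z-]+) .*is listening on ports: ([0-9, ]+)$`)

// listeningPorts returns the service binary name and its ports.
// ok is false for lines that do not report listening ports.
func listeningPorts(line string) (name string, ports []int, ok bool) {
	m := portsPattern.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return "", nil, false
	}
	for _, field := range strings.Split(m[2], ",") {
		p, err := strconv.Atoi(strings.TrimSpace(field))
		if err != nil {
			return "", nil, false
		}
		ports = append(ports, p)
	}
	return m[1], ports, true
}

func main() {
	line := "[2024-10-12 16:54:34 CST] Cmdline: /home/ubuntu/openim/open-im-server/_output/bin/platforms/linux/amd64/openim-rpc-user -i 0 -c /home/ubuntu/openim/open-im-server/config/, PID: 2769057 is listening on ports: 10110, 20100"
	name, ports, ok := listeningPorts(line)
	fmt.Println(name, ports, ok) // prints "openim-rpc-user [10110 20100] true"
}
```

For `openim-rpc-user`, port 10110 in this output matches the address `172.16.128.207:10110` registered under `/openim/user`.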

skiffer-git commented 23 hours ago

Change enable: "zookeeper" to enable: "etcd".

OpenIM-Robot commented 23 hours ago

Bot detected the issue body's language is not English, translate it automatically.

Change enable: "zookeeper" to enable: "etcd".

skiffer-git commented 23 hours ago

There's an issue with "zookeeper," and we've removed it.