openimsdk / open-im-server

IM Chat ChatGPT
https://openim.io
Apache License 2.0

[BUG] last resolver error: produced zero addresses #2529

Open happy2wh666 opened 3 months ago

happy2wh666 commented 3 months ago

OpenIM Server Version

release-v3.8

Operating System and CPU Architecture

Linux (AMD)

Deployment Method

Source Code Deployment

Bug Description and Steps to Reproduce

The API /auth/user_token is called once per minute. After the server starts, the calls succeed. After a while they start failing:

2024-08-18 17:07:19.831 | 2024-08-18 09:07:19.768   ERROR   [PID:3693]      openim-api                  [version:3.8.0]     [mw/rpc_client_interceptor.go:50]                   RPC Client Response Error - userToken               {"operationID": "8b2f17f7792ecd1e", "funcName": "/openim.auth.Auth/userToken", "error": "rpc error: code = Unavailable desc = last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 74.48.52.186:10160: connect: connection refused\"; last resolver error: produced zero addresses"}

Screenshots Link

No response

skiffer-git commented 3 months ago

First, run "mage check", then review its output.

happy2wh666 commented 3 months ago

root@host1:/openim# mage check
[2024-08-18 14:10:01 UTC] All services are running normally.
[2024-08-18 14:10:01 UTC] Display details of the ports listened to by the service:
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-push -i 0 -c /openim/config/, PID: 4069 is listening on ports: 20107, 10170
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-rpc-conversation -i 0 -c /openim/config/, PID: 4090 is listening on ports: 20105, 10180
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-rpc-friend -i 0 -c /openim/config/, PID: 4126 is listening on ports: 20104, 10120
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-rpc-third -i 0 -c /openim/config/, PID: 4131 is listening on ports: 20101, 10190
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-rpc-user -i 0 -c /openim/config/, PID: 4068 is listening on ports: 10110, 20100
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-rpc-group -i 0 -c /openim/config/, PID: 4121 is listening on ports: 20103, 10150
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-rpc-msg -i 0 -c /openim/config/, PID: 4070 is listening on ports: 20102, 10130
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 0 -c /openim/config/, PID: 4100 is listening on ports: 20108
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 1 -c /openim/config/, PID: 4105 is listening on ports: 20109
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 2 -c /openim/config/, PID: 4110 is listening on ports: 20110
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-msgtransfer -i 3 -c /openim/config/, PID: 4116 is listening on ports: 20111
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-rpc-auth -i 0 -c /openim/config/, PID: 4096 is listening on ports: 20106, 10160
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-api -i 0 -c /openim/config/, PID: 4080 is listening on ports: 20113, 10002
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-crontask -i 0 -c /openim/config/, PID: 4074 is not listening on any ports.
[2024-08-18 14:10:02 UTC] Cmdline: /openim/_output/bin/platforms/linux/amd64/openim-msggateway -i 0 -c /openim/config/, PID: 4084 is listening on ports: 20112, 10001, 10140

happy2wh666 commented 3 months ago

2024-08-18 14:11:17.342 INFO    [PID:4096]      openim-rpc-auth                 [version:3.8.0]         [mw/rpc_server_interceptor.go:48]                       RPC Server Request - UserToken                          {"operationID": "8b2f17f7792ecd1e", "funcName": "/openim.auth.Auth/UserToken", "req": "secret:\"666\"  platformID:1  userID:\"imAdmin\""}
2024-08-18 14:11:17.343 DEBUG   [PID:4096]      openim-rpc-auth                 [version:3.8.0]         [mw/rpc_client_interceptor.go:44]                       RPC Client Request - getDesignateUsers                  {"operationID": "8b2f17f7792ecd1e", "funcName": "/openim.user.user/getDesignateUsers", "req": "userIDs:\"imAdmin\"", "conn target": "etcd:///openim/user"}
2024-08-18 14:11:17.344 ERROR   [PID:4096]      openim-rpc-auth                 [version:3.8.0]         [mw/rpc_client_interceptor.go:50]                       RPC Client Response Error - getDesignateUsers           {"operationID": "8b2f17f7792ecd1e", "funcName": "/openim.user.user/getDesignateUsers", "error": "rpc error: code = Unavailable desc = last resolver error: produced zero addresses"}
2024-08-18 14:11:17.344 WARN    [PID:4096]      openim-rpc-auth                 [version:3.8.0]         [mw/rpc_server_interceptor.go:97]                       rpc server resp WithDetails error                       {"operationID": "8b2f17f7792ecd1e", "funcName": "/openim.auth.Auth/UserToken", "error": "Error: 14 last resolver error: produced zero addresses | Error trace: 1 (/go/pkg/mod/google.golang.org/grpc@v1.62.1/server.go:1027) -> handleStream (/go/pkg/mod/google.golang.org/grpc@v1.62.1/server.go:1797) -> processUnaryRPC (/go/pkg/mod/google.golang.org/grpc@v1.62.1/server.go:1386) -> _Auth_UserToken_Handler (/go/pkg/mod/github.com/openimsdk/protocol@v0.0.69/auth/auth.pb.go:973) -> func1 (/go/pkg/mod/google.golang.org/grpc@v1.62.1/server.go:1194) -> RpcServerInterceptor (/go/pkg/mod/github.com/openimsdk/tools@v0.0.49-alpha.55/mw/rpc_server_interceptor.go:53) -> func1 (/go/pkg/mod/google.golang.org/grpc@v1.62.1/server.go:1203) -> func7 (/openim/pkg/common/startrpc/start.go:199) -> func1 (/go/pkg/mod/github.com/openimsdk/protocol@v0.0.69/auth/auth.pb.go:971) -> UserToken (/openim/internal/rpc/auth/auth.go:78) -> GetUserInfo (/openim/pkg/rpcclient/user.go:90) -> GetUsersInfo (/openim/pkg/rpcclient/user.go:74) -> GetDesignateUsers (/go/pkg/mod/github.com/openimsdk/protocol@v0.0.69/user/user.pb.go:5304) -> Invoke (/go/pkg/mod/google.golang.org/grpc@v1.62.1/call.go:35) -> RpcClientInterceptor (/go/pkg/mod/github.com/openimsdk/tools@v0.0.49-alpha.55/mw/rpc_client_interceptor.go:66) -> Wrap (/go/pkg/mod/github.com/openimsdk/tools@v0.0.49-alpha.55/errs/coderr.go:74) -> Wrap (/go/pkg/mod/github.com/openimsdk/tools@v0.0.49-alpha.55/errs/coderr.go:126)"}
2024-08-18 14:11:17.344 WARN    [PID:4096]      openim-rpc-auth                 [version:3.8.0]         [mw/rpc_server_interceptor.go:116]                      RPC Server Response Error - UserToken                   {"operationID": "8b2f17f7792ecd1e", "funcName": "/openim.auth.Auth/UserToken", "req": "secret:\"666\"  platformID:1  userID:\"imAdmin\"", "err": "<nil>", "error": "rpc error: code = Unavailable desc = 14 last resolver error: produced zero addresses"}
2024-08-18 14:11:17.345 ERROR   [PID:4080]      openim-api                      [version:3.8.0]         [mw/rpc_client_interceptor.go:50]                       RPC Client Response Error - userToken                   {"operationID": "8b2f17f7792ecd1e", "funcName": "/openim.auth.Auth/userToken", "error": "rpc error: code = Unavailable desc = 14 last resolver error: produced zero addresses"}

skiffer-git commented 3 months ago

Run "mage check" at the moment the "produced zero addresses" error appears.

skiffer-git commented 3 months ago

How did you set it up? What changes did you make to the settings? Can you explain it step by step?

happy2wh666 commented 3 months ago

The mage check output above was captured after the error appeared in the logs.

I use a Docker-based Golang environment to compile and run the server:

services:
  openim:
    image: golang
    container_name: openim
    user: root
    privileged: true
    volumes:
      - "/openim:/openim"
    restart: always
    network_mode: "host"
    command: /openim/happy.sh

Container startup script (happy.sh):

usm@/openim$ cat ./happy.sh
#!/bin/sh
cd /openim
bash bootstrap.sh
if [ ! -e skip_build ]
then
  mage
fi
touch skip_build
mage start
mage check
tail -f /dev/stdout

skiffer-git commented 3 months ago

I suggest you use the source code for the deployment.

skiffer-git commented 3 months ago

Can you tell me if the IP address 74.48.52.186 is a public one or just an internal network address?

happy2wh666 commented 3 months ago

> I suggest you use the source code for the deployment.

The server is compiled from source code. Both compilation and execution happen inside a Docker container using the "golang" image.

> Can you tell me if the IP address 74.48.52.186 is a public one or just an internal network address?

It is a public address, behind a Linux firewall.

skiffer-git commented 3 months ago

There's no need to use a public IP address. An internal IP address is enough for this setup.

happy2wh666 commented 3 months ago

Is this issue related to the internal IP or public IP? The point is that the server was running normally at the beginning, but the error occurred after a period of time. Restarting the server made it normal again, but then the error occurred again after some time.

skiffer-git commented 3 months ago

When the error occurs, check the data in etcd.
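
A minimal sketch of such a check, assuming services register under keys of the form `openim/<service>/<host:port>`, as in the key listing at the end of this thread. `count_endpoints` is a hypothetical helper, not an OpenIM or etcd tool; it counts the keys for one service in the output of `etcdctl get "" --prefix --keys-only`. A count of 0 for `user` at the moment of failure would be consistent with the resolver reporting zero addresses for `etcd:///openim/user`.

```shell
#!/bin/sh
# count_endpoints SERVICE
# Reads etcd key names on stdin (one per line) and prints how many of them
# sit under openim/SERVICE/. The "openim/" root is an assumption taken from
# the key listing in this thread; adjust it if your deployment differs.
count_endpoints() {
  # grep -c prints the count but exits non-zero when the count is 0,
  # so "|| true" keeps the function usable under "set -e".
  grep -c "^openim/$1/" || true
}

# Against a live etcd the helper would be fed like this (not run here):
#   etcdctl get "" --prefix --keys-only | count_endpoints user

# Demo with sample keys copied from the listing in this thread.
sample_keys='openim/auth/10.3.0.11:10200
openim/push/10.3.0.11:10170
openim/push/10.3.0.11:10171
openim/user/10.3.0.11:10320'

for svc in auth push user group; do
  n=$(printf '%s\n' "$sample_keys" | count_endpoints "$svc")
  echo "$svc: $n endpoint(s)"
done
```

Running the real command once while calls succeed and again right after the error appears would show whether a registration disappeared in between.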

happy2wh666 commented 3 months ago

I kept the following command running for a long time; it only output PUT and DELETE events.

I have no name!@host:/opt/bitnami/etcd$ etcdctl watch / --prefix

while read -r line; do
    echo "New key: $line"
done

PUT
/check_openim_component

DELETE
/check_openim_component

PUT
/check_openim_component

DELETE
/check_openim_component

...
...
...

skiffer-git commented 3 months ago

Try watching /openim.

happy2wh666 commented 3 months ago

watch / --prefix already includes watch /openim
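
That holds only for keys that literally begin with `/`. etcd prefix matching is a plain byte-prefix comparison, so if the services are registered as `openim/<service>/<host:port>` with no leading slash (the layout in the key listing at the end of this thread, which may or may not match this deployment), a watch on `/` would report `/check_openim_component` but never the service registrations. A minimal sketch of that comparison; `has_prefix` is a hypothetical helper, not part of etcdctl:

```shell
#!/bin/sh
# has_prefix PREFIX KEY
# Succeeds when KEY starts with PREFIX, compared literally, which is how
# a prefix watch or prefix get selects keys.
has_prefix() {
  case "$2" in
    "$1"*) return 0 ;;  # quoted "$1" is matched literally, then anything
    *)     return 1 ;;
  esac
}

# Both sample keys are copied from this thread.
for key in "/check_openim_component" "openim/user/10.3.0.11:10320"; do
  if has_prefix "/" "$key"; then
    echo "watch / --prefix would report: $key"
  else
    echo "watch / --prefix would miss:   $key"
  fi
done
```

If the registrations do use that layout, `etcdctl watch openim/ --prefix` would be the watch that shows services registering and dropping out.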

happy2wh666 commented 3 months ago

I deployed the server and all other components (Redis, etcd, Kafka) on the same host, and the issue has not recurred.

The issue may be caused by a communication problem between the server and other components that has not been resolved. For example, the server may not have properly reconnected after losing connection with Redis.

skiffer-git commented 1 week ago

We'll be testing it soon. Thanks for bringing this up.

icey-yu commented 6 days ago

We deployed Mongo and Redis separately from the server and tested them for a while, but did not encounter this issue. You can try updating the code and testing again. If the issue persists, please provide more specific steps to reproduce the problem.

skiffer-git commented 6 days ago

> I kept the following command running for a long time; it only output PUT and DELETE events.
>
> I have no name!@host:/opt/bitnami/etcd$ etcdctl watch / --prefix
>
> while read -r line; do
>     echo "New key: $line"
> done
>
> PUT
> /check_openim_component
>
> DELETE
> /check_openim_component
>
> PUT
> /check_openim_component
>
> DELETE
> /check_openim_component
>
> ...
> ...
> ...

etcdctl get "" --prefix --keys-only

openim/admin/10.3.0.11:30200

openim/auth/10.3.0.11:10200

openim/chat/10.3.0.11:30300

openim/conversation/10.3.0.11:10220

openim/encryption/10.3.0.11:10500

openim/friend/10.3.0.11:10240

openim/group/10.3.0.11:10260

openim/meeting/10.3.0.11:10112

openim/messageGateway/10.3.0.11:10140

openim/msg/10.3.0.11:10280

openim/office/10.3.0.11:30400

openim/organization/10.3.0.11:30500

openim/push/10.3.0.11:10170

openim/push/10.3.0.11:10171

openim/push/10.3.0.11:10172

openim/push/10.3.0.11:10173

openim/push/10.3.0.11:10174

openim/push/10.3.0.11:10175

openim/push/10.3.0.11:10176

openim/push/10.3.0.11:10177

openim/signal/10.3.0.11:10212

openim/third/10.3.0.11:10300

openim/user/10.3.0.11:10320