grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"

geotransformer commented 3 years ago

ETCD: 3.3.17 K8S: 1.16.2 Calico: v3.5.2

1> Two etcd clusters are installed alongside Kubernetes v1.16.2 (configured with an external etcd cluster), on the same three VMs in OpenStack.

2> Restarted the apiserver/scheduler containers to renew certs. The apiserver appeared to be back about 5 seconds later.

 Apr 15 10:29:43 xxx-master1 5e69953db08f[7886]: I0415 10:29:43.947496  controller.go:182] Shutting down kubernetes service endpoint reconciler
 Apr 15 10:29:49 xxx-master1 5e69953db08f[7886]: I0415 10:29:49.556440       1 controller.go:130] OpenAPI AggregationController: action for item k8s_internal_local_delegation_chain_0000000000: Nothing (removed from the queue).

3> Observed the following logs on the two etcd clusters:

 etcd cluster 1
 2021-04-15 10:29:48.355973 W | etcdserver: read-only range request "key:xxxx" " with result "range_response_count:0 size:8" took too long (328.069421ms) to execute
 WARNING: 2021/04/15 10:29:54 grpc: Server.processUnaryRPC failed to write connection error: desc = "transport is closing" 

 etcd cluster 2
 2021-04-15 10:29:48.324249 W | etcdserver: read-only range request "key:xxxx" " with result "range_response_count:649 size:299214" took too long (675.913129ms) to execute
 WARNING: 2021/04/15 10:29:53 grpc: Server.processUnaryRPC failed to write status connection error: desc = "transport is closing"

Questions:
1> Did the apiserver restart cause these logs? The log timing showed they are related closely.
2> Did these logs indicate that the other service cannot connect to ETCD services?
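
One rough way to check the timing relationship in question 1 is to compute the gap between the apiserver shutdown line and each etcd gRPC warning. This is a minimal sketch: the timestamps are copied from the excerpts above, and it assumes the hosts have synchronized clocks in the same time zone, which the thread does not confirm. The helper name `seconds_after` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the log excerpts above (all on 2021-04-15).
# Assumption: all hosts share a synchronized clock and time zone.
apiserver_shutdown = datetime(2021, 4, 15, 10, 29, 43, 947496)
etcd_warnings = {
    "etcd cluster 1": datetime(2021, 4, 15, 10, 29, 54),
    "etcd cluster 2": datetime(2021, 4, 15, 10, 29, 53),
}


def seconds_after(reference, event):
    """Return how many seconds `event` occurred after `reference`."""
    return (event - reference).total_seconds()


gaps = {
    name: seconds_after(apiserver_shutdown, ts)
    for name, ts in etcd_warnings.items()
}
for name, gap in gaps.items():
    print(f"{name}: gRPC warning logged {gap:.1f}s after the apiserver shutdown line")
```

A gap of roughly 9 to 10 seconds is consistent with the restart being related, but it does not prove causation on its own.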

Thanks,

xiang90 commented 3 years ago

1> Did the apiserver restart cause these logs? The log timing showed they are related closely.

Probably. With the limited information, I am not 100% sure.

2> Did these logs indicate that the other service cannot connect to ETCD services?

It indicates that etcd failed to write the data back to the client because the connection was already closing. It is not the other way around, as you suggested.

opsarno commented 3 years ago

@xiang90 Hi, I have the same issue.

Member list

+------------------+---------+-------+------------------------+------------------------+
|        ID        | STATUS  | NAME  |       PEER ADDRS       |      CLIENT ADDRS      |
+------------------+---------+-------+------------------------+------------------------+
| 26ebcf7897bb67ac | started | node1 | http://10.1.1.108:2380 | http://10.1.1.108:2379 |
| 9ca0ae4f66daf77f | started | node3 | http://10.1.1.110:2380 | http://10.1.1.110:2379 |
| b0028d2d9dfb7f73 | started | node2 | http://10.1.1.109:2380 | http://10.1.1.109:2379 |
+------------------+---------+-------+------------------------+------------------------+

Endpoint health

+-----------------+--------+-----------+-------+
|    ENDPOINT     | HEALTH |   TOOK    | ERROR |
+-----------------+--------+-----------+-------+
| 10.1.1.109:2379 |   true |  1.6666ms |       |
| 10.1.1.110:2379 |   true | 1.36167ms |       |
| 10.1.1.108:2379 |   true | 970.217µs |       |
+-----------------+--------+-----------+-------+

Endpoint status

+-----------------+------------------+---------+---------+-----------+-----------+------------+
|    ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-----------------+------------------+---------+---------+-----------+-----------+------------+
| 10.1.1.108:2379 | 26ebcf7897bb67ac |  3.2.32 |  199 MB |      true |         7 |     198179 |
| 10.1.1.109:2379 | b0028d2d9dfb7f73 |  3.2.32 |  199 MB |     false |         7 |     198181 |
| 10.1.1.110:2379 | 9ca0ae4f66daf77f |  3.2.32 |  199 MB |     false |         7 |     198182 |
+-----------------+------------------+---------+---------+-----------+-----------+------------+

Log (WARN/ERR)


Aug 08 19:58:37 sys-etcd-01 etcd[20631]: WARNING: 2021/08/08 19:58:37 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"

Aug 08 19:58:37 sys-etcd-01 etcd[20631]: read-only range request "key:\"/mytest/grpc_endpoints/imkf_bg\" " with result "error:auth: invalid auth token" took too long (9.995255181s) to execute
Aug 08 19:58:37 sys-etcd-01 etcd[20631]: read-only range request "key:\"/mytest/sql_log/\" range_end:\"/mytest/sql_log0\" " with result "error:auth: invalid auth token" took too long (9.996618998s) to execute

Aug 08 19:58:37 sys-etcd-01 etcd[20631]: WARNING: 2021/08/08 19:58:37 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"



Questions:
1> `grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"` What causes this?
2> `error:auth: invalid auth token` Is this because the client is using the wrong password?
3> How can the IP address of each client request be shown in the log?

chlam4 commented 3 years ago

I recently encountered the same error, quoted below. In my case it was caused by the etcd default storage size quota (https://etcd.io/docs/v3.3/dev-guide/limit/): my etcd cluster was running out of storage. I was able to fix the problem by increasing the quota and then disabling the alarm (https://help.compose.com/docs/etcd-alarms-and-status-messages-etcd3).

grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
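
To judge whether the storage quota is a plausible cause on a given cluster, one can compare the DB size reported by `etcdctl endpoint status` with the configured quota. The sketch below is only illustrative: the helper `quota_usage` is hypothetical, the default quota is taken as 2 GiB based on the "2GB" default described on the linked limits page, and the "199 MB" figure from the endpoint status table earlier in this thread is read as 199 * 10**6 bytes.

```python
# Default backend quota: "2GB" per the linked limits page.
# Assumption: interpreted here as 2 GiB.
DEFAULT_QUOTA_BYTES = 2 * 1024**3


def quota_usage(db_size_bytes, quota_bytes=DEFAULT_QUOTA_BYTES):
    """Return the fraction of the backend quota consumed by the database."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    return db_size_bytes / quota_bytes


# Assumption: "199 MB" from the endpoint status table, taken as 199 * 10**6 bytes.
db_size = 199 * 10**6
usage = quota_usage(db_size)
print(f"{usage:.1%} of the default quota in use")
```

Under these assumptions the 199 MB cluster is far below the default quota, so the quota fix may not apply there. Running `etcdctl alarm list` would show whether a space alarm is actually raised.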

eadou commented 2 years ago

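I recently encountered the same error, quoted below. It was caused by the etcd default storage size quota (https://etcd.io/docs/v3.3/dev-guide/limit/): my etcd cluster was running out of storage. I was able to fix the problem by increasing the quota and then disabling the alarm (https://help.compose.com/docs/etcd-alarms-and-status-messages-etcd3).

grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"

I changed the parameters --auto-compaction-retention=1 --max-request-bytes=10485760 --quota-backend-bytes=8589934592, but it had no effect. The problem persists and etcd still reports the error: grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"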
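
For readers checking these values, the byte-valued flags can be converted to readable sizes. This is a small illustrative sketch: `human_size` is a hypothetical helper and the flag string is copied from the comment above.

```python
def human_size(num_bytes):
    """Format a byte count using binary units (B, KiB, MiB, GiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:g} {unit}"
        size /= 1024


flags = (
    "--auto-compaction-retention=1 "
    "--max-request-bytes=10485760 "
    "--quota-backend-bytes=8589934592"
)

# Split "--name=value" pairs into a dictionary keyed by flag name.
parsed = dict(item.lstrip("-").split("=", 1) for item in flags.split())

for name in ("max-request-bytes", "quota-backend-bytes"):
    print(f"--{name} = {human_size(int(parsed[name]))}")
```

The quota here works out to 8 GiB and the request limit to 10 MiB. Since raising both did not help, the error in this case probably has a different cause than the one chlam4 hit.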

etcd-io / etcd

grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing" #12895