LiChengZiMu opened this issue 2 years ago
How are you connecting? One client per etcd node?
A single client supports multiple endpoints; if one node fails, it will automatically fall back to the other two: https://github.com/etcd-cpp-apiv3/etcd-cpp-apiv3#multiple-endpoints
You can also use SetServiceConfigJSON (https://grpc.github.io/grpc/cpp/classgrpc_1_1_channel_arguments.html#ae9399219c13808b45f3acad088fb0981) to achieve the effect shown in your screenshot; the client constructor accepts a ChannelArguments object.
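To make that concrete, here is a minimal sketch of passing a ChannelArguments with a retry policy into the client. It assumes the etcd::Client constructor overload that accepts a grpc::ChannelArguments, as described above; the retry-policy values are only illustrative and should be tuned for your deployment.

#include <string>

#include <grpc++/grpc++.h>

#include "etcd/Client.hpp"

int main() {
  // One client for all three endpoints; failover happens inside the client.
  std::string endpoints =
      "http://127.0.0.1:2379,http://127.0.0.1:2389,http://127.0.0.1:2399";

  // Illustrative retry policy for the Lease keep-alive RPC; tune as needed.
  std::string service_config = R"({
    "methodConfig": [{
      "name": [{"service": "etcdserverpb.Lease", "method": "LeaseKeepAlive"}],
      "retryPolicy": {
        "maxAttempts": 5,
        "initialBackoff": "0.1s",
        "maxBackoff": "1s",
        "backoffMultiplier": 2.0,
        "retryableStatusCodes": ["UNAVAILABLE"]
      }
    }]
  })";

  grpc::ChannelArguments grpc_args;
  grpc_args.SetServiceConfigJSON(service_config);
  grpc_args.SetInt(GRPC_ARG_ENABLE_RETRIES, 1);

  // Assumes the constructor overload that accepts ChannelArguments,
  // as mentioned above.
  etcd::Client etcd(endpoints, grpc_args);
  return 0;
}

Note that such a retry policy only covers RPCs that gRPC can safely retry; as discussed further down in this thread, a broken bidirectional keep-alive stream still needs separate handling.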
#include "etcd/Client.hpp"
#include "etcd/KeepAlive.hpp"
#include "etcd/Response.hpp"
#include "etcd/SyncClient.hpp"
#include "etcd/Value.hpp"
#include "etcd/Watcher.hpp"
#include <grpc++/grpc++.h>
#include <grpc++/security/credentials.h>
using grpc::Channel;
int main(int argc, char **argv) {
std::string endpoints = "http://127.0.0.1:2379,http://127.0.0.1:2389,http://127.0.0.1:2399";
etcd::Client etcd(endpoints);
auto keepalive = etcd.leasekeepalive(5).get();
auto lease_id = keepalive->Lease();
std::cout << lease_id << std::endl;
std::string value = std::string("192.168.1.6:1880") + argv[1];
auto resp1 = etcd.campaign("/leader", lease_id, value).get();
if (0 == resp1.error_code()) {
std::cout << "became leader: " << resp1.index() << std::endl;
} else {
std::cout << "error code: " << resp1.error_code()
<< "error message: " << resp1.error_message() << std::endl;
assert(false);
}
std::cout << "finish campaign" << std::endl;
auto resp2 = etcd.leader("/leader").get();
std::cout << resp2.value().as_string() << std::endl;
std::cout << resp2.value().key() << std::endl;
std::cout << "finish leader" << std::endl;
while (true) {
keepalive->Check();
}
return 0;
}
I deployed a three-node etcd cluster on ports 2379, 2389 and 2399. When one etcd instance is restarted, keepalive->Check() in the test code above sometimes throws an exception, which terminates the program.
std::string configJson = "{\"methodConfig\": [{\"name\": [{\"service\": \"etcdserverpb.Lease\", \"method\": \"LeaseKeepAlive\"}], \"retryPolicy\": {\"maxAttempts\":5, \"initialBackoff\": \"0.1s\", \"maxBackoff\": \"1s\", \"backoffMultiplier\": 2.0, \"retryableStatusCodes\": [\"UNAVAILABLE\"]}}]}";
grpc_args.SetServiceConfigJSON(configJson);
grpc_args.SetInt(GRPC_ARG_ENABLE_RETRIES, 1);
I can reproduce the issue above.
Working on that.
KeepAlive fails to switch between subchannels because the stream is stateful and cannot be replayed.
It can be fixed by opening a new stream every time a refresh happens.
That is more costly, but it tolerates failures when multiple endpoints exist.
OK, thank you very much.
https://github.com/etcd-cpp-apiv3/etcd-cpp-apiv3/pull/135/commits @sighingnow maybe we just need to renew the keepalive object to get a new stream when a keepalive exception occurs?
Exactly. But doing so requires the client itself to be reconnectable, or at least to save the arguments required for reconnecting. I have a draft patch for that and will submit the pull request after finishing the testing of the retry logic for keep-alive.
Hi, has this been resolved?
When will this be fixed?
It will have to wait until next month.
You must have been quite busy lately; there has been no news for a while.
The cause is simple: keepalive is a bidirectional-streaming grpc_context. Once the connection fails, the grpc_context associated with that keepalive fails too. What you need to do is, when the exception is raised, rebuild the keepalive object so that a new grpc_context is created and requests can continue.
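As a rough sketch of that workaround: catch the exception from Check() and rebuild the KeepAlive for the same lease so that a fresh stream is opened. This assumes a KeepAlive constructor taking a client, a TTL, and an existing lease id; check etcd/KeepAlive.hpp in your version for the exact overloads.

#include <chrono>
#include <exception>
#include <iostream>
#include <memory>
#include <string>
#include <thread>

#include "etcd/Client.hpp"
#include "etcd/KeepAlive.hpp"

int main() {
  std::string endpoints =
      "http://127.0.0.1:2379,http://127.0.0.1:2389,http://127.0.0.1:2399";
  etcd::Client etcd(endpoints);

  int const ttl = 5;
  std::shared_ptr<etcd::KeepAlive> keepalive = etcd.leasekeepalive(ttl).get();
  int64_t lease_id = keepalive->Lease();

  while (true) {
    try {
      keepalive->Check();  // throws once the underlying stream has failed
    } catch (std::exception const &e) {
      std::cerr << "keepalive stream broken: " << e.what() << std::endl;
      // Rebuild the keepalive object so a new bidirectional stream (and a
      // new grpc_context) is created for the same lease.
      // Assumes the (client, ttl, lease_id) constructor; verify the exact
      // signature in etcd/KeepAlive.hpp for your version.
      keepalive = std::make_shared<etcd::KeepAlive>(etcd, ttl, lease_id);
    }
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }
  return 0;
}

Note that this only re-opens the stream: if the lease itself already expired during the outage, you would additionally need to grant a new lease and re-run the campaign.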
With a three-node etcd deployment, the client holds TCP connections to all three etcd servers, so restarting a minority of the nodes should normally not surface errors to the application layer. gRPC provides the ChannelArguments::SetServiceConfigJSON interface, which can configure gRPC's retry-on-error behavior, as in the SetServiceConfigJSON snippet shown earlier in this thread.
I would like to know whether there is a gRPC channel configuration that lets this client work reliably.
Many thanks.