chaoyoung closed this issue 6 months ago
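For now I have worked around this problem by removing the backend's inbound HTTP traffic from the outside before the application shuts down.
Configure graceful shutdown in the Spring Boot application.yml: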
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
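To make the effect of these two settings concrete, here is a minimal plain-Java sketch of the "stop accepting new work, then wait up to a timeout" pattern. It is only an illustration: `GracefulDrain`, `drain` and `demo` are hypothetical names, and this is not how Spring Boot implements graceful shutdown internally.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
class GracefulDrain {

    /** Stops accepting new tasks and waits up to the timeout; true if everything finished in time. */
    static boolean drain(ExecutorService pool, long timeoutMillis) {
        pool.shutdown(); // new tasks are rejected, running tasks continue
        try {
            if (pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow(); // timeout elapsed (or we were interrupted): interrupt what is left
        return false;
    }

    /** Submits one task that runs for taskMillis, then drains with the given timeout. */
    static boolean demo(long taskMillis, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.submit(() -> {
            try {
                Thread.sleep(taskMillis); // stands in for an in-flight request
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return drain(pool, timeoutMillis);
    }
}
```

A short task drains cleanly (`demo(50, 2000)` returns true), while a task that outlives the timeout is cut off (`demo(2000, 50)` returns false), which is the situation the 30s limit is meant to bound.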
The application exposes an endpoint that takes the instance offline:
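import com.alibaba.cloud.nacos.registry.NacosAutoServiceRegistration;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequiredArgsConstructor
@Slf4j
public class GracefulShutdownController {

    private final NacosAutoServiceRegistration nacosAutoServiceRegistration;

    // Deregisters this instance from Nacos so that no new traffic is routed to it.
    @PostMapping("/instance/deregister")
    public void deregisterInstance() {
        nacosAutoServiceRegistration.stop();
        log.info("instance deregistered from nacos.");
    }
}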
A k8s lifecycle.preStop hook runs a curl command against that endpoint, so inbound HTTP traffic is removed first. Spring and Dubbo then shut down gracefully, waiting at most 30s.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xxx-backend
  namespace: default
  labels:
    app: xxx-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: xxx-backend
  template:
    metadata:
      labels:
        app: xxx-backend
    spec:
      containers:
        - name: xxx-backend
          image: registry.cn-hangzhou.aliyuncs.com/namespace/xxx-backend:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "curl -X POST localhost:8080/instance/deregister"]
      terminationGracePeriodSeconds: 30
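One timing detail is worth checking here. In Kubernetes the termination grace period starts counting before the preStop hook runs, so the hook and the application's own shutdown share the same 30s. The sketch below does that arithmetic; `ShutdownBudget` and its methods are hypothetical names, and the 2s hook duration in the usage note is an assumed figure, not a measurement from this issue.

```java
// Hypothetical helper, for illustration only.
class ShutdownBudget {

    /** Seconds left for the application's own shutdown once the preStop hook has finished. */
    static long remainingSeconds(long gracePeriodSeconds, long preStopSeconds) {
        return Math.max(0, gracePeriodSeconds - preStopSeconds);
    }

    /** True if the configured shutdown phase timeout still fits into what is left. */
    static boolean fits(long gracePeriodSeconds, long preStopSeconds, long shutdownPhaseSeconds) {
        return remainingSeconds(gracePeriodSeconds, preStopSeconds) >= shutdownPhaseSeconds;
    }
}
```

With the values above and an assumed 2s preStop hook, `fits(30, 2, 30)` is false: only 28s remain for a shutdown phase that may take up to 30s. Setting terminationGracePeriodSeconds somewhat above timeout-per-shutdown-phase (for example `fits(40, 2, 30)`, which is true) leaves room for both.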
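Because Spring Cloud Gateway is used as the gateway for the backend microservices, the Gateway also needs to clear its cached instance list when an instance goes offline.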
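// Assumed: SERVICE_INSTANCE_CACHE_NAME is the constant from Spring Cloud LoadBalancer.
import static org.springframework.cloud.loadbalancer.core.CachingServiceInstanceListSupplier.SERVICE_INSTANCE_CACHE_NAME;

import com.alibaba.nacos.client.naming.event.InstancesChangeEvent;
import com.alibaba.nacos.common.notify.NotifyCenter;
import com.alibaba.nacos.common.notify.listener.Subscriber;
import com.alibaba.nacos.common.utils.JacksonUtils;
import javax.annotation.PostConstruct; // jakarta.annotation on Spring Boot 3.x
import javax.annotation.Resource;      // jakarta.annotation on Spring Boot 3.x
import lombok.extern.slf4j.Slf4j;
import org.springframework.cache.Cache;
import org.springframework.cache.CacheManager;
import org.springframework.stereotype.Component;

/**
 * Subscribes to Nacos notifications and receives the instance
 * online/offline events that Nacos pushes for the microservices.
 */
@Component
@Slf4j
public class NacosInstancesChangeEventListener extends Subscriber<InstancesChangeEvent> {

    @Resource
    private CacheManager caffeineLoadBalancerCacheManager;

    @PostConstruct
    public void registerToNotifyCenter() {
        NotifyCenter.registerSubscriber(this);
    }

    @Override
    public void onEvent(InstancesChangeEvent event) {
        log.info("Spring Gateway received instance change event: {}, refreshing cache", JacksonUtils.toJson(event));
        Cache cache = caffeineLoadBalancerCacheManager.getCache(SERVICE_INSTANCE_CACHE_NAME);
        if (cache != null) {
            // Drop the cached instance list so the next request reloads it from Nacos.
            cache.evict(event.getServiceName());
        }
        log.info("Spring Gateway instance cache refreshed");
    }

    @Override
    public Class<? extends com.alibaba.nacos.common.notify.Event> subscribeType() {
        return InstancesChangeEvent.class;
    }
}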
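Why the eviction matters can be shown with a small stand-alone sketch. It uses a plain map instead of the real load balancer cache, so `InstanceCache`, `onInstancesChanged` and the IP addresses are hypothetical and only illustrate the idea: until the entry is evicted, the gateway keeps routing to the stale instance list.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for the gateway's cached service instance list.
class InstanceCache {

    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> registryLookup;

    InstanceCache(Function<String, List<String>> registryLookup) {
        this.registryLookup = registryLookup;
    }

    /** Returns the cached instances, loading them from the registry on a cache miss. */
    List<String> get(String serviceName) {
        return cache.computeIfAbsent(serviceName, registryLookup);
    }

    /** Called when the registry reports a change: forget the cached list. */
    void onInstancesChanged(String serviceName) {
        cache.remove(serviceName);
    }

    /** Shows the instance count seen before and after eviction. */
    static String demo() {
        Map<String, List<String>> registry = new ConcurrentHashMap<>();
        registry.put("xxx-backend", List.of("10.0.0.1:8080", "10.0.0.2:8080"));
        InstanceCache gateway = new InstanceCache(registry::get);
        gateway.get("xxx-backend");                              // warm the cache with two instances
        registry.put("xxx-backend", List.of("10.0.0.2:8080"));   // one instance goes offline
        int stale = gateway.get("xxx-backend").size();           // still served from the cache
        gateway.onInstancesChanged("xxx-backend");               // eviction on the change event
        int fresh = gateway.get("xxx-backend").size();           // reloaded from the registry
        return stale + "->" + fresh;
    }
}
```

`InstanceCache.demo()` returns `"2->1"`: the offline instance is still visible until the change event evicts the entry.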
In testing, calling the Nacos API directly from the preStop hook:
curl -X PUT "http://nacos-headless.nacos.svc.cluster.local:8848/nacos/v1/ns/instance?serviceName=xxx-backend&namespaceId=&clusterName=DEFAULT&groupName=DEFAULT_GROUP&ip=$POD_IP&port=8080&ephemeral=true&enabled=false&username=$NACOS_USERNAME&password=$NACOS_PASSWORD"
also takes the service offline, which avoids having to add a separate deregistration endpoint to every microservice.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xxx-backend
  namespace: default
  labels:
    app: xxx-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: xxx-backend
  template:
    metadata:
      labels:
        app: xxx-backend
    spec:
      containers:
        - name: xxx-backend
          image: registry.cn-hangzhou.aliyuncs.com/namespace/xxx-backend:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          env:
            - name: NACOS_SERVER_ADDR
              valueFrom:
                configMapKeyRef:
                  name: common-cm
                  key: nacos.server-addr
            - name: NACOS_USERNAME
              valueFrom:
                configMapKeyRef:
                  name: common-cm
                  key: nacos.username
            - name: NACOS_PASSWORD
              valueFrom:
                configMapKeyRef:
                  name: common-cm
                  key: nacos.password
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          lifecycle:
            preStop:
              exec:
                command:
                  - bash
                  - -c
                  - curl -X PUT "http://nacos-headless.nacos.svc.cluster.local:8848/nacos/v1/ns/instance?serviceName=xxx-backend&namespaceId=&clusterName=DEFAULT&groupName=DEFAULT_GROUP&ip=$POD_IP&port=8080&ephemeral=true&enabled=false&username=$NACOS_USERNAME&password=$NACOS_PASSWORD"
      terminationGracePeriodSeconds: 30
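The curl command interpolates the username and password into the query string as-is, so a value containing `&`, `=` or a space would corrupt the request. As a hedged sketch, the same URL can be assembled with each value URL-encoded. `NacosDisableUrl` and `build` are hypothetical names; the path and parameter names are copied from the curl command above, and the host and credentials in the usage note are made-up sample values.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
class NacosDisableUrl {

    /** Builds the "disable this instance" URL with every parameter value URL-encoded. */
    static String build(String nacosAddr, String serviceName, String ip, int port,
                        String username, String password) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps the parameter order stable
        params.put("serviceName", serviceName);
        params.put("namespaceId", "");
        params.put("clusterName", "DEFAULT");
        params.put("groupName", "DEFAULT_GROUP");
        params.put("ip", ip);
        params.put("port", String.valueOf(port));
        params.put("ephemeral", "true");
        params.put("enabled", "false");
        params.put("username", username);
        params.put("password", password);

        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return "http://" + nacosAddr + "/nacos/v1/ns/instance?" + query;
    }
}
```

For example, `build("nacos:8848", "xxx-backend", "10.0.0.5", 8080, "nacos", "p&ss")` ends in `username=nacos&password=p%26ss`, so the `&` inside the password no longer splits the query. In the shell version, a similar effect could be had with curl's `--data-urlencode` option if the server accepts the parameters in the request body, which I have not verified for this Nacos endpoint.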
No news is good news. Please feel free to create a new issue if you have any question.
Environment
Steps to reproduce this issue
Expected Behavior
The Dubbo consumer should shut down only after the Spring web container has shut down; only then can the shutdown be lossless for traffic.
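The ordering asked for here can be sketched in plain Java. This is only a model of the desired behavior, not of how Dubbo's shutdown hook or Spring's lifecycle processing is actually wired: `OrderedShutdown`, the component names and the phase numbers are all hypothetical. It follows the convention Spring uses for `SmartLifecycle`, where the component with the higher phase is stopped first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of phase-ordered shutdown, for illustration only.
class OrderedShutdown {

    private static final class Component {
        final String name;
        final int phase;

        Component(String name, int phase) {
            this.name = name;
            this.phase = phase;
        }
    }

    private final List<Component> components = new ArrayList<>();

    OrderedShutdown register(String name, int phase) {
        components.add(new Component(name, phase));
        return this;
    }

    /** Stops components from the highest phase to the lowest and returns the order used. */
    List<String> stopAll() {
        List<Component> sorted = new ArrayList<>(components);
        sorted.sort(Comparator.comparingInt((Component c) -> c.phase).reversed());
        List<String> order = new ArrayList<>();
        for (Component c : sorted) {
            order.add(c.name); // a real implementation would call the component's stop() here
        }
        return order;
    }

    static String demo() {
        return String.join(" -> ", new OrderedShutdown()
                .register("dubbo-consumer", 0)   // needed by in-flight web requests, so stopped last
                .register("web-server", 100)     // drains in-flight HTTP requests first
                .stopAll());
    }
}
```

`OrderedShutdown.demo()` returns `"web-server -> dubbo-consumer"`: requests still being drained by the web server can keep calling Dubbo until the web server has finished.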
Actual Behavior
Calls from the Dubbo consumer fail with: Directory of type RegistryDirectory already destroyed for service xxx.
I am not sure whether it is related to this log line:
[DubboShutdownHook] org.apache.dubbo.config.DubboShutdownHook Line:130 - [DUBBO] Dubbo wait for application(Dubbo Application[1.1](crm-dubbo)) managed by Spring to be shutdown failed, time usage: 33ms