alibaba / Sentinel

A powerful flow control component enabling reliability, resilience and monitoring for microservices. (A high-availability flow control and protection component for cloud-native microservices)
https://sentinelguard.io/
Apache License 2.0
22.32k stars, 8k forks

Downgrade function extension discussion #868

Open linlinisme opened 5 years ago

linlinisme commented 5 years ago

Issue Description

Type: feature request

Describe what happened (or what feature you want)

Currently, Sentinel's degrade logic is as follows:

public static void trace(Throwable e): records a business exception (not a BlockException).
public static void trace(Throwable e, int count): records a business exception, where count is the number of exceptions to add to the statistics.
    public boolean passCheck(Context context, DefaultNode node, int acquireCount, Object... args) {
        if (cut.get()) {
            return false;
        }

        ClusterNode clusterNode = ClusterBuilderSlot.getClusterNode(this.getResource());
        if (clusterNode == null) {
            return true;
        }

        if (grade == RuleConstant.DEGRADE_GRADE_RT) {
            double rt = clusterNode.avgRt();
            if (rt < this.count) {
                passCount.set(0);
                return true;
            }

            // Sentinel will degrade the service only if count exceeds.
            if (passCount.incrementAndGet() < rtSlowRequestAmount) {
                return true;
            }
        } else if (grade == RuleConstant.DEGRADE_GRADE_EXCEPTION_RATIO) {
            double exception = clusterNode.exceptionQps();
            double success = clusterNode.successQps();
            double total = clusterNode.totalQps();
            // If total amount is less than minRequestAmount, the request will pass.
            if (total < minRequestAmount) {
                return true;
            }

            // In the same aligned statistic time window,
            // "success" (aka. completed count) = exception count + non-exception count (realSuccess)
            double realSuccess = success - exception;
            if (realSuccess <= 0 && exception < minRequestAmount) {
                return true;
            }

            if (exception / success < count) {
                return true;
            }
        } else if (grade == RuleConstant.DEGRADE_GRADE_EXCEPTION_COUNT) {
            double exception = clusterNode.totalException();
            if (exception < count) {
                return true;
            }
        }

        if (cut.compareAndSet(false, true)) {
            ResetTask resetTask = new ResetTask(this);
            pool.schedule(resetTask, timeWindow, TimeUnit.SECONDS);
        }

        return false;
    }
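
To make the exception-ratio branch easier to follow, here is a minimal standalone restatement of it with plain numbers. The class name `ExceptionRatioCheck` and its parameter names are hypothetical and exist only for illustration; the logic simply mirrors the three checks in the snippet above and is not Sentinel's own class.

```java
/** Standalone restatement of the exception-ratio branch quoted above (hypothetical helper). */
final class ExceptionRatioCheck {
    private ExceptionRatioCheck() {}

    /** Returns true when the request may pass, false when the resource would be degraded. */
    static boolean passes(double exception, double success, double total,
                          double minRequestAmount, double ratioThreshold) {
        // Too little traffic in the window: never degrade.
        if (total < minRequestAmount) {
            return true;
        }
        // "success" here means completed requests, i.e. exceptions + real successes.
        double realSuccess = success - exception;
        if (realSuccess <= 0 && exception < minRequestAmount) {
            return true;
        }
        // Pass only while the exception ratio stays below the configured threshold.
        return exception / success < ratioThreshold;
    }
}
```

For example, with 10 completed requests of which 6 threw, a total of 12, a minimum request amount of 5 and a threshold of 0.5, the ratio is 0.6 and the check fails. Note that every exception contributes equally to that ratio, regardless of its type.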

Has anybody noticed that the exception statistics do not distinguish between exception types? In practice, exceptions fall into several categories. The first is not a real failure: it only signals a business outcome, for example a parameter validation failure or an incorrect business data state. Exceptions like these should not trigger degradation. The second is the ordinary kind, such as a timeout or a system error, which can reasonably count toward degradation. The third is a fatal exception, one indicating that the business cannot recover for a long time. Once such an exception occurs, we should degrade immediately rather than wait for exceptions to accumulate to a threshold.
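
One way to picture the three categories is a small classifier that maps exception types to a severity. This is only a sketch: `ExceptionSeverity` and `ExceptionClassifier` are hypothetical names, not part of Sentinel, and which exception types belong in which category is an assumption each application would have to make for itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical severity levels for the three categories described above. */
enum ExceptionSeverity { IGNORE, COUNT, FATAL }

/** Hypothetical classifier; not part of Sentinel. */
final class ExceptionClassifier {
    // Insertion-ordered so that earlier registrations take precedence.
    private final Map<Class<? extends Throwable>, ExceptionSeverity> rules = new LinkedHashMap<>();

    ExceptionClassifier register(Class<? extends Throwable> type, ExceptionSeverity severity) {
        rules.put(type, severity);
        return this;
    }

    /** The first registered type that matches wins; unknown exceptions are counted as usual. */
    ExceptionSeverity classify(Throwable t) {
        for (Map.Entry<Class<? extends Throwable>, ExceptionSeverity> rule : rules.entrySet()) {
            if (rule.getKey().isInstance(t)) {
                return rule.getValue();
            }
        }
        return ExceptionSeverity.COUNT;
    }
}
```

A caller could then invoke the `trace(Throwable e)` method quoted earlier only when the result is `COUNT`, skip it for `IGNORE`, and treat `FATAL` as a signal to degrade at once.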

Describe what you expected to happen

So I hope Sentinel can provide more fine-grained management of the exceptions that trigger degradation.
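
As a rough illustration of what such management might look like, the sketch below trips immediately on a fatal exception, counts ordinary ones against a threshold, and ignores business-outcome exceptions. `Impact` and `ImpactAwareBreaker` are hypothetical names invented for this example; the sketch omits time windows and recovery, and it is not how Sentinel implements degradation.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical impact levels; not part of Sentinel. */
enum Impact { IGNORED, COUNTED, FATAL }

/** Hypothetical breaker that weighs exceptions by impact instead of counting them all equally. */
final class ImpactAwareBreaker {
    private final int threshold;
    private final AtomicInteger counted = new AtomicInteger();
    private final AtomicBoolean open = new AtomicBoolean(false);

    ImpactAwareBreaker(int threshold) {
        this.threshold = threshold;
    }

    void record(Impact impact) {
        if (impact == Impact.FATAL) {
            // Fatal exceptions degrade at once, without accumulating.
            open.set(true);
        } else if (impact == Impact.COUNTED && counted.incrementAndGet() >= threshold) {
            // Ordinary exceptions degrade only once the threshold is reached.
            open.set(true);
        }
        // IGNORED exceptions never affect the breaker state.
    }

    /** True when the resource is degraded and requests should be rejected. */
    boolean isOpen() {
        return open.get();
    }
}
```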

linlinisme commented 5 years ago

It may also be necessary to add an entry point for manually forcing degradation on or off.
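
A manual override of this kind could be sketched as a switch that sits in front of the automatic check. `OverrideMode` and `ManualDegradeSwitch` are hypothetical names used only for illustration, and the automatic check is passed in as a plain `BooleanSupplier` so the example does not depend on any Sentinel class.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

/** Hypothetical override modes; not part of Sentinel. */
enum OverrideMode { AUTO, FORCE_DEGRADE, FORCE_PASS }

/** Hypothetical manual switch layered over an automatic degrade check. */
final class ManualDegradeSwitch {
    private final AtomicReference<OverrideMode> mode = new AtomicReference<>(OverrideMode.AUTO);

    void setMode(OverrideMode newMode) {
        mode.set(newMode);
    }

    /** Returns true when the request may pass. */
    boolean passCheck(BooleanSupplier automaticCheck) {
        switch (mode.get()) {
            case FORCE_DEGRADE:
                return false;                         // operator forced degradation on
            case FORCE_PASS:
                return true;                          // operator forced degradation off
            default:
                return automaticCheck.getAsBoolean(); // fall back to the rule-based check
        }
    }
}
```

In `AUTO` mode the switch is transparent, so the existing rule-based behavior is unchanged until an operator sets one of the forced modes.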

sczyh30 commented 5 years ago

Thanks for your suggestion. There are already related issues, #606 and #790; maybe we can continue the discussion there.