sofastack / sofa-rpc

SOFARPC is a high-performance, high-extensibility, production-level Java RPC framework.
https://www.sofastack.tech/sofa-rpc/docs/Home
Apache License 2.0
3.81k stars 1.16k forks source link

rpc 调用超时没有丢弃reponse #1410

Open zonghaishang opened 2 months ago

zonghaishang commented 2 months ago

Describe the bug

A clear and concise description of what the bug is.

https://github.com/sofastack/sofa-rpc/blob/bd25a4e894fc7a1fde00ba15699ea3ee12aa8c16/remoting/remoting-bolt/src/main/java/com/alipay/sofa/rpc/server/bolt/BoltServerProcessor.java#L180

    @Override
    public void handleRequest(BizContext bizCtx, AsyncContext asyncCtx, SofaRequest request) {
                   // ... before code
                    response = doInvoke(serviceName, invoker, request);
                    if (bizCtx.isRequestTimeout()) { // 加上丢弃超时的响应的逻辑
                        throwable = clientTimeoutWhenSendResponse(appName, serviceName, bizCtx.getRemoteAddress());
                        break invoke;
                    }

            //  !!!! Bad, 超时的情况下, response 不为空也会导致发送给client,
            //  没有达到丢弃的效果...
            if (response != null) {

                try {
                        asyncCtx.sendResponse(response);
                    } finally {
                        if (EventBus.isEnable(ServerSendEvent.class)) {
                            EventBus.post(new ServerSendEvent(request, response, throwable));
                        }
                    }
            }

Expected behavior

throwable = clientTimeoutWhenSendResponse(appName, serviceName, bizCtx.getRemoteAddress());
// append response

// todo setup  reponse to null ? or  err message ?
response = MessageBuilder.buildSofaErrorResponse(throwable.getMessage());

Actual behavior

Steps to reproduce

Minimal yet complete reproducer code (or GitHub URL to code)

Environment

OrezzerO commented 2 months ago

Agree! How about submit a pr to fix it @zonghaishang

seeflood commented 2 months ago

这个有人修么,没人修的话我来?正好最近没事干

EvenLjj commented 2 months ago

在多笔请求合并时,bolt 层会共用同一个RemotingContext,到rpc侧处理不同的请求时,根据bizCtx.isRequestTimeout() 这种判断方式去判定超时不一定准确。如果某个接口设置比较小的超时时间,和其他请求合并处理时,会造成预期之外的超时问题。 @seeflood 修这个问题的时候,一并看下怎么处理这块吧。 image

chuailiwu commented 2 months ago

在多笔请求合并时和这个超时检查应该是冲突的 可以做一个修复逻辑,即如果是多笔请求合并,让超时检查失效 image

另看代码里还留了一个口子,com.alipay.remoting.codec.AbstractBatchDecoder#setSingleDecode 控制只返回一个msg,可以一起评估下怎么来提供这个能力比较好

seeflood commented 2 months ago

我不太了解“多笔请求合并”场景的已有逻辑,我确认下: 看上面的讨论,意思是期望改成啥样:合并请求场景,不做超时检查? 不做超时检查的话有点奇怪啊,那业务处理耗时太长咋办……

具体修改方法:是不是在 bolt 里,判断 msg 是 List后,直接 setTimeout(0) 就行? 设置为0后,后面的超时检查就失效了(永远不会超时) image

或者哪位老哥明天语音聊下?

EvenLjj commented 2 months ago

@seeflood 这个并发问题在 bolt 1.6.8 版本修复了,https://github.com/sofastack/sofa-bolt/pull/347

stale[bot] commented 1 day ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.