Closed ryan4yin closed 1 year ago
The unit of the upstream timeout is seconds, so the minimal timeout for the primary server is 1 second. Every request gets stuck for a full second before falling back to the fallback server, which is too long under high traffic; it really hurts if the primary system goes down completely!
We could change the unit of the upstream timeout to milliseconds to fix this. It should be a simple change.
@shreemaan-abhishek I'm really looking forward to this ❤️
I just checked the code and realised that the minimal timeout only needs to be greater than zero.
I.e. if you provide the timeout as `{"connect": 0.1, "send": 0.1, "read": 0.1}`,
the effective timeout duration would be 100ms.
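In other words, sub-second timeouts are already possible by passing fractional seconds. A minimal sketch of doing this through the Admin API (the address, admin key, and upstream id below are placeholders, and a `PATCH` partial update is assumed to be available in your APISIX version):

```bash
# Sketch: lower the timeouts of an existing upstream to 100ms each.
curl http://127.0.0.1:9180/apisix/admin/upstreams/1 \
  -H 'X-API-KEY: <your-admin-key>' -X PATCH -d '
{
  "timeout": {"connect": 0.1, "send": 0.1, "read": 0.1}
}'
```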
@shreemaan-abhishek Ok, maybe I got it wrong. I'll take the time to confirm that.
@ryan4yin do you have any further updates/questions? If not please close this issue. Thanks.
Description
Our usage scenario is that we want to use APISIX to handle the iteration from the old system to a new one.
Because the new system may have performance or stability problems after running for a long time, to ensure the availability of the whole system we implemented a deployment in which APISIX passes requests to the new system by default and uses the old system as a fallback server.
Considering that others might have the same need, I created this issue to record it and to discuss the possibility of adding it to APISIX's FAQ.
related to:
@tzssangglass helped me to implement this feature, thanks again!
How to implement this
The whole workaround is described below.
First, create an upstream and set the old system's `priority` to `-1`, so that the old system is marked as a backup server: it will receive requests only when the primary servers (the new system) are unavailable. A sketch of such an upstream is shown below.
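For illustration, here is a minimal sketch of such an upstream created through the Admin API; the address, admin key, upstream id, and node hostnames are placeholders, not our real values:

```bash
# Sketch: nodes with higher priority are tried first; the negative priority
# marks the old system as a backup that only receives traffic when the
# new system's nodes are unavailable.
curl http://127.0.0.1:9180/apisix/admin/upstreams/1 \
  -H 'X-API-KEY: <your-admin-key>' -X PUT -d '
{
  "type": "roundrobin",
  "nodes": [
    {"host": "new-system.example.com", "port": 8080, "weight": 1, "priority": 0},
    {"host": "old-system.example.com", "port": 8080, "weight": 1, "priority": -1}
  ]
}'
```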
Then we need to define in which scenarios we consider the new system unavailable, so that requests are passed on to the old system. To achieve this goal, we add configuration to APISIX's `config.yaml`; see the sketch below.
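The exact snippet we used is not reproduced here. As an illustrative sketch only, assuming your APISIX version exposes `nginx_config.http_server_configuration_snippet`, one way to make APISIX retry the next node (i.e. fall back to the backup) on 5xx responses rather than only on connection errors and timeouts is to extend Nginx's `proxy_next_upstream` conditions:

```yaml
# config.yaml (sketch): inject extra Nginx directives into the generated server block
nginx_config:
  http_server_configuration_snippet: |
    # also retry the next upstream node when the response status is 5xx
    proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
```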
With these two configurations, we can implement the fallback feature described above using APISIX. Generally, if there is a problem with the new system, the retry mechanism we configured here is triggered, so requests are still processed properly and no users are affected.
Drawbacks
This implementation is really helpful for me, but there are also some drawbacks:

- The fallback relies on APISIX's retry mechanism, which retries the upstream nodes in order based on the configured `response_status` conditions.
- The upstream timeout can only be set in whole seconds, so the minimal timeout is 1 second; it would be better if values such as `100ms` were allowed, just like the timeout parameters in `proxy-mirror`:
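For reference, a sketch of what those proxy-mirror timeout parameters look like in `config.yaml`, assuming a recent APISIX where they are configured under `plugin_attr` (the values here are illustrative):

```yaml
plugin_attr:
  proxy-mirror:
    timeout:        # mirrored sub-request timeouts accept millisecond values
      connect: 200ms
      read: 200ms
      send: 200ms
```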