3scale / APIcast

3scale API Gateway

Reducing dependence on System #807

Closed: y-tabata closed this issue 5 years ago

y-tabata commented 6 years ago

Requirements

In order to enhance robustness, we should reduce the dependence between the 3scale components. For example, APIcast should be able to serve API calls without System.

Proposals

Regarding apicast-production, the configuration is cached for 300 seconds (the default value of "APICAST_CONFIGURATION_CACHE"). So if the system-app restarts within 300 seconds, we may be able to call APIs without receiving a 5XX. However, if the system-app doesn't come back within 300 seconds, we receive a 5XX.
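
For illustration, here is a minimal sketch of such a TTL-based cache, assuming the 300-second default. The function and variable names are hypothetical, not APIcast's actual API:

```lua
-- Minimal illustrative sketch of a TTL-based configuration cache.
-- The 300-second default mirrors APICAST_CONFIGURATION_CACHE; everything else
-- here is a made-up stand-in, not APIcast's code.

local ttl = tonumber(os.getenv('APICAST_CONFIGURATION_CACHE')) or 300

local cache = { value = nil, stored_at = 0 }

local function store(config)
  cache.value = config
  cache.stored_at = os.time()
end

local function fresh()
  return cache.value ~= nil and (os.time() - cache.stored_at) < ttl
end

-- fetch_from_system stands in for the HTTP call that loads the proxy config
local function get(fetch_from_system)
  if fresh() then return cache.value end

  local ok, config = pcall(fetch_from_system)
  if ok and config then
    store(config)
    return config
  end

  -- TTL expired and System is unreachable: nothing left to serve from
  return nil, 'configuration expired and System unreachable'
end
```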

Our proposal is to keep the configuration cache and use it for as long as APIcast receives 5XX from System. When APIcast receives a 5XX from System, it lets the API request go through using the cached configuration, while polling at a regular interval to check whether System has recovered.
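
To make the intent concrete, here is a minimal sketch of the fallback-and-poll behavior. The names and the poll interval are assumptions for illustration, not APIcast's actual API:

```lua
-- Illustrative sketch of the proposal: serve from the cached configuration
-- whenever System answers with a 5XX, and retry System on a fixed interval.

local POLL_INTERVAL = 60 -- seconds between recovery checks (assumed value)

local stale_config  -- last configuration fetched successfully
local last_poll = 0

-- fetch_from_system is expected to return an HTTP status and, on success, the config
local function load_configuration(fetch_from_system)
  local now = os.time()

  if not stale_config or now - last_poll >= POLL_INTERVAL then
    last_poll = now
    local status, config = fetch_from_system()

    if status and status < 500 and config then
      stale_config = config -- System answered: refresh the cache
    end
    -- on a 5XX, keep serving whatever we already had
  end

  return stale_config -- may still be nil if System never answered
end
```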

andrewdavidmackenzie commented 6 years ago

I suggest that we keep the body of this discussion focused on "dependence on System", as the title says.

The body of the issue adds "Backend", which is a completely different dependency for different reasons, and I think it is best discussed in a separate thread.

If you remove the reference to Backend, I can remove this comment and keep the history clean.

y-tabata commented 6 years ago

Maybe this has already been achieved, right? In my environment, when I scaled the system-app from 1 to 0, we could still receive 200 from apicast-production after more than 300 seconds.

mikz commented 6 years ago

@y-tabata we haven't made any changes to this in quite some time.

The Configuration Store is responsible for storing service configurations and will use stale configuration by default: https://github.com/3scale/apicast/blob/0fa2ed46f98d1eea7b65e1f52fc9b8709ef934ae/gateway/src/apicast/configuration_store.lua#L65-L82
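
Roughly, the idea is the following. This is a condensed, illustrative stand-in, not the linked code itself:

```lua
-- Condensed sketch of "return stale entries by default". The real logic lives
-- in gateway/src/apicast/configuration_store.lua; this is a simplified stand-in.

local Store = {}
Store.__index = Store

function Store.new(ttl)
  return setmetatable({ ttl = ttl or 300, entries = {} }, Store)
end

function Store:set(host, config)
  self.entries[host] = { config = config, stored_at = os.time() }
end

-- `stale` defaults to true: an expired entry is still returned instead of nil,
-- so the gateway keeps routing even when it cannot refresh from System.
function Store:find_by_host(host, stale)
  if stale == nil then stale = true end

  local entry = self.entries[host]
  if not entry then return nil end

  local expired = os.time() - entry.stored_at >= self.ttl
  if expired and not stale then return nil end

  return entry.config
end

return Store
```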

andrewdavidmackenzie commented 6 years ago

APIcast only contacts System to load the proxy configuration.

When it does that is configurable, and the default differs between production and staging.

So, once configured, System is not contacted again.

I believe the config options are documented somewhere; if not, they should be.
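
For reference, the relevant knobs are, as far as I know, APICAST_CONFIGURATION_LOADER ('boot' vs. 'lazy') and APICAST_CONFIGURATION_CACHE. A purely illustrative sketch of how the loader mode affects when System is contacted (not APIcast's code):

```lua
-- Illustrative only: roughly how the loader mode decides when System is contacted.
-- The environment variable names are real; the functions are made-up stand-ins.

local loader = os.getenv('APICAST_CONFIGURATION_LOADER') or 'lazy'

local function load_on_boot(fetch_from_system)
  if loader == 'boot' then
    return fetch_from_system() -- load everything once at startup
  end
  return nil -- 'lazy': defer until the first request needs a configuration
end
```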

y-tabata commented 6 years ago

@mikz @andrewdavidmackenzie thanks!

The "APICAST_CONFIGURATION_CACHE" is written here and here.

However, there seems to be no description of how the gateway behaves when it cannot get the configuration.

I'm not sure the above is the proper place to describe that behavior, and I'm not sure we need to document this specific corner case.

mikz commented 6 years ago

@y-tabata we definitely should have an integration test to verify that it retains the stale configuration when the configuration endpoint is down.

We have a unit test to verify that the store returns stale records by default: https://github.com/3scale/apicast/blob/123a1705aee504813503bd739cf0ed48cacc23c9/spec/configuration_store_spec.lua#L142-L151

But we should have a high-level integration test to verify that everything still works when combined.
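
Something along these lines, as a busted-style sketch. The store here is a local stand-in with the semantics described above, not the real configuration_store module, and a proper integration test would exercise the whole gateway rather than a unit like this:

```lua
-- Hypothetical busted-style spec (run with the `busted` test runner).

describe('stale configuration fallback', function()
  local function new_store(ttl)
    local entries = {}
    return {
      set = function(_, host, config)
        entries[host] = { config = config, stored_at = os.time() }
      end,
      find_by_host = function(_, host, stale)
        if stale == nil then stale = true end
        local entry = entries[host]
        if not entry then return nil end
        local expired = os.time() - entry.stored_at >= ttl
        if expired and not stale then return nil end
        return entry.config
      end,
    }
  end

  it('still returns the cached service when the entry has expired', function()
    local store = new_store(0) -- ttl 0: every entry is stale immediately

    store:set('example.com', { service_id = 42 })

    assert.truthy(store:find_by_host('example.com'))        -- stale allowed by default
    assert.is_nil(store:find_by_host('example.com', false)) -- fresh-only lookup misses
  end)
end)
```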

Also, it is entirely possible that you hit some edge case and that in some conditions it doesn't work as expected. It might have been related to the number of services, how long the gateway had been running, etc. If you can reproduce it, I'm more than happy to write a test for it and fix it.

y-tabata commented 5 years ago

I confirmed this works well, so I'm closing this issue.