3scale / APIcast

3scale API Gateway
Apache License 2.0
Reducing dependence on Backend #808

Closed y-tabata closed 5 years ago

y-tabata commented 6 years ago

Requirements

To improve robustness, we should reduce each 3scale component's dependence on the others. For example, APIcast should be able to serve API calls without Backend.

Proposals

With the caching policy in resilient mode, we can authorize or deny an API request from the cache. However, this only works when the access token is already cached, so we cannot authorize or deny a request that carries a new token, for example when a new user calls the API or a token is refreshed.

Our proposal is to authenticate API requests that carry a new token. In my understanding, authorizing or denying from the cache is a temporary authentication method. Similarly, I'd like to provide a temporary authentication method for API requests with a new token, for example by replacing authrep with token introspection + signature verification + Keycloak role check. Of course, functions that rely on Backend, such as rate limiting, analytics, and billing, become unavailable, but these functions are for the API admin. For client applications and end users, it is very beneficial that a Backend outage does not affect API authentication.

andrewdavidmackenzie commented 6 years ago

Same comment here. Requirements section mentions System, but issue is focussed on Backend. Copy and Paste I suspect :-)

y-tabata commented 6 years ago

According to this comment, is there a plan to separate authrep from the apicast policy? If so, I think it's better to proceed with this after that plan.

mikz commented 6 years ago

@y-tabata authrep has to stay in the APIcast policy. It is the core of 3scale functionality. We plan to rename "APIcast" policy to "3scale" (https://github.com/3scale/apicast/pull/809).

As mentioned in https://github.com/3scale/apicast/issues/805, the 3scale API Manager UI will always require the APIcast/3scale policy to be present, because it needs it for Analytics, Billing, Usage limits, ...

When APIcast is used standalone, with a configuration not generated by the 3scale API Manager, it can operate without the APIcast/3scale policy.

y-tabata commented 6 years ago

@mikz thanks! So this is closely related to #795. Once #795 is done, the dependency on Backend will be small, right? If so, we'd like to know the milestone for #795.

mikz commented 6 years ago

@y-tabata for the features that 3scale API management provides, the dependency on backend will be there for now. For features like the 3scale API Manager UI, Analytics, Billing etc. we need to rely on Backend. That dependency is not going away.

You'll be able to write the APIcast configuration by hand (https://github.com/3scale/apicast/issues/795) and start APIcast without connecting to 3scale Backend, but it will not respect the configuration provided in the UI, the Usage Limits, Pricing, Analytics or Applications. It will be entirely disconnected from 3scale, using only OIDC or other authentication methods provided by some other policy. I guess the main issue here is that you won't get analytics without reporting them to 3scale backend. Maybe the plan here should be to switch from authrep to report by using the 3scale batcher policy https://github.com/3scale/apicast/pull/685.

Regarding the #795 milestone: it will be an upstream-only feature for now and won't be part of the supported product in the next release. But we will likely complete the basics in the next few weeks. Our proposal in https://github.com/3scale/apicast/issues/795#issuecomment-404080552 includes several new features that will take some time to implement.

y-tabata commented 6 years ago

> Maybe the plan here should be to switch from authrep to report by using the 3scale batcher policy #685.

This is very good. As long as the 3scale Backend works well, there is no problem with APIcast working the same way as it does now. When the 3scale Backend is down, the following would be ideal:

mikz commented 6 years ago

@y-tabata you can use the "3scale caching" policy to let the traffic pass when backend is unavailable by using the 'allow' mode.

Combining that with OIDC + token introspection + keycloak role check should work as intended, no?

The only concern I have is the missing reports from the time 3scale was down. If we add 3scale batcher to the mix, it would be active all the time, not just when 3scale backend is down. Maybe we could make it active only when 3scale backend is down by using #812. But I'm not sure if that is the right approach.

If we agree that OIDC + token introspection + keycloak role check is enough for authentication and authorization even without 3scale, then we can disable 3scale authentication and authorization when 3scale backend is down by "allow" mode of the caching policy. Is that correct?

If that is right, then one remaining point is how to correctly report traffic from periods when 3scale backend is unreachable. We should define some requirements for how this should work before we start on it.

Then the last point is the rate limits. Using #760 will work only as a standalone rate limit, without integration with 3scale features like Usage Rules, Pricing Rules or Analytics.

I'd recommend using 3scale for rate limits and possibly using #760 as a fallback only when 3scale backend is unreachable.

y-tabata commented 6 years ago

> you can use the "3scale caching" policy to let the traffic pass when backend is unavailable by using the 'allow' mode.

I misunderstood the 'allow' mode. When we specify the 'allow' mode, non-cached calls are allowed too, right?

> Combining that with OIDC + token introspection + keycloak role check should work as intended, no?

Yes. It works.

> If we agree that OIDC + token introspection + keycloak role check is enough for authentication and authorization even without 3scale, then we can disable 3scale authentication and authorization when 3scale backend is down by "allow" mode of the caching policy. Is that correct?

Yes, correct.

> Then the last point is the rate limits.

Since the objective of the rate limit policy is different from that of the 3scale rate limit, I think we don't need to switch to the rate limit policy when 3scale backend is unreachable. It's the API admin's responsibility that 3scale backend is down, and only the API admin is affected when the 3scale rate limit is not working. (The end users and the system are not affected.)

> The only concern I have is the missing reports when 3scale was down. If we add 3scale batcher to the mix it would be active all the time, not just when 3scale backend is down. Maybe we could make it active only when 3scale backend is down by using the #812. But I'm not sure if that is the right approach.

I think there is no disadvantage to having 3scale batcher active all the time. When it cannot report to backend, it retries. If we want fresher reports, we can lower the value of batch_report_seconds.

mikz commented 6 years ago

> There is no disadvantage when 3scale batcher is active all the time, I think.

The disadvantage is that the reporting is done after batch_report_seconds and not immediately. The delay can be short, but there is a delay. And during that time the requests can go over the rate limit, as mentioned in the README.
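To make the size of that window concrete, here is a rough back-of-the-envelope sketch. It is not taken from the batcher policy's code: the function names are hypothetical, traffic is assumed to arrive at a constant rate, and the effect of cached authorizations (auths_ttl) is ignored.

```python
def unreported_hits(requests_per_second: int, batch_report_seconds: int) -> int:
    """Hits the gateway may accept before the next batched report is sent.

    Simplified model: constant traffic, one full batch window.
    """
    return requests_per_second * batch_report_seconds


def possible_overshoot(limit_remaining: int,
                       requests_per_second: int,
                       batch_report_seconds: int) -> int:
    """How far past the remaining limit traffic could go within one window."""
    pending = unreported_hits(requests_per_second, batch_report_seconds)
    return max(0, pending - limit_remaining)


if __name__ == "__main__":
    # Illustrative numbers only: 20 req/s with a 10 s batch window.
    print(unreported_hits(20, 10))         # 200
    print(possible_overshoot(50, 20, 10))  # 150
```

Under this model, lowering batch_report_seconds shrinks the window proportionally but never removes it.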

If we agree that OIDC + token introspection policy + Keycloak role check policy + 3scale batcher policy is the way to go, then we can focus on defining what needs to change in the 3scale batcher policy to make it "production ready".

edit:

Yes, allow mode will allow all the traffic (even non cached). https://github.com/3scale/apicast/blob/1b3955fdd0c5cc9c08513bcb4127bd26f54b1e69/gateway/src/apicast/policy/caching/apicast-policy.json#L14-L19 Only traffic that was previously denied will be denied. Unseen traffic will be let through.
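The difference between the two caching modes discussed in this thread can be summarized with a small decision sketch. It only models the behavior as described in the comments above (what happens while backend is unreachable); it is not the caching policy's actual implementation, and `CachingType`, `authorize_when_backend_down`, and the `cached_authorized` argument are hypothetical names.

```python
from enum import Enum
from typing import Optional


class CachingType(Enum):
    RESILIENT = "resilient"
    ALLOW = "allow"


def authorize_when_backend_down(mode: CachingType,
                                cached_authorized: Optional[bool]) -> bool:
    """Decide a request while backend is unreachable.

    cached_authorized: True if the credential was previously authorized,
    False if it was previously denied, None if it has never been seen.
    """
    if cached_authorized is not None:
        # Both modes reuse the cached decision for known credentials.
        return cached_authorized
    # Unseen credentials: only 'allow' lets them through.
    return mode is CachingType.ALLOW


if __name__ == "__main__":
    for mode in CachingType:
        for cached in (True, False, None):
            print(mode.value, cached, authorize_when_backend_down(mode, cached))
```

The only row where the modes differ is the unseen credential, which is exactly the new-token case raised at the start of this issue.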

y-tabata commented 6 years ago

@mikz thanks!

> The disadvantage is that the reporting is done after batch_report_seconds and not immediately. The delay can be short, but there is a delay. And during that time the requests can go over the rate limit as mentioned in the README.

Considering the rate limit, the best way is to make 3scale batcher active only when 3scale backend is down. However, OIDC + token introspection policy + Keycloak role check policy + 3scale batcher policy is a good candidate for reducing the dependency. In that case, we should define the policy chain like the following, right?

"policy_chain" : [
  { "name" : "apicast.policy.keycloak_role_check" },
  { "name" : "apicast.policy.token_introspection" },
  { "name" : "apicast.policy.3scale_batcher" },
  { "name" : "apicast.policy.apicast" },
  { "name" : "apicast.policy.caching", "configuration" : { "caching_type" : "allow" } }
]

y-tabata commented 5 years ago

With #956, we confirmed the above policy chain works well. However, we can call APIs within auths_ttl if the backend-listener is down. If the following can be achieved, we can completely reduce the dependency on backend.

y-tabata commented 5 years ago

I confirmed that when I set auths_ttl = 0, I can completely remove the dependency on backend. I'm closing this issue.
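For reference, the chain proposed earlier in the thread with this setting applied might be generated as in the sketch below. Placing auths_ttl and batch_report_seconds under the batcher's "configuration" key is an assumption, the value 10 for batch_report_seconds is purely illustrative, and `build_policy_chain` is a hypothetical helper. The configuration that token introspection and the Keycloak role check need in practice is omitted, as it was in the original snippet.

```python
import json


def build_policy_chain(auths_ttl: int, batch_report_seconds: int) -> list:
    """Return the policy chain from this thread with batcher settings filled in."""
    return [
        {"name": "apicast.policy.keycloak_role_check"},
        {"name": "apicast.policy.token_introspection"},
        {
            "name": "apicast.policy.3scale_batcher",
            # Assumed placement of the two settings named in the thread.
            "configuration": {
                "auths_ttl": auths_ttl,
                "batch_report_seconds": batch_report_seconds,
            },
        },
        {"name": "apicast.policy.apicast"},
        {
            "name": "apicast.policy.caching",
            "configuration": {"caching_type": "allow"},
        },
    ]


if __name__ == "__main__":
    chain = build_policy_chain(auths_ttl=0, batch_report_seconds=10)
    print(json.dumps({"policy_chain": chain}, indent=2))
```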