zalando / nakadi

A distributed event bus that implements a RESTful API abstraction on top of Kafka-like queues
https://nakadi.io
MIT License
952 stars 292 forks source link

allow fallback to remote on IO error #1537

Closed samizzy closed 1 year ago

samizzy commented 1 year ago

Description

In Nakadi, there is logic to retry load token info first from local token and then fallback to remote token info on failure. This logic does not account for when the local token info fails with timeout/IO exception. This PR introduces fix so that it accounts for IO exception and retries remote token info.

Review

adyach commented 1 year ago

👎 lets find other way to achieve the same

samizzy commented 1 year ago

@adyach Open to suggestions, do you have something in mind? Just for context, the interface TokenInfoRequestExecutor (snippet below) doesnt allow to throw exceptions and every implementation class returns the error with key error. So there is no possibility to catch IO exceptions as such. The caller of this method getMap is TokenInfoResourceServerTokenServices#loadAuthentication which throws all error messages with InvalidTokenException

public interface TokenInfoRequestExecutor {

    Map<String, Object> getMap(final String accessToken);

}
adyach commented 1 year ago

from the quick look there are many options: 1) TokenInfoResourceServerTokenServices is a part of the Zalando libraries, it could be a way to improve overall 2) You can always extend TokenInfoResourceServerTokenServices and change the logic (yes, some duplication of the code) 3) There is also extension NakadiTokenInfoRequestExecutor which overrides getMap, if there is not other option I would go with adding more sounding error description to catch. ...

adyach commented 1 year ago

and to the why I think we should do it in the other way:

  1. the error message can change in the future, and the team maintaining Nakadi will never know that part of code does not work anymore.
  2. IO error can be due to different reasons, one of them is timeout. It potentially servers for other errors which you are not aware of. This will increase the load on other endpoints.
  3. This code snippet to check the error does not look like production grade code, it is more of the hack or test code, which I expect to go away after some testing
if(!ex.getMessage().contains("I/O error")) {
     throw ex;
 }
samizzy commented 1 year ago

from the quick look there are many options:

  1. TokenInfoResourceServerTokenServices is a part of the Zalando libraries, it could be a way to improve overall
  2. You can always extend TokenInfoResourceServerTokenServices and change the logic (yes, some duplication of the code)
  3. There is also extension NakadiTokenInfoRequestExecutor which overrides getMap, if there is not other option I would go with adding more sounding error description to catch. ...

I have gone with suggestion 3 here, although its still catching general IOException.

adyach commented 1 year ago

some minor comments to fix otherwise better than before 👍

adyach commented 1 year ago

👍

samizzy commented 1 year ago

:+1: