apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0
14.23k stars 3.58k forks source link

PIP 97: Asynchronous Authentication Provider #12105

Open michaeljmarshall opened 3 years ago

michaeljmarshall commented 3 years ago

Motivation

Pulsar's current AuthenticationProvider interface only exposes synchronous methods for authenticating a connection. To date, this has been sufficient because we do not have any providers that rely on network calls. However, in looking at the OAuth2.0 spec, there are some cases where network calls are necessary to verify a token. As a prerequisite to adding an OAuth2.0 AuthenticationProvider (or any other provider that relies on possibly blocking behavior), I propose the addition of asynchronous methods to the AuthenticationProvider interface and the AuthenticationState interface.

Goal

Update authentication provider framework to support more authentication protocols (like OAuth2.0 or OpenID Connect) while providing a way to ensure that implementations do not block IO threads.

Target version: ~2.10~ 2.12

API Changes

In order to prevent confusion, the old authenticate methods are deprecated. Three interfaces will need to be updated for this PIP: AuthenticationProvider, AuthenticationState, and AuthenticationDataSource.

The API changes are shown in this PR: https://github.com/apache/pulsar/pull/12104

AuthenticationProvider

AuthenticationState

AuthenticationDataSource

Implementation

In addition to updating the above interfaces, we will need to update the consumers of those interfaces. The consumers (and transitive consumers) are currently the following classes: AuthenticationService, AuthenticationFilter, WebSocketWebResource, AbstractWebSocketHandler, ServerCnx, ProxyConnection, and ServerConnection.

AuthenticationService

It should be trivial to update this class, as the authenticateHttpRequest method's return type will be updated to return a CompletableFuture<String> instead of a String.

WebSocketWebResource

This class will require an update to the several endpoints so that they can return asynchronous objects. There is already a pattern in Pulsar for handling this kind of response using @Suspended final AsyncResponse asyncResponse, so I expect the implementation to be trivial.

AbstractWebSocketHandler

This class has a checkAuth method. The creation of a subscription is currently synchronous. I'm not sure how to implement asynchronous responses in websocket handlers. It looks like it should be possible from the following doc: https://webtide.com/jetty-9-updated-websocket-api/. I will need to do a little research to complete this implementation.

AuthenticationFilter

This class is an implementation of the javax.servlet.Filter interface. We will need to follow the servlet's paradigm for ensuring that asynchronous results are properly handled when determining whether to call chain.doFilter after the asynchronous authentication is complete. I will need to do a little research to complete this implementation.

ServerConnection

The ServerConnection class is part of the DiscoveryService. Instead of updating the implementation, I think it makes more sense to remove the module altogether. I started a thread about this on the mailing list here: https://lists.apache.org/x/thread.html/rc8b796dda2bc1b022e855c7368d832b570967cb1ef29e9cd18e04e97@%3Cdev.pulsar.apache.org%3E. If we do not remove the module, this class will need to be updated as well.

ServerCnx and ProxyConnection

Both of these classes rely on netty for method synchronization. Since authentication can happen in another thread, we will need to update these class's state to State.Connecting before calling the authenticateAsync methods. This will prevent a potential denial of service issue. Then, when the authenticateAsync method is called, we will need to register a callback to update the state of the connection class. I believe we will want to schedule this using each class's executor, as follows: this.ctx().executor(). By using the class's executor, we won't need to update any values to volatile. I'd appreciate confirmation that this interpretation of netty's method is true.

Implementations of Interfaces

Once the above changes are implemented, it will make sense to update our implementations of these interfaces to ensure that they implement the new methods.

Tests

All of these class changes will have associated test changes.

Reject Alternatives

On the initial mailing list thread (link in references, below), it was suggested that adding a new async interface would be easier. That would mean that the old, synchronous interface, AuthenticationProvider would be deprecated, and we would start with a fresh, new interface. In considering this solution, I decided it was suboptimal because it would lead to the creation of many new translation classes as well as duplicate fields. One such class was described on the mailing list:


class AsyncAuthBridge implements AsyncAuthenticationProvider {
      AuthenticationProvider delegate;
      Executor e;

     CompletableFuture<Boolean> authenticateAsync(AuthenticationDataSource authData) {
         CompletableFuture<Boolean> f = new CompletableFuture<>();
         e.execute(() -> {
               f.complete(delegate.authenticate(authData));
         });
         return f;
    }
}

This implementation makes sense, but since existing implementations of the AuthenticationProvider interface and the AuthenticationState interface are synchronous and should be non-blocking, I don't think we need to worry about pushing their execution to a separate thread.

Further, if we created a new interfaces named AsyncAuthenticationProvider and AsyncAuthenticationState, I think we'd end up duplicating most of the interface's fields/methods in the original synchronous interfaces because the same logic is still required. We are only changing implementation details by making the authenticate methods run asynchronously.

References

This addition was first discussed on the dev mailing list here: https://lists.apache.org/x/thread.html/r6c2522ca62242109758586696261cb1f4b4ce8e94ae593fda6e97b99@%3Cdev.pulsar.apache.org%3E.

eolivelli commented 3 years ago

@merlimat FYI

github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.

tisonkun commented 1 year ago

@michaeljmarshall this PIP said target to 2.10. Shall we close it as delivered?

michaeljmarshall commented 1 year ago

@tisonkun - this PIP was not completed. The API changes were merged (which was probably a mistake) before the implementation was ready to go. @BewareMyPower was just asking about this, too: https://github.com/apache/pulsar/pull/12104#discussion_r1046794694. I am not exactly sure how to move forward, but I will think about it a bit this week and try to figure something out.

michaeljmarshall commented 1 year ago

Quick update: I've been working on this and am getting close to completing the Netty handler implementations. The HTTP filter and the websocket implementations are a bit trickier and will need more attention.

michaeljmarshall commented 1 year ago

As far as I can tell, the only remaining tasks for this PIP are the AbstractWebSocketHandler and the AuthenticationFilter.

One problem with the websocket proxy is that it does not have the request object available in the AuthenticationFilter.

https://github.com/apache/pulsar/blob/516437e370a711d48fe1d444a0c47e64e7cf2f4b/pulsar-websocket/src/main/java/org/apache/pulsar/websocket/AbstractWebSocketHandler.java#L103-L104

I started work on the async http filter, but I ran into issues. Here is the start of that work: https://github.com/michaeljmarshall/pulsar/commit/605419fcc84703e058efac1ad54a11dacb8e1eea.