Authorization is never re-checked for STOMP topic subscriptions

rabbitmq / discussions

Please use RabbitMQ mailing list for questions. Issues that are questions, discussions or lack details necessary to investigate them are moved to this repository.

3 stars 4 forks source link

Authorization is never re-checked for STOMP topic subscriptions #171

Closed bdw429s closed 3 years ago

bdw429s commented 3 years ago

When using RabbitMQ with the STOMP plugin and a custom HTTP backend auith, if a STOMP JS client publishes multiple messages to a topic, Rabbit hits the backend auth again for every single message to ensure the client's user is still authorized for that topic. However, a STOMP JS subscription to a topic is only checked once and never again. This means if permissions change, the browser would continue getting messages with no way to force that connection to close outside of hacking around the manager HTTP API.

It is perfectly reasonable to not check authorization for topic subscriptions for every message for performance, but consider a configurable timeout which will cause the back end auth to be re-checked every X minutes when sending a message out to ensure stale connections aren't allowed to be left open in a browser indefinitely, receiving messages they should not be getting.

michaelklishin commented 3 years ago

That's why rabbitmq_auth_backend_cache exists.

I fail to see how a STOMP JS client would be different from a regular STOMP client. We need a way to reproduce, we do not guess in this community.

If the connections are made over Web STOMP, that plugin simply delegates to the "regular" STOMP plugin. This is not an actionable issue right now as defined by our team => moving to Discussions.

michaelklishin commented 3 years ago

I'm trying to reproduce the difference in behavior with stomp-py. I have a node configured like so:

log.file.level = debug
log.console.level = debug

# auth_backends.1 = cache
# auth_cache.cached_backend = http

auth_backends.1 = http

auth_http.http_method   = post
auth_http.user_path        = http://localhost:8080/auth/user
auth_http.vhost_path       = http://localhost:8080/auth/vhost
auth_http.resource_path = http://localhost:8080/auth/resource
auth_http.topic_path        = http://localhost:8080/auth/topic

and use this Spring Boot example service. The app authorizes access to topics that begin with an a.

Then I start two basic STOMP apps, a publisher

#!/usr/bin/env python3

import stomp
import time
import random

conn = stomp.Connection([('127.0.0.1', 61613)])
conn.connect('guest', 'guest', wait=True)

while True:
    try:
        n = str(random.randint(0, 99))
        conn.send(destination="/topic/abc." + n, body="abc." + n)
        time.sleep(1)
    except KeyboardInterrupt:
        print("Terminating in 1s...")
        time.sleep(1)
        conn.disconnect()

and a consumer

#!/usr/bin/env python3

import stomp
import time

class MyListener(stomp.ConnectionListener):
    def on_error(self, headers, message):
        print('received an error "%s"' % message)

    def on_message(self, headers, message):
        print('received a message "%s"' % message)

conn = stomp.Connection([('127.0.0.1', 61613)])
conn.connect('guest', 'guest', wait=True)

conn.set_listener('', MyListener())
conn.subscribe(destination='/topic/abc.*' , id=1, ack='auto')

time.sleep(600)

the publisher publishes once a second. The service then logs

2021-01-05 16:05:36.292  INFO 76134 --- [nio-8080-exec-2] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc-46'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 16:05:37.294  INFO 76134 --- [nio-8080-exec-3] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc-55'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 16:05:39.298  INFO 76134 --- [nio-8080-exec-4] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc-43'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 16:05:40.299  INFO 76134 --- [nio-8080-exec-5] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc-76'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 16:05:41.299  INFO 76134 --- [nio-8080-exec-6] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc-9'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 16:05:42.302  INFO 76134 --- [nio-8080-exec-7] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc-30'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 16:05:43.306  INFO 76134 --- [nio-8080-exec-8] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc-52'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 16:05:44.308  INFO 76134 --- [nio-8080-exec-9] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc-37'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true

lukebakken commented 3 years ago

@michaelklishin - Brad (@bdw429s) is expecting authorization to be checked when messages are delivered to a subscriber. I explained in the rabbitmq-users thread that once a consumer is registered via basic.consume, no further checks are made as long as the channel / connection are open.

I also explained that this is how the AMQP protocol works in RabbitMQ, which the STOMP plugin uses, and how it is very unlikely this feature request would be implemented. My $0.02.

michaelklishin commented 3 years ago

OK. FWIW this makes sense semantically to me. There are two permission checks:

Once when a consumer subscribes
Once when a message is published

and in most cases this how people would expect it to work. If it's necessary to revoke permissions of a user, deleting the user will close all of its active connections, or they can be closed using CLI tools or the HTTP API.

FTR I can confirm that the behavior is the same for a long running pair of STOMP-over-WebSocket clients in JS:

WebSocket = require('ws');
const StompJs = require('@stomp/stompjs');

const client = new StompJs.Client({
    brokerURL: 'ws://localhost:15674/ws',
    connectHeaders: {
      login: 'guest',
      passcode: 'guest',
    },
    debug: function (str) {
      // console.log(str);
    },
    reconnectDelay: 5000,
    heartbeatIncoming: 4000,
    heartbeatOutgoing: 4000,
  });

client.onConnect = function (frame) {
    callback = function (message) {
        if (message.body) {
          console.log('got message with body ' + message.body);
        } else {
          console.log('got empty message');
        }
      };
    client.subscribe('/topic/abc.ws.*', callback);
};

client.onStompError = function (frame) {
    // Will be invoked in case of error encountered at Broker
    // Bad login/passcode typically will cause an error
    // Complaint brokers will set `message` header with a brief message. Body may contain details.
    // Compliant brokers will terminate the connection after any error
    console.log('Broker reported error: ' + frame.headers['message']);
    console.log('Additional details: ' + frame.body);
};

client.activate();

WebSocket = require('ws');
const StompJs = require('@stomp/stompjs');

const client = new StompJs.Client({
    brokerURL: 'ws://localhost:15674/ws',
    connectHeaders: {
      login: 'guest',
      passcode: 'guest',
    },
    debug: function (str) {
      // console.log(str);
    },
    reconnectDelay: 5000,
    heartbeatIncoming: 4000,
    heartbeatOutgoing: 4000,
  });

client.onConnect = function (frame) {
    console.log("Connected.");

    client.publisherTimer = setInterval(function() {
        let n = Math.floor(Math.random() * Math.floor(100));

        client.publish({
            destination: '/topic/abc.ws.' + n,
            body: 'abc.ws.' +  + n
          })
    }, 1000);
};

client.onStompError = function (frame) {
    // Will be invoked in case of error encountered at Broker
    // Bad login/passcode typically will cause an error
    // Complaint brokers will set `message` header with a brief message. Body may contain details.
    // Compliant brokers will terminate the connection after any error
    console.log('Broker reported error: ' + frame.headers['message']);
    console.log('Additional details: ' + frame.body);
};

client.activate();

and the authZ service log proves every published message results in a topic permission check

2021-01-05 17:52:03.163  INFO 76661 --- [nio-8080-exec-4] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc.ws.42'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 17:52:04.164  INFO 76661 --- [nio-8080-exec-5] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc.ws.15'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 17:52:05.166  INFO 76661 --- [nio-8080-exec-6] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc.ws.35'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 17:52:06.172  INFO 76661 --- [nio-8080-exec-7] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc.ws.84'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true
2021-01-05 17:52:07.175  INFO 76661 --- [nio-8080-exec-8] c.r.examples.AuthBackendHttpController   : Checking topic access with TopicCheck{routing_key='abc.ws.45'} ResourceCheck{resource='topic', name='amq.topic', permission='write'} TopicCheck{username='guest', vhost='/'}, result: true

bdw429s commented 3 years ago

That's why rabbitmq_auth_backend_cache exists.

@michaelklishin No it's not. If you read my issue carefully, I'm not asking about a scenario in which Rabbit hits my backend HTTP API, I'm asking about a scenario in which is doesn't!

I fail to see how a STOMP JS client would be different from a regular STOMP client

I don't think anyone said it was, so I'm not sure why you're hung up on that. In fact, @lukebakken 's messages on the mailing list indicated this behavior isn't even specific to STOMP at all. I've included those details simply for context.

We need a way to reproduce, we do not guess in this community.

This is not a bug report, this is an enhancement request to modify how Rabbit behaves in regards to checking permissions when sending a message to a topic subscriber. If you wanted to see this behavior (which has already been confirmed by your colleague Luke) then I would recommend:

Configure a backend HTTP auth that will log all requests made to its API
Grab the demo "chat app" from the repo your docs point to ( https://github.com/jmesnil/stomp-websocket/tree/master/example/chat )
Authenticate two users in two different browser windows.
Send a message from one user
Confirm that the backend HTTP auth was consulted whether the user publishing the message had permissions to do so
Also confirm that the user receiving the message did NOT consult the backend HTTP API

If it's unclear, step 6 is the one being discussed here.

If the connections are made over Web STOMP, that plugin simply delegates to the "regular" STOMP plugin.

What does this sentence have to do with the issue? Again, the fact I'm using STOMP is more for context. The behavior being discussed is, from my understanding, core to how Rabbit sends messages to a subscriber over basic.consume.

This is not an actionable issue right now as defined by our team => moving to Discussions.

What? It seems clear you don't even understand what is being asked. Can you please re-open this ticket until we're done discussing it instead of simply closing it without waiting for a reply. I was asked by @lukebakken to enter this ticket on the Rabbit mailing list and you have just closed it without any discussion and, from what I can tell, not even reading the full thread on the list (which I linked to in the issue description).

bdw429s commented 3 years ago

If it's necessary to revoke permissions of a user, deleting the user will close all of its active connections, or they can be closed using CLI tools or the HTTP API.

@michaelklishin Again, there are two issues with this suggestion.

I'm not using internal users, I'm using a custom backend HTTP Auth, as I mentioned in my original post. The management HTTP API's /api/users/ endpoint only appears to operate on internal users
While I appreciate the workaround, my question is also asking why Rabbit doesn't have a mechanism to re-check the auth itself, especially given the behavior where it re-checks permissions every time it allows a client to publish a message.

Even if the management API did work in this case, disconnecting the entire user is heavy handed. In my case, I have a web app with more than one topic subscription over STOMP (around 30 or so) and only one of them may be invalid due to permissions changing. Why would I want to disconnect the entire user when they only one of their subscriptions was invalid?

FTR I can confirm that the behavior is the same for a long running pair of STOMP-over-WebSocket clients in JS:

Great, can you please reinstate this ticket now? If the feature I requested is declined, I'd like it to at least be done so on the merits (or lack thereof) of what I was asking for, not because you didn't understand what was being requested.

bdw429s commented 3 years ago

@lukebakken @michaelklishin It's been over a week, can we please get this ticket re-opened and moved back to issues where Luke asked me to create it in the first place so its merits can be discussed?

lukebakken commented 3 years ago

@bdw429s we understand your requirement. You would like permissions re-checked at some interval when messages are delivered to consumers. This wouldn't just affect STOMP as delivery actually happens at the "AMQP layer" within RabbitMQ.

While I appreciate the workaround, my question is also asking why Rabbit doesn't have a mechanism to re-check the auth itself, especially given the behavior where it re-checks permissions every time it allows a client to publish a message

The reason is historical. Permissions have never been re-checked for deliveries to consumers. In my time on the project (3.5 years) this is the first time this feature has been requested to my knowledge.

At this point I will leave it up to @michaelklishin and @dcarwin-pivotal to chime in on whether to open this as a feature request.

bdw429s commented 3 years ago

@lukebakken Thank you for confirming the request. If the team decides this is not a feature they want to pursue, I would strongly suggest some more thought and mechanics be put into how I can disconnect a user (or ideally, just the topic subscription) whose permissions are no longer valid. This was a big concern for my client and the existing workarounds provided all seem a little limited in terms of what users they can affect or heavy-handed in killing off the entire connection when only a single topic may be no longer valid.

lukebakken commented 3 years ago

Please note that there are support and development channels available to pay for a feature to be implemented (https://www.rabbitmq.com/#support). For instance, see the "Sponsors" section of pull request #2654.

We don't (yet) have a way to enumerate feature requests so that users can vote on them. I'll bring it up with the team.

michaelklishin commented 3 years ago

@bdw429s you use a very specific combination of features, and have very specific expectations about how things should work. In open source software, if a certain scenario is an edge case, it often won't be prioritised, in particular when the request comes from a user who is not paying.

Topic authorization has been around since 3.7, and this discussion of how it is applied to consumers is maybe the second time it was brought up. There is a very real cost to reloading permissions and a lot of users are content with the way things work. They are not interested in taking a throughput hit, or see every connection use more memory because of extra topic permission caches. So this is not really the kind of issue that seems high priority for the community as a whole. Our team is small and we have to pick what we spend our time on wisely.

michaelklishin commented 3 years ago

Update: after a much more civil discussion with @bdw429s, we are revoking the ban.

michaelklishin commented 3 years ago

A quick attempt at designing an "ongoing" topic authorization re-check for active consumers (which are registered once but whose permissions may change) brings up some non-trivial problems.

Publishers use Specific Topics, Consumers Use Patterns

A topic permission is a pattern. Publishers use specific topics when publishing, and checking a specific topic against a pattern is well defined in the specs and something that several protocol implementations we support already do using tries of segments.

However, consumers often consume using patterns as well. Checking a pattern against a pattern is not something well defined and not something we already do. There are real risks to consider this, from meaningful throughput overhead to simply not getting it right as it is not an obvious behavior for implementers and users alike.

So topic authorization that is enforced only on publishers was not a coincidence.

Consumers are Unaware of Publishers

If we chose to take a specific topic from a message about to be delivered, and use that for topic authorization enforcement, we would tie consumers to publishers in ways that some protocols we support try to avoid. Now consumers would have to be aware of certain details of how publishing was done, including certain bits that are not really semantically present in STOMP and are implementation details. It's not obvious to me that this feature would be worth the complication.

Potentially Non-trivial Changes to Hot Code Path

If we chose to take a specific topic from a message about to be delivered, and use that for topic authorization enforcement, we would introduce changes to one of the most sensitive code paths there are. It will potentially affect a very large percentage of the user base, including those who do not use topic permissions.

No Clear Way to Communicate an Error

In some protocols, there is a way to communicate an error to a client. STOMP is one of them. Unfortunately in MQTT 3.1 this is not possible at all (there is no server-to-client equivalent of an ERROR frame). In AMQP 0-9-1, there is a suitable mechanism but most clients do not really expect to be notified of authorization operation failures for running consumers, and using something like consumer cancel notifications can be misinterpreted by both clients and existing applications.

Conclusion

Addressing this carries very real risks that may be more important than the gain.