Closed kleunen closed 2 years ago
An alternative would be to calculate, on subscribe, which filters the session is actually allowed to subscribe to. You combine the authorization filter and the requested topic filter into an authorized topic filter:
Auth: level1/level2/level3
Subscribe: level1/level2/level3
Result: level1/level2/level3
Auth: level1/+/level3
Subscribe: level1/level2/level3
Result: level1/level2/level3
Auth: level1/level2/level3
Subscribe: level1/+/level3
Result: level1/level2/level3
Auth: level1/+/level3
Subscribe: level1/+/level3
Result: level1/+/level3
Auth: level1/level2/#
Subscribe: level1/level2/#
Result: level1/level2/#
Auth: level1/level2/#
Subscribe: level1/level2/level3/+
Result: level1/level2/level3/+
Auth: level1/level2/level3/+
Subscribe: level1/level2/#
Result: level1/level2/level3/+
Auth: level1/level2/#
Subscribe: level1/level2/level3/level4
Result: level1/level2/level3/level4
Auth: level1/level2/+
Subscribe: level1/level2/level3/level4
Result: Not allowed
Auth: level1/level2/level3/level4
Subscribe: level1/level2/+
Result: Not allowed
Auth: level1/+/level3/+
Subscribe: level1/level2/+/level4
Result: level1/level2/level3/level4
Auth: level1/level2/+
Subscribe: nomatch/level2/+
Result: Not allowed
Auth: level1/+/level3/+
Subscribe: level1/+/+/level4
Result: level1/+/level3/level4
I think the rules for this should be:
<literal> + <literal> = <literal> (if literals are equal, otherwise not allowed)
<literal> + '#' = <literal> (remaining path also part of subscription)
'#' + <literal> = <literal> (and remaining path is allowed)
'#' + '#' = '#'
<literal> + '+' = <literal>
'+' + <literal> = <literal>
'+' + '+' = '+'
'#' + '+' = '+' (and remaining path is allowed)
'+' + '#' = '+' (and remaining path is allowed)
number of levels should be equal _or_ # should be used
Like so: https://wandbox.org/permlink/xH31CMPvYS1xpuBA
#include <algorithm>
#include <iostream>
#include <iterator>
#include <optional>
#include <string>
#include <vector>

static constexpr char topic_filter_separator = '/';

// Returns an iterator to the next separator, or `last` if there is none.
template<typename Iterator>
Iterator topic_filter_getnexttoken(Iterator first, Iterator last)
{
    return std::find(first, last, topic_filter_separator);
}

// Steps over the separator at `pos`; never advances past `last`
// (calling std::next on the end iterator would be undefined behavior).
template<typename Iterator>
Iterator topic_filter_skip_separator(Iterator pos, Iterator last)
{
    return pos == last ? last : std::next(pos);
}

template<typename Iterator, typename Output>
inline void topic_filter_tokenizer(Iterator first, Iterator last, Output write) {
    auto pos = topic_filter_getnexttoken(first, last);
    while (write(first, pos) && pos != last) {
        first = std::next(pos);
        pos = topic_filter_getnexttoken(first, last);
    }
}

bool is_hash(std::string const &level) { return level == "#"; }
bool is_plus(std::string const &level) { return level == "+"; }
bool is_literal(std::string const &level) { return !is_hash(level) && !is_plus(level); }

// Combines one authorized filter with the requested subscription filter.
// Returns the resulting filter, or an empty optional if not allowed.
std::optional<std::string> authorize_subscribe(std::string const &authorized_filter, std::string const &subscription_filter)
{
    std::optional<std::string> result;
    auto append_result = [&result](std::string const &token) {
        if (result) {
            result.value() += topic_filter_separator + token;
        } else {
            result = token;
        }
    };

    auto const filter_last = authorized_filter.end();
    auto const subscription_last = subscription_filter.end();
    auto filter_begin = authorized_filter.begin();
    auto subscription_begin = subscription_filter.begin();

    while (filter_begin < filter_last && subscription_begin < subscription_last) {
        auto filter_end = topic_filter_getnexttoken(filter_begin, filter_last);
        std::string auth(filter_begin, filter_end);
        filter_begin = topic_filter_skip_separator(filter_end, filter_last);

        auto subscription_end = topic_filter_getnexttoken(subscription_begin, subscription_last);
        std::string sub(subscription_begin, subscription_end);
        subscription_begin = topic_filter_skip_separator(subscription_end, subscription_last);

        if (is_hash(auth)) {
            // '#' in the authorization: the rest of the subscription is allowed as-is.
            append_result(sub);
            while (subscription_begin < subscription_last) {
                auto next_end = topic_filter_getnexttoken(subscription_begin, subscription_last);
                append_result(std::string(subscription_begin, next_end));
                subscription_begin = topic_filter_skip_separator(next_end, subscription_last);
            }
            return result;
        }

        if (is_hash(sub)) {
            // '#' in the subscription: narrow it to the rest of the authorized filter.
            append_result(auth);
            while (filter_begin < filter_last) {
                auto next_end = topic_filter_getnexttoken(filter_begin, filter_last);
                append_result(std::string(filter_begin, next_end));
                filter_begin = topic_filter_skip_separator(next_end, filter_last);
            }
            return result;
        }

        if (is_plus(auth)) {
            append_result(sub);
        } else if (is_plus(sub)) {
            append_result(auth);
        } else {
            if (auth != sub) {
                return std::optional<std::string>();
            }
            append_result(auth);
        }
    }

    // Without '#', both filters must have the same number of levels.
    if (filter_begin < filter_last || subscription_begin < subscription_last) {
        return std::optional<std::string>();
    }
    return result;
}

std::vector<std::string> authorize_subscribe(std::vector<std::string> const &authorized_filters, std::string const &subscription_filter)
{
    std::vector<std::string> result;
    for (auto const& i: authorized_filters) {
        auto topic_filter = authorize_subscribe(i, subscription_filter);
        if (topic_filter) {
            result.push_back(topic_filter.value());
        } else {
            std::cout << "Not authorized" << std::endl;
        }
    }
    return result;
}

int main()
{
    std::vector<std::string> authorized_filters = {
        "example/#"
    };
    // prints: example/value/a
    auto result = authorize_subscribe(authorized_filters, "example/value/a");
    for (auto const& i: result) {
        std::cout << i << std::endl;
    }

    authorized_filters = {
        "example/value/a"
    };
    // prints: example/value/a
    result = authorize_subscribe(authorized_filters, "example/#");
    for (auto const& i: result) {
        std::cout << i << std::endl;
    }

    authorized_filters = {
        "example/value/+"
    };
    // prints: example/value/test
    result = authorize_subscribe(authorized_filters, "example/value/test");
    for (auto const& i: result) {
        std::cout << i << std::endl;
    }
    // prints: Not authorized (number of levels differs)
    result = authorize_subscribe(authorized_filters, "example/value");
    for (auto const& i: result) {
        std::cout << i << std::endl;
    }

    authorized_filters = {
        "example/value/test"
    };
    // prints: example/value/test
    result = authorize_subscribe(authorized_filters, "example/value/+");
    for (auto const& i: result) {
        std::cout << i << std::endl;
    }
    // prints: Not authorized (number of levels differs)
    result = authorize_subscribe(authorized_filters, "example/value");
    for (auto const& i: result) {
        std::cout << i << std::endl;
    }
}
But the problem with this approach is that you may get multiple matching filters:
    authorized_filters = {
        "example/+/test/+",
        "example/value/+/value"
    };
    result = authorize_subscribe(authorized_filters, "example/value/+/value");
    for (auto const& i: result) {
        std::cout << i << std::endl;
    }
This results in the following topic filters being authorized and subscribed:
example/value/test/value
example/value/+/value
If a publish happens to 'example/value/test/value', two subscriptions match. So you have to filter out this duplicate and also make sure that both subscriptions are removed when an unsubscribe is received. The advantage of this approach is that no 'unauthorized' filters are stored in the subscription_map.
mosquitto has its own ACL. http://www.steves-internet-guide.com/topic-restriction-mosquitto-configuration/
If it is good enough, we should use it because it already exists and is in use. Users could use their existing mosquitto ACL without conversion. That is a big advantage.
I'm not sure mosquitto ACL is good enough. What do you think?
I did not see your reply.
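Well, the biggest choice to make is whether to:
- do the ACL check on publish (Option 1 in the example that follows), or
- do the ACL check on subscribe (Option 2 in the example that follows).
In the second case, no unauthorized subscriptions will be in the subscription map (which is nice, I think). But a single subscribe may result in multiple entries in the subscription map (which you need to filter out when a message matches multiple subscriptions).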
Consider the following example:
User U1 connects to the server and subscribes to the topic filter: example/#
The ACL for user U1 specifies: example/a (read), example/b (read)
Option 1: When you do the ACL check on publish, the subscription U1:example/# is added to the subscription map.
When a message is published to example/c, user U1 matches as a subscriber, the ACL for U1 is checked, and U1 will not receive the message.
Option 2: When you do the ACL check on subscribe, the subscription U1:example/# is combined with the ACL of user U1 (example/a, example/b), and the following entries are added to the subscription map: U1:example/a, U1:example/b
If example/c is published, user U1 is not matched/subscribed to this topic. No ACL check needs to be done on publish.
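A minimal sketch of the publish-time check from Option 1, assuming ACL entries may themselves be topic filters. The names `split_levels`, `topic_matches`, and `may_deliver` are hypothetical and not part of mqtt_cpp; special handling of `$`-prefixed topics is ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Split a topic name or topic filter into its '/'-separated levels.
std::vector<std::string_view> split_levels(std::string_view s) {
    std::vector<std::string_view> levels;
    std::size_t start = 0;
    while (true) {
        std::size_t pos = s.find('/', start);
        if (pos == std::string_view::npos) {
            levels.push_back(s.substr(start));
            return levels;
        }
        levels.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// True if the concrete topic name matches the filter
// ('+' matches exactly one level, '#' matches the remaining levels).
bool topic_matches(std::string_view filter, std::string_view topic) {
    auto f = split_levels(filter);
    auto t = split_levels(topic);
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] == "#") return true;      // also matches the parent level itself
        if (i >= t.size()) return false;   // filter has more levels than the topic
        if (f[i] != "+" && f[i] != t[i]) return false;
    }
    return f.size() == t.size();
}

// Option 1: check the user's read ACL when a message is about to be delivered.
bool may_deliver(std::vector<std::string> const& acl_read, std::string_view topic) {
    for (auto const& entry : acl_read) {
        if (topic_matches(entry, topic)) return true;
    }
    return false;
}
```

With the ACL `{"example/a", "example/b"}`, the subscription `example/#` matches a publish to `example/c`, but `may_deliver` returns false for it, so U1 would not receive the message.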
> I'm not sure mosquitto ACL is good enough. What do you think?

What I don't like about mosquitto's file format is that you need to write a parser for it.
If you use JSON or INI, you can use Boost Property Tree to parse it.
For example in mosquitto you can configure:
topic read $SYS/#
user roger
topic foo/bar
pattern write $SYS/broker/connection/%c/state
In json this would be:
{
  "acl": {
    "general": [{
      "topic": "$SYS/#",
      "rights": "read"
    }],
    "client": [{
      "pattern": "$SYS/broker/connection/%c/state",
      "rights": "write"
    }],
    "user": [{
      "roger": {
        "password": "1234",
        "acl": [{
          "topic": "foo/bar",
          "rights": "readwrite"
        }]
      }
    }]
  }
}
Although parsing the mosquitto ACL format is not that difficult.
Sorry, I'm busy with other projects. I will get back to mqtt_cpp, but I need time.
I'll just leave some comments for now.
I used to use Boost Property Tree, but I don't think it is a good choice. The design is pretty old-fashioned and it has not been well maintained recently. If we use a JSON library, there are some better ones.
If we need to write a parser, Boost.Spirit X3 is a good choice for me.
I wrote a msgpack parser using it.
See https://github.com/msgpack/msgpack-c/blob/cpp_master/include/msgpack/v2/x3_parse.hpp and the other files starting with x3_ in the same directory.
I guess that parsing the mosquitto ACL is easier than parsing msgpack.
I just want to say that writing a parser is not a problem. It is a natural and straightforward way.
I will read https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-768944957 carefully later. I haven't read it yet.
No problem. I am also quite busy recently.
Well, the mosquitto ACL format is easy to parse. You can just read lines and use a tokenizer to read them. One thing I don't like is that it does not have clear "sections".
So you say 'user roger', and all following topics belong to this user:
user roger
topic foo/bar
user wouter
topic yadie
But if I accidentally delete 'user wouter', suddenly 'topic yadie' belongs to 'user roger'.
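A rough sketch of the "read lines and tokenize" approach, to show how the implicit sections work. It assumes only `user` and `topic [read|write|readwrite|deny] <topic>` lines, skips `pattern` lines, and does not handle topics containing spaces; `acl_entry`, `acl_table`, and `parse_acl` are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct acl_entry {
    std::string topic;
    std::string access;  // "read", "write", "readwrite" or "deny"
};

// Key "" holds the general entries that precede the first `user` line.
using acl_table = std::map<std::string, std::vector<acl_entry>>;

bool is_access(std::string const& s) {
    return s == "read" || s == "write" || s == "readwrite" || s == "deny";
}

acl_table parse_acl(std::istream& in) {
    acl_table table;
    std::string current_user;  // every `topic` line belongs to the last `user` seen
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream tokens(line);
        std::string keyword;
        if (!(tokens >> keyword) || keyword[0] == '#') continue;  // blank line or comment
        if (keyword == "user") {
            tokens >> current_user;
        } else if (keyword == "topic") {
            std::string first, second;
            tokens >> first;
            if (is_access(first) && (tokens >> second)) {
                table[current_user].push_back({second, first});
            } else {
                // Access type omitted: treated as read/write.
                table[current_user].push_back({first, "readwrite"});
            }
        }
    }
    return table;
}
```

Because `current_user` is the only section state, removing a `user` line silently reassigns the following `topic` lines to the previous user, which is the weakness described above.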
Also, what is missing is to limit the pattern of the client id for a user. For example, if user 'roger' logs in, he can only select a client id starting with 'roger_'. This prevents users from having a client_id collision.
But it is convenient to have the same format as mosquitto, even if you have more options.
JSON is quite verbose. If property_tree is not an option, there is Boost.JSON, but it is only available from version 1.75: https://www.boost.org/doc/libs/1_75_0/libs/json/doc/html/index.html
And I don't think adding another dependency to the project for a JSON parser is a good idea.
I use property_tree a lot for configuration files; it is convenient. You have a config file parser in just a couple of lines. I actually added parsing of the UCI format to property_tree.
Or maybe the YAML format?
> If we need to write parser, Boost.Spirit X3 is good choice for me.

Ah yes, Spirit X3 uses an EBNF-like notation, which is quite easy to write. It seems quite straightforward, especially for a small config format.
Having the configuration in JSON is convenient, because it is a de facto standard and many tools and libraries exist to create and edit it: https://linuxhint.com/bash_jq_command/
Sorry to have kept you waiting for a long time. I had busy days but now, I am back.
I'd like to confirm the concepts of authentication and authorization. I think that they are separate topics. I use the word "authentication" for CONNECT permission: the broker somehow checks that the connecting client has the right to connect. I use the word "authorization" for PUBLISH/SUBSCRIBE/UNSUBSCRIBE permission: the broker somehow checks that the connected client has the right to PUBLISH and/or [UN]SUBSCRIBE.
These are outside the MQTT spec, so we need to decide our broker's spec.
I'd like to start with authentication because it is an earlier phase than authorization.
MQTT has Client Identifier(mandatory but empty string has special meaning), Username(optional), and Password(optional). They are in CONNECT packet. In addition, since MQTT v5.0, AUTH packet is introduced. Before MQTT CONNECT packet is sent, we could use client certificate authentication on the TLS layer. Recently, I added an example for client certificate authentication. https://github.com/redboltz/mqtt_cpp/blob/d163f887cfa06a950ec7ba3bfd22279680a567d0/example/tls_both_client_cert.cpp#L347-L367
The spec said that Client Identifier must be unique in the broker.
Each Client connecting to the Server has a unique ClientID.
https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901059
It's very important. If a client connects with the same ClientID as an existing connected client, then only one of them is accepted. The typical implementation disconnects the existing client and accepts the new client. Our broker behaves that way.
The ClientID can be an empty string. In this case, the broker generates a unique string and regards it as the ClientID.
On MQTT v5.0, the generated ClientID is reported to the client using the 3.2.2.3.7 Assigned Client Identifier property.
https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc511988564
On MQTT v3.1.1, there is no way to report the generated ClientID to the client. http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc385349242
Because of this limitation, Clean Session:0 is not allowed with an empty ClientID on MQTT v3.1.1.
If the Client supplies a zero-byte ClientId, the Client MUST also set CleanSession to 1 [MQTT-3.1.3-7].
If the Client supplies a zero-byte ClientId with CleanSession set to 0, the Server MUST respond to the CONNECT Packet with a CONNACK return code 0x02 (Identifier rejected) and then close the Network Connection [MQTT-3.1.3-8].
If the Server rejects the ClientId it MUST respond to the CONNECT Packet with a CONNACK return code 0x02 (Identifier rejected) and then close the Network Connection [MQTT-3.1.3-9].
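The v3.1.1 rule quoted above could be checked as in the following sketch. The function name and constants are hypothetical, not mqtt_cpp API; only the return code value 0x02 (Identifier rejected) comes from the quoted spec text.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::uint8_t connack_accepted = 0x00;
constexpr std::uint8_t connack_identifier_rejected = 0x02;

// MQTT v3.1.1: a zero-byte ClientId is only acceptable together with CleanSession = 1.
std::uint8_t check_client_id_v3_1_1(std::string const& client_id, bool clean_session) {
    if (client_id.empty() && !clean_session) {
        return connack_identifier_rejected;  // [MQTT-3.1.3-8]
    }
    return connack_accepted;
}
```

After returning the rejection code, the broker would also have to close the network connection, which is outside this sketch.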
We can choose from multiple authentication method candidates.
The first candidate is a combination of `Username` and `Password`.
It's simple but has the following problem.
The broker disconnects Client1 because Client2 has the ClientID:abc. Client1 and Client2 could be independent. In this case, they don't know each other. How can we avoid the ClientID conflict?
Force clients to create a UUID and set it as the ClientID? That is difficult for embedded devices with poor resources. Such devices sometimes don't have a system clock, and the system clock is usually used for creating a random UUID.
Force clients to set only an empty ClientID? In this case, the broker can create the unique ClientID. However, on MQTT v3.1.1, clients can then use Clean Session:1 only. That is unacceptable. One of the big advantages of MQTT is session continuation. Note that sessions (subscription status, inflight published messages, and offline published messages) are distinguished by ClientID.
How about using ClientID for authentication instead of Username?
It works as expected.
Any ideas?
I think it is best to always use the username for authentication. A username/password combination may be used to connect multiple times. Like you said, a unique ClientID can be connected only once. But it is useful to use the username/password combination to connect multiple devices.
You can limit the client id to a certain pattern in many brokers. So if a user logs in with 'username', the valid client ids start with 'username-'; for example, username-0000, username-0001, and username-0002 are all valid client ids for this user. This way, no collision between client ids of different users can occur.
For embedded devices I would recommend using the MAC address or some unique microcontroller identifier. Then the client id would become:
username-device.mac
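As an illustration of the prefix rule, a broker-side check might look like this sketch. `client_id_allowed` is a hypothetical helper, and the `-` separator is simply the one used in the examples above.

```cpp
#include <cassert>
#include <string>

// A client id is accepted for a user only if it is "<username>-" followed by
// at least one more character (for example a counter or a MAC address).
bool client_id_allowed(std::string const& username, std::string const& client_id) {
    std::string const prefix = username + "-";
    return client_id.size() > prefix.size()
        && client_id.compare(0, prefix.size(), prefix) == 0;
}
```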
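`Username-ClientID` is an interesting idea.
Let me clarify what you mean.
- Client1 CONNECT using Username:u1, Password:p1, ClientID:abc
- Client2 CONNECT using Username:u2, Password:p2, ClientID:abc
In the broker (mqtt_cpp broker), the internal ClientIDs are `u1-abc` and `u2-abc`. They are different.
So both connections are accepted.
Is that right?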
Exactly.
In mosquitto you have patterns like '%u', which matches the username, so you can use them to limit the client id. That way one can configure how client id handling should look. I don't think they have a pattern for a generated id, but you might define one, e.g. '%g'.
So: clientid = %u, clientid = %g, or clientid = %u-%g
Ok. I like conflict-free ClientID management. Let me confirm it. "ClientID" is a little bit ambiguous. One meaning is the Client Identifier field in the CONNECT packet. It is sent from the client. Let's call it scid (sent ClientID). The other is the broker-internal identifier. Let's call it bcid (broker ClientID). The bcid is used for client conflict checking.
The client sends CONNECT { Username:"u1", Password:"p1", Client Identifier:"abc" }. In this case, scid: "abc", bcid: "u1-abc". Only "u1" and "p1" are used for authentication.
The client sends CONNECT { Username:"u1", Password:"p1", Client Identifier:"" }. In this case, scid: "", bcid: "u1-generated_string_by_broker". Only "u1" and "p1" are used for authentication.
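The scid-to-bcid mapping can be sketched as below. `make_bcid` is a hypothetical helper, and the generated string is passed in as a parameter because the thread does not say how the broker produces it.

```cpp
#include <cassert>
#include <string>

// Derive the broker-internal ClientID (bcid) from the authenticated username
// and the Client Identifier sent in CONNECT (scid). An empty scid is replaced
// by a broker-generated string.
std::string make_bcid(std::string const& username,
                      std::string const& scid,
                      std::string const& generated_by_broker) {
    return username + "-" + (scid.empty() ? generated_by_broker : scid);
}
```

One design point to keep in mind: with plain concatenation, a username that itself contains `-` could produce the same bcid as a different username/scid pair, so keying the session map on the (username, scid) pair may be the safer variant.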
So far, it is good for me. I have two questions.
`generated_string_by_broker`. Because if the client wants to inherit the (generated) session on re-connection, the client would send CONNECT { Username:"u1", Password:"p1", Client Identifier:"generated_string_by_broker" }. The broker would create the bcid `u1-generated_string_by_broker` from the CONNECT packet. It is simple behavior. What do you think?
I don't think it is a good idea to have an internal and an external client id. The client id is also used to continue a persistent session on the broker. So the client may reconnect, identify itself using the client id, and request continuation of a stored session. That is also why the previous, possibly already broken, connection gets disconnected.
I guess that you didn't understand what I meant, because my explanation was not good...
> `Username-ClientID` is an interesting idea. Let me clarify what you mean.
> - Client1 CONNECT using Username:u1, Password:p1, ClientID:abc
> - Client2 CONNECT using Username:u2, Password:p2, ClientID:abc
>
> In the broker (mqtt_cpp broker), the internal ClientIDs are `u1-abc` and `u2-abc`. They are different. So both connections are accepted. Is that right?

I think that you have agreed to the idea above. I mean
client | Username | scid | bcid |
---|---|---|---|
client1 | u1 | abc | u1-abc |
client2 | u2 | abc | u2-abc |
Do you mean something different?
Ah ok, I understand now. Internally you map the client id to a username, so the client won't see this.
The only thing is that you cannot reconnect with a different username and continue a session. But I don't think that is an issue.
Perfect! That is what I wanted to say!
Let's move on to the next topic: Client Certificate authentication. Boost Asio has Client Certificate authentication functionality. The server (broker) can extract the CNAME (the certificate's Common Name) of the client. (Maybe it can also extract the alternative names.) I think that the CNAME (or alt names) should be the Username. The Client Certificate is signed by a CA. The broker sets the CA certificate as the verify file. So if the Client Certificate signature is verified by the broker, then authentication succeeds. The important point is that the Username should be written in the Client Certificate.
In this case, MQTT's Username and Password shouldn't be used. If the client sets them, then the broker either disconnects with an error or outputs a warning log and ignores them.
ClientID (scid) is used for making multiple sessions using the same authentication. It is the same as the Username, Password, and ClientID (scid) pattern.
Is this OK?
Maybe it affects the ACL.
For example, if the user (client) "redboltz" uses a Client Certificate, then the ACL becomes something like the following:
{
  "acl": {
    "user": [{
      "roger": {
        "password": "1234"
      },
      "redboltz": {
        "password": "$SYS/ClientCertificate"
      },
`"password": "$SYS/ClientCertificate"` means use Client Certificate authentication instead of password authentication. Or `"Client Certificate Auth": true`.
Anyway, it is semantically the same.
> ClientID (scid) is used for making multiple sessions using the same authentication. It is the same as Username, Password, and ClientID(scid) pattern.
> Is this OK?

Yes. A certificate is an alternative to username/password.
Ok. The remaining authentication method is the AUTH packet, but it is only for MQTT v5.0, and the spec does not describe the AUTH exchange in detail. So I think that we can postpone it.
Let's move on to authorization step by step.
As we discussed, authentication uses Username and Password, and doesn't use ClientID.
I think that authorization should be defined between topic name (or topic filter) and Username.
If Username u1 has read permission of topic name t1, then any clients that have the username u1 (and may have different ClientIDs) can subscribe t1. I think that it is reasonable. ClientID could be automatically generated on runtime, it is not suitable for permission control.
What do you think?
Yes, agreed. Authorization for topics is based on the username, or possibly on groups of usernames.
Ok, so far, authorization is defined as a relationship between Username and `topic`.
I've read https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-768944957 but I'd like to discuss based on my example in order to clarify my understanding.
`topic` is still ambiguous.
MQTT defines Topic Name and Topic Filter.
We can say authorization is a relationship between Username and Topic Name. How about Topic Filter?
Let's say there are the following Topic Names:
example/a
example/b
They are also Topic Filters.
And there is the Topic Filter
example/#
Authorization table
Username | Topic (Name or Filter) | Type | Meaning |
---|---|---|---|
u1 | example/a | read(subscribe) | u1 can subscribe example/a |
In this case, u1 can subscribe `example/a` but can't subscribe `example/b`. So far, very simple.
What happens if u1 subscribes `example/#`?
If we allow only Topic Names in the authorization table, then `example/#` should be accepted. Then we move on to the discussion at https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-768944957 .
But before do that, I'd like to consider the following option.
Username | Topic (Name or Filter) | Type | Meaning |
---|---|---|---|
u1 | example/a | read(subscribe) | u1 can subscribe example/a |
u1 | example/# | read(subscribe) | u1 can subscribe example/# |
This is the case where the authorization table contains a Topic Filter.
In this case, u1 can subscribe example/# because the (exactly) same topic filter is in the table.
If the table is
Username | Topic (Name or Filter) | Type | Meaning |
---|---|---|---|
u1 | example/a | read(subscribe) | u1 can subscribe example/a |
then u1 can't subscribe `example/#`.
I haven't considered this deeply yet. If you find some problems in the approach, please let me know.
> If the table is
>
> Username | Topic (Name or Filter) | Type | Meaning |
> ---|---|---|---|
> u1 | example/a | read(subscribe) | u1 can subscribe example/a |
>
> then u1 can't subscribe `example/#`.

I think in this case u1 can subscribe to example/#, but will only receive messages published to example/a (and not example/b).
Because, according to the MQTT specification, can a subscribe result in a not-authorized response? If a client subscribes to example/#, is the broker allowed to respond: unauthorized subscription?
Ah yes, it seems it is:
Value | Hex | Reason Code name | Description |
---|---|---|---|
135 | 0x87 | Not authorized | The Client is not authorized to make this subscription. |
That does make handling of authorization easier, because you can check on subscription whether it is allowed.
But it can be convenient to subscribe to '#' and only receive the topics you are authorized to receive.
Makes sense. Before deciding between the options, I'd like to consider another issue.
Can the authentication table and/or authorization table be updated at runtime? If we can update them, what happens to existing connections and subscriptions?
I would say: yes, they can be updated at runtime. Especially if you start adding a "plugin"-based authentication system, for example if the authentication rules come from a SQL database. Clearly, they are dynamic. A user might like to add or ban users and update the authentication for a user while the broker is running.
I would say the updated rules only apply to new sessions (or new connections?). So on startup of a session, the topic authorization is requested for the given username.
Thank you for the comment. I understand we have some choices. I think that avoiding an inconsistent state is important. So far, it seems that there is no problem with either applying the update to existing connections (disconnect from the broker, or force unsubscribe by the broker, if the update event can be detected) or applying it only to new connections.
Of course we need some implementation for updating.
Let's get back to the authorization issue.
Based on our discussion, I considered the following candidates.
Candidate 1: The authorization table can contain only Topic Names. Topic Filters are not allowed.
On SUBSCRIBE received {
    if (Topic Filter contains wildcard) {
        accept the subscription.
    }
    else { // Topic Filter is Topic Name
        lookup authorization table(read)
        if (entry exists) {
            accept the subscription.
        }
        else {
            reject the subscription.
        }
    }
}
On PUBLISH received {
    lookup authorization table(write)
    if (entry exists) {
        accept the publish. (and continue)
    }
    else {
        reject the publish.
        return
    }
    get matched deliver target subscribers (already implemented)
    if (matched subscription is wild card) { // I'm not sure it can be checked or not.
        lookup authorization table(read)
        if (entry exists) {
            deliver PUBLISH packet to the subscriber
        }
    }
    else { // no read checking required here
        deliver PUBLISH packet to the subscriber
    }
}
Candidate 2: The authorization table can contain both Topic Names and Topic Filters.
On SUBSCRIBE received {
    lookup authorization table(read) // wildcard topic filter is also checked
    if (entry exists) {
        accept the subscription.
    }
    else {
        reject the subscription.
    }
}
On PUBLISH received, the handling is the same as in Candidate 1.
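The SUBSCRIBE handling of Candidate 2 could be sketched in C++ as an exact-string lookup. The `read_table` type, the `suback` enum, and `on_subscribe` are hypothetical names for illustration, not mqtt_cpp API.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// (username, Topic Name or Topic Filter) pairs that grant read permission.
using read_table = std::set<std::pair<std::string, std::string>>;

enum class suback { granted, not_authorized };

// Candidate 2: the requested filter must appear in the table as the exact same
// string. Wildcard filters are looked up like any other entry.
suback on_subscribe(read_table const& table,
                    std::string const& username,
                    std::string const& topic_filter) {
    auto const key = std::make_pair(username, topic_filter);
    return table.count(key) != 0 ? suback::granted : suback::not_authorized;
}
```

With a table holding only (u1, example/a), a request for example/# is rejected; adding (u1, example/#) makes it acceptable, as in the tables discussed earlier.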
What do you think?
I actually prefer candidate 2 now, because:
- The control flow is simpler.
- The client immediately gets a subscription denial if it is not authorized, instead of the broker silently accepting the subscription and the client then not receiving published messages because it is not authorized.
- On publish you don't have to check subscriber authorization; this is already done on subscription. That is easier and gives better performance.
Let me clarify what the permission `u1 : example/#` means.
There are two interpretations.
1. It permits subscribing `example/#` (exact match string) to u1. But actual delivery is checked using individual permissions like `u1: example/a`. In this case, even if u1 subscribes `example/#`, a message published to the topic `example/b` is not delivered to u1.
2. It permits subscribing `example/#` (exact match string) to u1. And all matched publish topics are delivered to u1. This doesn't need an authorization (read) check on publish to deliver the message.
When I wrote https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-846384438, I assumed 1. So I wrote
> On publish is the same as the Candidate 1

It contains the on-publish authorization (read) check for delivery.
I noticed that we haven't discussed negative lists yet.
If you think 2 is better, how do we treat a negative list? For example,
u1 : example/# read (allow)
u1 : example/b read (deny)
I'm not sure this negative entry (deny) is acceptable. If it is acceptable, the meaning is
- u1 can subscribe `example/#`. The published message `example/b` is not delivered to u1.
- u1 can't subscribe `example/b`.
Perhaps the negative entry is described as follows
- `#`
- `a`
This means all topics in `example/` are denied, but the topic filters `example/#` and `example/a` are allowed. In this case, a message published to `example/b` is not delivered to u1.
Good question.
Maybe have only two options:
- Deny all, and have individual allows: `default deny all`, `allow read #`, `allow read a`
- Allow all, and have individual denies: `default allow all`, `deny read #`, `deny read a`
Would that be a good option?
Indeed, this option becomes very ambiguous:
u1 : example/# read (allow)
u1 : example/b read (deny)
Is example/b allowed or denied? Does it depend on order, or does deny override allow, or ...? Users will probably not fully understand it or will make mistakes.
Mosquitto seems to specify an order, but it is not so clear:
if this parameter is defined then only the topics listed will have access. Topic access is added with lines of the format:
topic [read|write|readwrite|deny]
The access type is controlled using "read", "write", "readwrite" or "deny". This parameter is optional (unless
I think that those are good options.
The point is
I think that it is a reasonable rule. I talked about mosquitto at https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-755138731 .
> I'm not sure mosquitto ACL is good enough. What do you think ?

As you mentioned at https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-846404621 , the mosquitto rule is ambiguous. To be honest, I don't fully understand it.
I think that it is better to define a clearer rule for the mqtt_cpp broker. The rule might be incompatible with mosquitto in some points, but that is acceptable.
We need to define the meaning of a wildcard in an entry.
I think that the following rule is good.
- On subscribe: if the result is `allow`, then accept the subscription. If the result is `deny`, then deny the subscription. (The default should be considered.)
- On publish: if the result is `allow`, then accept the publish. If the result is `deny`, then deny the publish. (The default should be considered.)
Let me show an example.
default allow all
deny read a
The subscriber can subscribe `#` because of `default allow all`. But only the subscription itself is accepted.
The publisher publishes to the topic `a`. The publish itself succeeds because of `default allow all`. In the delivery phase, the topic `a` matches `#`, but the published message is NOT delivered to the subscriber because of `deny read a`.
The publisher publishes to the topic `b`. The publish itself succeeds because of `default allow all`. In the delivery phase, the topic `b` matches `#`, and the published message is delivered to the subscriber because of `default allow all` and because no `deny read b` entry exists.
So, a wildcard subscription is checked twice. The first check is on subscribe. The second check is on publish (delivery phase).
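A minimal sketch of this two-phase check, assuming rule topics are compared as exact strings (the interpretation used in the example above). The types `verdict`, `rule`, `read_acl`, and the function `check_read` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class verdict { allow, deny };

struct rule {
    verdict     result;
    std::string topic;  // compared as an exact string in this sketch
};

struct read_acl {
    verdict           fallback;  // `default allow all` or `default deny all`
    std::vector<rule> rules;     // e.g. `deny read a`
};

// Used twice: once with the Topic Filter on SUBSCRIBE, and once with the
// published Topic Name in the delivery phase.
verdict check_read(read_acl const& acl, std::string const& topic) {
    for (auto const& r : acl.rules) {
        if (r.topic == topic) return r.result;
    }
    return acl.fallback;
}
```

With `default allow all` plus `deny read a`, `check_read` allows the subscription `#`, denies delivery of `a`, and allows delivery of `b`.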
What do you think?
Yes, you are right, but I think this is only for default allow.
- Subscription needs to be checked if allowed/denied
- On publish, a check needs to be done if a more specific rule allows/denies the message
If the default is deny:
default deny all
allow read a
Then no check at publish is needed? Subscribe to # is denied, subscribe to a is allowed.
default deny all
allow read a/#
Subscribe to # is denied, subscribe to a/# or a/b is allowed.
Yes, you are right, but I think this is only for default allow.
- Subscription needs to be checked if allowed/denied
- On publish, check needs to be done if a more specific rule allow/denies the message
If the default is deny:
default deny all
allow read a
Then no check at publish is needed?
I think so, if you mean the check in the publish delivery phase.
Subscribe to # is denied, subscribe to a is allowed.
Yes.
default deny all
allow read a/#
Subscribe to # is denied, subscribe to a/# or a/b is allowed.
Yes.
But if a publisher publishes to the topic `a/b`, the message is NOT delivered in the delivery phase.
In order to deliver the message, `allow read a/b` is also required.
By the way, default applies to both read and write. I'm not sure, but it might be useful if read and write could be set individually.
Something like the following:
default read deny all
default write allow all
allow read a/b
deny write a/b
The semantics of `default`: if `read` is set, then the rule target is subscribe. If `write` is set, then the rule target is publish. If neither is set, the rule targets are both subscribe and publish.
What do you think?
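To make the separate read and write defaults concrete, here is a minimal sketch. The names `acl`, `rule`, `op` and `check` are hypothetical and only serve the illustration, and entries are again compared as exact strings, which is an assumption rather than something the thread has settled.

```cpp
#include <string>
#include <vector>

// Hypothetical types for illustration only; not mqtt_cpp's API.
enum class access { allow, deny };
enum class op { read, write };  // read = subscribe, write = publish

struct rule {
    access result;
    op target;
    std::string topic;  // assumption: exact-string comparison
};

struct acl {
    access default_read;   // "default read ..."
    access default_write;  // "default write ..."
    std::vector<rule> rules;
};

// Use an explicit entry if one exists for this operation and topic,
// otherwise fall back to the default of that operation.
access check(const acl& table, op target, const std::string& topic) {
    for (const rule& r : table.rules) {
        if (r.target == target && r.topic == topic) return r.result;
    }
    return target == op::read ? table.default_read : table.default_write;
}

// The rule set from the comment:
//   default read deny all
//   default write allow all
//   allow read a/b
//   deny write a/b
acl example_acl() {
    acl table;
    table.default_read = access::deny;
    table.default_write = access::allow;
    table.rules.push_back(rule{access::allow, op::read, "a/b"});
    table.rules.push_back(rule{access::deny, op::write, "a/b"});
    return table;
}
```

A plain `default deny all` would then simply set both `default_read` and `default_write` to `access::deny`.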
Yes. But if a publisher publishes to the topic `a/b`, the message is NOT delivered in the delivery phase. In order to deliver the message, `allow read a/b` is also required.
I understand, but I think this is not very practical. If you have, for example, a multi-tenant broker used by many companies, you might whitelist a path for each company. Say you want to give a company a share of the broker: you might allow them to publish and subscribe under company.com/#. But you do not know which topics they want to create.
Good point. I understand that your idea defines a wildcard in the rule as follows:
default deny all
allow read company.com/#
On subscribe, `company.com/#` in the rule table means applying the rule (allow or deny) to any topic filter under `company.com/`, including wildcards. So the following subscriptions are accepted.
company.com/#
company.com/a
company.com/b
company.com/+/c
In addition, there is no check in the delivery phase.
Am I understanding correctly?
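One way to express "the rule applies to any topic filter under it" is a check that the requested filter never matches more than the rule filter does. The sketch below is only an illustration of that idea; `rule_covers` and `split_levels` are hypothetical helpers and not mqtt_cpp functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split "a/b/c" into {"a", "b", "c"}. Hypothetical helper.
std::vector<std::string> split_levels(const std::string& s) {
    std::vector<std::string> out;
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = s.find('/', start);
        if (pos == std::string::npos) {
            out.push_back(s.substr(start));
            return out;
        }
        out.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// True if every topic matched by `requested` is also matched by `rule`,
// i.e. the requested subscription stays inside the rule's filter.
bool rule_covers(const std::string& rule, const std::string& requested) {
    std::vector<std::string> r = split_levels(rule);
    std::vector<std::string> q = split_levels(requested);
    for (std::size_t i = 0; i < r.size(); ++i) {
        if (r[i] == "#") return true;     // rule covers everything from here on
        if (i >= q.size()) return false;  // request is shorter than the rule
        if (q[i] == "#") return false;    // request is broader than the rule
        if (r[i] == "+") continue;        // '+' covers a literal or another '+'
        if (r[i] != q[i]) return false;   // literal must match (rejects '+' too)
    }
    return r.size() == q.size();
}
```

With the rule `company.com/#`, the four subscriptions listed above are all covered, while a subscription to `#` or to `other.com/a` is not.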
Yes exactly
And for the default rule, I would say set it explicitly for read and write, not for both at the same time. But that is a minor thing.
Although
default deny all
allow read company.com/#
allow write company.com/#
is pretty clear.
I'm still a little unsure.
I think that `default` is the same as `#`.
default deny all
allow read company.com/#
allow write company.com/#
So the rule above is the same as
deny read #
deny write #
allow read company.com/#
allow write company.com/#
Is this right?
Yes, exactly. But default is only set once, and it applies to #.
default deny
allow read company.com/#
allow write company.com/#
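A rough sketch of how such a rule set could be evaluated for a concrete topic is shown below. The names `acl`, `rule` and `check` are hypothetical, and "first matching entry wins, otherwise the default applies" is an assumption of this sketch, since the thread does not fix a precedence order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not mqtt_cpp's API.
enum class access { allow, deny };
enum class op { read, write };

struct rule {
    access result;
    op target;
    std::string filter;  // may contain '+' and '#'
};

struct acl {
    access default_result;  // set once, acts like a rule for '#'
    std::vector<rule> rules;
};

std::vector<std::string> split_levels(const std::string& s) {
    std::vector<std::string> out;
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = s.find('/', start);
        if (pos == std::string::npos) {
            out.push_back(s.substr(start));
            return out;
        }
        out.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// True if the MQTT topic filter matches the topic name.
bool filter_matches(const std::string& filter, const std::string& topic) {
    std::vector<std::string> f = split_levels(filter);
    std::vector<std::string> t = split_levels(topic);
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] == "#") return true;
        if (i >= t.size()) return false;
        if (f[i] != "+" && f[i] != t[i]) return false;
    }
    return f.size() == t.size();
}

// Assumption: the first matching entry wins; the default is the fallback.
access check(const acl& table, op target, const std::string& topic) {
    for (const rule& r : table.rules) {
        if (r.target == target && filter_matches(r.filter, topic)) return r.result;
    }
    return table.default_result;
}

// default deny / allow read company.com/# / allow write company.com/#
acl example_acl() {
    acl table;
    table.default_result = access::deny;
    table.rules.push_back(rule{access::allow, op::read, "company.com/#"});
    table.rules.push_back(rule{access::allow, op::write, "company.com/#"});
    return table;
}
```

Under these assumptions, any topic below `company.com/` is allowed for both operations, and everything else falls through to the default deny.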
We (at least I) intentionally omitted the username in the rule, but I assume that the rule is for one user.
My model is as follows:
Is this the same as yours?
Or is `default` applied more globally?
Per username is flexible, but maybe a bit too flexible? Probably all users will get the same default. But it could be useful to configure it per user.
If the global default is deny all:
default deny all
user root
allow read #
allow write #
Is your model as follows?
I had an idea how to add topic authorization to the broker. I would like to propose this idea. To start with, you need a database of accounts with possible topic filters as follows:
USER1: Password1 topic: example/+/test, rights: publish
USER2: Password2 topic: example/+/test, rights: subscribe
USER3: Password3 topic: example/+/test, rights: publish + subscribe
So the account database holds a list of users with passwords, plus topic filters with rights stating whether they are allowed to publish or subscribe to a topic.
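As a minimal sketch of such an account database, assuming an in-memory map keyed by username: the types `grant`, `account` and `account_db` and the function `authenticate` are hypothetical names for this illustration. Passwords are compared in plain text only to keep the example short; a real broker would presumably store hashes or delegate to an external authenticator.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not mqtt_cpp's API.
struct grant {
    std::string filter;  // topic filter, e.g. "example/+/test"
    bool publish;
    bool subscribe;
};

struct account {
    std::string password;  // plain text only for the sake of the example
    std::vector<grant> grants;
};

using account_db = std::map<std::string, account>;

// The three example accounts from the list above.
account_db example_accounts() {
    account_db db;
    db["USER1"] = account{"Password1", {grant{"example/+/test", true, false}}};
    db["USER2"] = account{"Password2", {grant{"example/+/test", false, true}}};
    db["USER3"] = account{"Password3", {grant{"example/+/test", true, true}}};
    return db;
}

// Returns the grants of the user, or nullptr if the username is unknown
// or the password does not match.
const std::vector<grant>* authenticate(const account_db& db,
                                       const std::string& user,
                                       const std::string& password) {
    account_db::const_iterator it = db.find(user);
    if (it == db.end() || it->second.password != password) return nullptr;
    return &it->second.grants;
}
```

The returned grants are what would be handed to the broker together with the connect request.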
Now, in the broker, when a CONNECT arrives: https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/broker/broker.hpp#L405-L450
Rather than calling 'connect_handler' directly: https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/broker/broker.hpp#L439-L448
You pass the connect request to some authorization class:
This will look up the username/password in the database (possibly a JSON file or some external authenticator) and finally forward the request to the connect handler within the broker, together with the rights:
The user rights are stored in a subscription map, so that we know which rights apply for each session (the first set is the list of sessions that are allowed to publish, the second is the list of sessions that are allowed to subscribe): using sub_rights_map = multiple_subscription_map<buffer, std::pair<std::set<session_state_ref>, std::set<session_state_ref>>>;
Now, when a message is published in
Look up the topic_name in sub_rights_map. The publisher should have the right Publish. The sessions that receive the message should have the right Subscribe.
The publisher should appear in at least one of the filters that is allowed to publish to the topic.
You can look up the set of subscribers by looking up the complete set of sessions that are allowed to subscribe to this topic (std::set<session_state_ref>), and then calculating the intersection with the sessions that are actually subscribed to this topic.
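The intersection step can be sketched with `std::set_intersection`. In this illustration `session_id` is a hypothetical stand-in for the broker's `session_state_ref`, and the two input sets are assumed to have already been looked up for the published topic.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Hypothetical stand-in for session_state_ref; a client id is enough here.
using session_id = std::string;

// The publisher must be in the set of sessions allowed to publish to the topic.
bool may_publish(const std::set<session_id>& allowed_to_publish,
                 const session_id& publisher) {
    return allowed_to_publish.count(publisher) != 0;
}

// Sessions that receive the message: allowed to subscribe AND actually subscribed.
std::set<session_id> recipients(const std::set<session_id>& allowed_to_subscribe,
                                const std::set<session_id>& subscribed) {
    std::set<session_id> out;
    std::set_intersection(allowed_to_subscribe.begin(), allowed_to_subscribe.end(),
                          subscribed.begin(), subscribed.end(),
                          std::inserter(out, out.begin()));
    return out;
}
```

Both inputs are `std::set`, so they are already sorted as `std::set_intersection` requires. With the example accounts, if USER2 and USER3 are allowed to subscribe but only USER1 and USER3 are subscribed, the message would go to USER3 only.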