Closed kleunen closed 2 years ago
Yes, I think that is quite clear. The ACL rules are clear.
I have another question about `#` in the authorization rule. Here it has a different meaning from the MQTT wildcard. I guess that the wildcard `+` is not allowed in the authorization rule. I think that it is not so meaningful. Is that right?
If it is, `#` could be misleading. I think that `*` is better. It is just a syntax issue.
What do you think?
No, it is possible to check a subscription with wildcards against an auth rule with wildcards. The subscription has to be at least as specific as the auth rule. I will explain later, because this needs some examples.
Given an auth rule and a subscription, you can check whether the subscription is covered by the auth rule. For each token in the auth rule and the subscription, you can check:
auth | subscription | authorized |
---|---|---|
literal | literal | yes |
literal | + or # | no |
+ | literal or + | yes |
+ | # | no |
# | literal or + or # | yes |
So, for example:
- auth: example/a, sub: example/a, authorized: yes
- auth: example/a, sub: example/b, authorized: no
- auth: example/+/a, sub: example/a/a, authorized: yes
- auth: example/+/a, sub: example/+/a, authorized: yes
- auth: example/+/a, sub: example/#, authorized: no
- auth: example/#, sub: example/a, authorized: yes
- auth: example/#, sub: example/+, authorized: yes
- auth: example/#, sub: example/#, authorized: yes
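The token table above can be sketched as a small C++ function. This is only an illustration under stated assumptions: `split_levels` and `subscription_allowed` are hypothetical names, not part of mqtt_cpp. One case not covered by the table is assumed here: an auth rule ending in `#` also covers a subscription that stops one level earlier (e.g. `example/#` covers `example`), following the usual MQTT meaning of `#`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a topic filter into its '/'-separated levels (hypothetical helper).
std::vector<std::string> split_levels(const std::string& s) {
    std::vector<std::string> levels;
    std::size_t start = 0;
    while (true) {
        std::size_t pos = s.find('/', start);
        if (pos == std::string::npos) {
            levels.push_back(s.substr(start));
            return levels;
        }
        levels.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// True if a subscription to `sub` is covered by the auth rule filter `auth`,
// checked level by level as in the table.
bool subscription_allowed(const std::string& auth, const std::string& sub) {
    const auto a = split_levels(auth);
    const auto s = split_levels(sub);
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == "#") return true;     // '#' covers literal, '+', '#'
        if (i >= s.size()) return false;  // subscription ends before the rule does
        if (s[i] == "#") return false;    // sub '#' is broader than literal or '+'
        if (a[i] == "+") continue;        // '+' covers literal or '+'
        if (a[i] != s[i]) return false;   // literal must match exactly
    }
    return a.size() == s.size();
}

// Replays the examples from the thread.
void check_examples() {
    assert(subscription_allowed("example/a", "example/a"));
    assert(!subscription_allowed("example/a", "example/b"));
    assert(subscription_allowed("example/+/a", "example/a/a"));
    assert(subscription_allowed("example/+/a", "example/+/a"));
    assert(!subscription_allowed("example/+/a", "example/#"));
    assert(subscription_allowed("example/#", "example/a"));
    assert(subscription_allowed("example/#", "example/+"));
    assert(subscription_allowed("example/#", "example/#"));
}
```

The check runs once per subscribe request, so the linear walk over the levels should be cheap compared with the rest of the SUBSCRIBE handling.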
Thank you for explaining. I understand. It seems to be a good rule. Let's continue the discussion based on this rule.
I wrote the authorization model at https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-846553347 and noticed that it is Username based. Authorization is a relationship between topics (including Topic Filters) and Usernames.
I think that users want to make a rule such as "any user can publish to the topic `a/b`, but Username `frequent_publisher` should be denied".
What is a good way to express it?
default allow all
username frequent_publisher
deny write a/b
I think that the following situation is practical but difficult to describe with our rules.
topic | user | subscribe |
---|---|---|
companyA.com/release/# | non_list_users | deny |
companyA.com/release/# | u1 | allow |
companyA.com/release/# | u2 | allow |
companyA.com/release/# | u3 | allow |
companyA.com/trial/# | non_list_users | allow |
companyA.com/trial/# | u7 | deny |
companyA.com/trial/# | u8 | deny |
CompanyA provides a trial topic and a release topic. The trial topic is for trials. The release topic is for production. The trial topic is widely allowed for easy trial, but some badly behaving users (e.g. publishing too much) are denied (negative listed). The production topic requires individual permission for security reasons.
I think that our rules should be able to express this, but I think that it is not possible at the moment. (There are many users.)
Another similar case: CompanyA has a different default policy from CompanyB.
topic | user | subscribe |
---|---|---|
companyA.com/# | non_list_users | deny |
companyA.com/# | u1 | allow |
companyA.com/# | u2 | allow |
companyA.com/# | u3 | allow |
companyB.com/# | non_list_users | allow |
companyB.com/# | u7 | deny |
companyB.com/# | u8 | deny |
Maybe have groups of users and configure a default for a group of users?
CompanyA would have a different group of users and settings than CompanyB.
I got some ideas.
As in the `companyA` and `companyB` example, organizations could have different default policies. An organization is usually structured, so `companyA/division1` and `companyA/division2` could have different policies. I think that recursive rules should be supported.
Rule structuring idea:
#
allow (means global default)
companyA/#
deny (means companyA default)
companyA/division1/#
allow (means companyA/division1 default)
companyA/division1/t1
deny
companyA/division1/t2
deny
companyA/division2/#
deny (means companyA/division2 default)
companyA/division2/t1
allow
companyA/division2/t2
allow
companyB/#
allow (means companyB default)
companyB/division1/t1
deny (omit companyB/division1/# means inherit one layer upper default)
Result example:
topic\user | u1 | u2 | u3 | u4 |
---|---|---|---|---|
companyA/division1/t1 | d | d | a | a |
companyA/division1/t2 | a | d | d | a |
companyA/division1/t3 | a | a | a | a |
companyA/division2/t1 | a | a | d | d |
companyA/division2/t2 | d | a | a | d |
companyA/division2/t3 | d | d | d | d |
companyA/division3/t1 | d | d | d | d |
companyB/division1/t1 | a | d | d | a |
companyB/division1/t2 | a | a | a | a |
companyB/division2/t1 | a | a | a | a |
companyC/division1/t1 | a | a | a | a |

`a` means allow, `d` means deny.
Note: u4, t3, division3, and companyC are not in the rule tree. They are written to check default rule.
I use a topic that ends with `#` as a default rule. It can't have users.
I think that it is simple and flexible.
But users might be expanded as follows if needed:
#
allow (means global default)
companyA/#
deny (means companyA default)
companyA/division1/#
allow (means companyA/division1 default)
companyA/division1/t1
deny
companyA/division1/t2
deny
companyA/division2/#
deny (means companyA/division2 default)
companyA/division2/t1
allow
companyA/division2/t2
allow
companyB/#
allow (means companyB default)
companyB/division1/t1
deny (omit companyB/division1/# means inherit one layer upper default)
`users: *` means all users. If this is omitted, the rule assumes `users: *`.
The rules are parsed top to bottom. If a conflicting rule entry appears, the new one overwrites the old one. Outputting a warning message might be helpful.
Rule writers can write the rules like a branching tree.
It is possible, but it sounds a bit complicated to configure and also to use. But I guess that in practice the configuration will not be very complicated; only a few rules will be configured.
With the subscription map you should be able to find all rules which apply when you publish a specific topic:
When you have rules: companyA/division2/# companyA/division2/t1
and I publish companyA/division2/t1, the subscription map will match 'companyA/division2/t1' and 'companyA/division2/#'. And you need to know which has priority over which. I would say the more specific rule has priority over the less specific rule.
Yes, the more specific one should have higher priority.
I forgot to add read/write to my list. I mean the rule is for read.
I just updated it. The combinations of topic and user are not changed.
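A minimal sketch of "the more specific rule has priority", assuming a flat list of rules and ignoring users entirely. All names here (`rule`, `decide`, `more_specific`, and so on) are hypothetical and not part of mqtt_cpp; a real broker would more likely look rules up through its subscription map than scan a vector. Specificity is assumed to be decided at the first level where two matching filters differ: a literal beats `+`, which beats `#`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class permission { allow, deny };

struct rule {
    std::string filter;  // topic filter, may contain '+' and '#'
    permission perm;
};

// Split a topic or filter into '/'-separated levels (hypothetical helper).
std::vector<std::string> split_levels(const std::string& s) {
    std::vector<std::string> levels;
    std::size_t start = 0;
    for (std::size_t pos; (pos = s.find('/', start)) != std::string::npos; start = pos + 1)
        levels.push_back(s.substr(start, pos - start));
    levels.push_back(s.substr(start));
    return levels;
}

// Ordinary MQTT matching of a concrete topic name against a filter.
bool filter_matches(const std::vector<std::string>& f, const std::vector<std::string>& t) {
    for (std::size_t i = 0; i < f.size(); ++i) {
        if (f[i] == "#") return true;
        if (i >= t.size()) return false;
        if (f[i] != "+" && f[i] != t[i]) return false;
    }
    return f.size() == t.size();
}

// Rank of one filter level: literal (2) > '+' (1) > '#' (0).
int level_rank(const std::string& level) {
    if (level == "#") return 0;
    if (level == "+") return 1;
    return 2;
}

// True if filter a is more specific than filter b.
// Both are assumed to match the same topic.
bool more_specific(const std::vector<std::string>& a, const std::vector<std::string>& b) {
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        const int ra = level_rank(a[i]);
        const int rb = level_rank(b[i]);
        if (ra != rb) return ra > rb;
    }
    return a.size() < b.size();  // e.g. "a" is more specific than "a/#"
}

// Permission of the most specific matching rule, or `fallback` if none matches.
permission decide(const std::vector<rule>& rules, const std::string& topic, permission fallback) {
    const auto t = split_levels(topic);
    bool found = false;
    std::vector<std::string> best;
    permission result = fallback;
    for (const auto& r : rules) {
        const auto f = split_levels(r.filter);
        if (!filter_matches(f, t)) continue;
        if (!found || more_specific(f, best)) {
            best = f;
            result = r.perm;
            found = true;
        }
    }
    return result;
}
```

With the rules `#` allow, `companyA/#` deny, `companyA/division2/#` deny, and `companyA/division2/t1` allow, `decide` returns allow for `companyA/division2/t1`, deny for `companyA/division2/t3`, and allow for `companyC/division1/t1`.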
#
allow (means global default)
companyA/#
deny (means companyA default)
companyA/division1/#
allow (means companyA/division1 default)
companyA/division1/t1
deny
companyA/division1/t2
deny
companyA/division2/#
deny (means companyA/division2 default)
companyA/division2/t1
allow
companyA/division2/t2
allow
companyB/#
allow (means companyB default)
companyB/division1/t1
deny (omit companyB/division1/# means inherit one layer upper default)
#
allow (means global default)
companyA/division2/# companyA/division2/t1
Back to your case, `u1` can't subscribe `companyA/division2/#` but can subscribe `companyA/division2/t1`. So only `companyA/division2/t1` is added to the subscription map. When `u9` publishes `companyA/division2/t1`, then `companyA/division2/t1` is matched and the message is delivered to `u1`.
Another case: `u1` can subscribe `companyA/division1/#` but can't subscribe `companyA/division1/t1`.
Let's say `u1` has subscribed `companyA/division1/#` now.
What happens when `u9` publishes `companyA/division1/t1`?
We can define two meanings.
1. The `read` rule is only for subscription. It is not related to delivery. In other words, authorization and wildcard matching are independent concepts. In this case, the published message is delivered to `u1`. No delivery-time checking is required.
2. The `read` rule is not only for subscription but also for delivery. In this case, the published message is NOT delivered to `u1`.

I think that 2 might be difficult to implement or might need costly checking logic (I'm not sure). In that case, 1 is a slightly surprising behavior but acceptable (we need to document that authorization and wildcard matching are independent concepts).
If 2 can be implemented at a practical cost, 2 is better.
> It is possible but it sounds a bit complicated to configure and also to use. But I guess in practice the configuration will not be very complicated, in practice only a few rules will be configured.

I forgot to answer the comment above. Ordinary users use only a small part of this rule, as follows:
#
deny
#
deny
I think that it is simple enough.
But it is worth having rules capable of expressing complicated cases. I'd like to find rules that are both easy to write for ordinary users and highly capable. I believe that my idea achieves both.
I noticed that meaning 1 is bad. Only meaning 2 is acceptable.
If `u1` subscribes `#`, then all messages are delivered to `u1`. That is bad.
Sorry for the mix up.
Authentication
#
deny read write
?
Hmm. In my experience developing IoT services, it is rare for one user to require both read and write permission on a single topic.
Let's say there are a sensor, an actuator, and a controller. The sensor reports some status. The controller sends a request to the actuator based on the reported sensor status. It is a typical scenario.
Usernames are sensor1, actuator1, and controller1.
There are topics as follows.
iot_app/sensors/sensor1_status
iot_app/sensors/actuator1_request
`sensor1` can publish `iot_app/sensors/sensor1_status`.
`controller1` can subscribe `iot_app/sensors/sensor1_status`.
`controller1` can publish `iot_app/sensors/actuator1_request`.
`actuator1` can subscribe `iot_app/sensors/actuator1_request`.
The minimal authorization rule is as follows:
I wrote Authentication in the comment above. It should be Authorization. I sometimes got confused.
#
deny
iot_app/sensors/sensor1_status
allow
controller1
iot_app/sensors/actuator1_request
allow
actuator1
#
deny
iot_app/sensors/sensor1_status
allow
sensor1
iot_app/sensors/actuator1_request
allow
controller1
Note
- `#` deny
is the short form of
- `#` deny
- users: *
No user is allowed both read and write on the same topic.
However, https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-850870655 's advantage is compact notation.
#
deny
iot_app/sensors/sensor1_status
allow
controller1
sensor1
iot_app/sensors/actuator1_request
allow
actuator1
controller1
Note
- `#` deny
is the short form of
- `#` deny
- read *
- write *
If `read` and/or `write` is omitted, then regard it as `*` (all users).
For example,
#
deny // deny all users read and write
iot_app/sensors/sensor1_status
allow // allow read controller1 and controller2, allow write all users.
controller1, controller2
iot_app/sensors/actuator1_request
allow // allow all users read and write
What do you think of the last version (refined notation based on your comment (read/write mixed))?
Yes, this looks good.
But I would make the default: If read and/or write is omitted, then regard it as nobody (no users). So make it explicit you want to allow everybody, from security perspective.
And it would be useful to combine users into groups
so you can say: group controller_group controller1 controller1 allow controller_group
> Yes, this looks good.
> But I would make the default: If read and/or write is omitted, then regard it as nobody (no users). So make it explicit you want to allow everybody, from security perspective.

Ok, I agree. Let me clarify.
Let's say `*` is all users and `-` is no users.
#
deny
is the same as
#
deny
*
*
Because deny with omitted read/write should be for all users.
#
allow
is the same as
#
allow
-
-
Because allow with omitted read/write should be for no users.
Observation: the default meaning is inconsistent.
#
deny
read and write are mandatory. If omitted, then a format error is reported.
I think that the approach that disallows omitting is simpler. What do you think?
Maybe not simpler, because the user always has to add read and write. But at least it is clear who is allowed. Maybe the error message for a missing read or write could explain how to allow all, allow nobody, or allow a username.
Indeed.
What do you think about the allow-omitting rule? `deny` with omitted users means all users; `allow` with omitted users means no users. Do you agree with this rule?
> And it would be useful to combine users into groups
> so you can say: group controller_group controller1 controller1 allow controller_group

I agree with introducing the user group concept.
One user can be a member of multiple groups. I considered the following cases.
user_group
group1
users: u1, u2
group2
users: u2, u3
Authorization
#
deny
read *
write *
topic1
allow
read group1
write group1
trial/#
allow
read *
write *
trial/topic2
deny
read group1
write group1
messy/#
allow
read group1
write group2
messy/topic3
deny
read group2
write group1
And the permission should be as follows:
u1 can read/write `topic1`.
u2 can read/write `topic1`.
u3 can't read/write `topic1`.
u1 can't read/write `trial/topic2`.
u2 can't read/write `trial/topic2`.
u3 can read/write `trial/topic2`.
u1 can read `messy/topic3`.
u2 can't read `messy/topic3`.
u3 can't read `messy/topic3`.
u1 can't write `messy/topic3`.
u2 can't write `messy/topic3`.
u3 can write `messy/topic3`.
Do you think so?
Yes, I agree with the allow-omitting rule. If the default is nobody for allow and everybody for deny, that is OK too. It is also a good approach.
You can block an offending topic easily: just say deny and nobody can read or write. That can be convenient.
Maybe allow omitting for deny: everybody is denied.
But give an error if it is omitted on allow: please specify who is allowed, either all, a specific user, or a group of users.
> And it would be useful to combine users into groups so you can say: group controller_group controller1 controller1 allow controller_group
>
> I agree to introducing user group concept.
> One user can be a member of multiple groups. I considered the following cases.
> user_group
> group1
> users: u1, u2
> group2
> users: u2, u3
> Authorization
> #
> deny
> read *
> write *
> topic1
> allow
> read group1
> write group1
> trial/#
> allow
> read *
> write *
> trial/topic2
> deny
> read group1
> write group1
> messy/#
> allow
> read group1
> write group2
> messy/topic3
> deny
> read group2
> write group1
> And the permission should be as follows:
> u1 can read/write `topic1`. u2 can read/write `topic1`. u3 can't read/write `topic1`.
> u1 can't read/write `trial/topic2`. u2 can't read/write `trial/topic2`. u3 can read/write `trial/topic2`.
> u1 can read `messy/topic3`. u2 can't read `messy/topic3`. u3 can't read `messy/topic3`. u1 can't write `messy/topic3`. u2 can't write `messy/topic3`. u3 can write `messy/topic3`.
> Do you think so?

Yes. But maybe group name always prefixed with @?
@group1 @group2 u1 u2
But if group is allowed and user is denied? User rule gets priority over group rule?
> Yes. But maybe group name always prefixed with @? @group1 @group2 u1 u2

A prefix is a good idea. The MQTT spec allows any UTF-8 string for the Username. https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901071
So a Username might start with `@`.
Maybe we need a requirement: an mqtt_cpp broker client must not use a Username that starts with `@`.
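As a rough sketch of how entries in a read or write list might be resolved with the notation discussed so far (`*` for all users, `-` for no users, an `@` prefix for groups, anything else a plain Username). The names `group_map`, `entry_matches`, and `list_matches` are made up for illustration, and storing group names together with their `@` prefix is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Group name (including the '@' prefix) -> member Usernames.
using group_map = std::map<std::string, std::set<std::string>>;

// True if a single rule entry applies to `username`.
bool entry_matches(const std::string& entry,
                   const std::string& username,
                   const group_map& groups) {
    if (entry == "*") return true;   // all users
    if (entry == "-") return false;  // no users
    if (!entry.empty() && entry[0] == '@') {
        // Group entry: an unknown group matches nobody.
        const auto it = groups.find(entry);
        return it != groups.end() && it->second.count(username) != 0;
    }
    return entry == username;        // plain Username
}

// True if any entry of a read or write list applies to `username`.
bool list_matches(const std::vector<std::string>& entries,
                  const std::string& username,
                  const group_map& groups) {
    return std::any_of(entries.begin(), entries.end(),
                       [&](const std::string& e) {
                           return entry_matches(e, username, groups);
                       });
}
```

Because a Username starting with `@` would be looked up as a group here, such a Username could never match as a user, which is one reason the restriction above seems necessary.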
> But if group is allowed and user is denied? User rule gets priority over group rule?

I think that it is a similar situation: `u2` is a member of `group1` and `group2`, and `group1` is allowed while `group2` is denied.
But it couldn't happen on the same topic.
#
deny
sub1/#
allow // read group1 write no users (because of omitting)
sub1/topic1
deny
In this case u1 can subscribe `sub1/#` and `sub1/any_topics_except_topic1`, but can't subscribe `sub1/topic1`.
When u1 has subscribed `sub1/#` and a publish to `sub1/topic1` happens, the message is NOT delivered to u1, as we discussed.
No additional rule or conflict is introduced by the group concept.
Another case:
#
deny
topic1
deny
topic1
allow
I think that it should be a format error. The same topic entry should appear only once. If it appears twice or more, it should be an error.
#
deny
topic1
deny
read @Group1
topic1
allow
read @Group2
And user u1 is part of Group1 and Group2 ?
I think user should only be part of 1 group.
This is format error.
I assume that the parsing process is top to bottom.
#
deny
topic1
deny // maybe output a warning: an explicit deny under deny-all-users is redundant
topic1
allow // error: topic1 has already appeared. It should appear only once.
I added comments.
What about:
#
deny
/sub/topic1
deny
/sub/#
allow
more specific topic should follow broader topics ? so should be ?
#
deny
/sub/#
allow
/sub/topic1
deny
What about:
#
deny
/sub/topic1
deny
read @Group1
/sub/#
allow
read @Group2
?
It is the same as follows:
#
deny
*
*
/sub/topic1
deny
/sub/#
allow
#
deny
*
*
/sub/topic1
deny // redundant because # is denied. Let's say rule1
/sub/#
allow // Two possible options, see below. Let's say rule2
Two possible options.
After sorted:
#
deny
*
*
/sub/#
allow
/sub/topic1
deny
username | subscribe sub/# | subscribe sub/topic1 | deliver (sub/topic1) |
---|---|---|---|
u1 | no | no | no |
u2 | yes | no | no |
u3 | no | yes | yes |
> more specific topic should follow broader topics ? so should be ?

"more specific" is the same meaning as "sort wide to narrow" I commented.
I recommend writing the rules in this order. But I think that they can be sorted by the broker. If so, sorting is the kinder implementation, possibly with a warning message if the user's rules are not sorted. If it is difficult to implement, outputting an error message and stopping the broker due to an invalid rule format is an acceptable option.
Yes, a warning should be generated if the rules are given in a different order.
Maybe I edited the comment https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-850982726 after you read. Please check it again :)
I think it is ok like this.
Thank you! I think that the semantics are fixed.
The next step is syntax. I wrote JSON example:
{
  "authentication": [
    {
      "name": "u1",
      "method": "password",
      "password": "mypassword"
    },
    {
      "name": "u2",
      "method": "client_cert"
    }
  ],
  "group": [
    {
      "@g1": ["u1", "u2"]
    }
  ],
  "authorization": [
    {
      "topic": "#",
      "type": "deny"
    },
    {
      "topic": "sub/#",
      "type": "allow",
      "sub": ["@g1"]
    },
    {
      "topic": "sub/topic1",
      "type": "deny",
      "sub": ["u1"]
    }
  ]
}
In the `#` entry, `"pub": ["*"]` and `"sub": ["*"]` can be omitted.
I chose the words "sub/pub" instead of "read/write" because they are MQTT words.
What do you think?
I think text based rules with tabs is better.
But you can support both
topic /topic1 allow
read: u1
> I think text based rules with tabs is better.

JSON is a little bit redundant, so a simpler notation is nice. By the way, does "tabs" mean indent? I personally don't like the TAB character. Indent is good.
If we support the JSON and ini file formats, you can use Boost.PropertyTree. If you want to support another (original) text format, then you need to use Boost.Spirit (or X3). X3 is more sophisticated but experimental (it actually works).
Or do you know any good library to parse indented text?
Yes, indent is spaces or tabs.
I do not know any parser. Maybe there is a YAML parser based on Spirit? https://en.m.wikipedia.org/wiki/YAML
Ok, I think that Boost.Spirit.X3 is good one to write parser.
https://www.boost.org/doc/libs/1_76_0/libs/spirit/doc/x3/html/index.html
https://github.com/msgpack/msgpack/blob/master/spec.md https://github.com/msgpack/msgpack-c/blob/cpp_master/include/msgpack/v2/x3_parse.hpp
I am thinking of writing PoC code to parse indented text. It would output a C++ data structure.
You can also parse indented text to property_tree; that way you only have to handle property_tree when reading the configuration, and you are also able to input the ini and JSON formats.
Which do you mean, Pattern A or Pattern B?
If you mean B, I think that C is better.
If you mean A, we need to check whether property_tree has sufficient access methods.
`mqtt_cpp_some_data_structure` might be a multi_index container. It can provide flexible access.
I guess that the data structure needs flexible access methods if we implement runtime updates in the future.
Pattern A:

    ini -----------------------+
                               |
                               V
    json-----------------> property_tree ---> broker
                               A
                               |
    indented text -------> spirit x3

Pattern B:

    ini -----------------------+
                               |
                               V
    json-----------------> property_tree ---> mqtt_cpp_some_data_structure ---> broker
                               A
                               |
    indented text -------> spirit x3

Pattern C:

    ini -----------------------+
                               |
                               V
    json-----------------> property_tree ---> mqtt_cpp_some_data_structure ---> broker
                                                       A
                                                       |
    indented text -------> spirit x3 -------------------+

Pattern B.
mqtt_cpp_some_data_structure will be a subscription_map probably, and some combination of datastructures, possibly. You do want to optimize the checking of rules when user logs in.
std::map<username, userinfo>
std::map<groupname, std::set
something like this.
I think that spirit x3's semantic action adds the parse result to property_tree (Pattern B) or to mqtt_cpp_some_data_structure (Pattern C) repeatedly. I think that Pattern C is the simpler and more straightforward approach. I'm not sure, but I guess that property_tree is designed as a parser and element accessor. In Pattern B, property_tree is used as a container, with elements inserted into it in the semantic action. That is a little weird to me.
Just have a look what is easiest. Maybe first define the internal datastructures for fast authentication and rule matching.
Ok. By the way, I guess that sorting by wide to narrow will be implemented in mqtt_cpp_some_data_structure. At least property_tree doesn't have such functionality.
Maybe first define the internal datastructures for fast authentication and rule matching.
Yes, I think that it is a good way.
Have you made any progress on the authorization ? or not working on mqtt_cpp ?
I'm working on my company's broker and SDKs. So unfortunately, I don't have much time for mqtt_cpp. I think that the spec of authentication and authorization is almost fixed. At least we have the agreement for the controversy part. So I think that you can start implementing them. PR is welcome :)
I was just a bit worried, since you haven't committed anything since May. I was thinking: I hope nothing bad happened to you. But luckily you are just busy.
Yes, maybe after my holiday I may pick up some work again on mqtt_cpp. I have been running the broker for quite a while now, and it is completely stable, although it is only used sometimes.
Sorry for making you worry.
This is one of my recent activities on GitHub. The logic is from mqtt_cpp. (The PR itself was created some time ago.) https://github.com/mqttjs/MQTT.js/pull/1243#issuecomment-865393719 This is related work for my company's (extended) MQTT SDKs.
I had an idea how to add topic authorization to the broker. I would like to propose this idea. To start with, you need a database of accounts with possible topic filters as follows:
USER1: Password1 topic: example/+/test, rights: publish
USER2: Password2 topic: example/+/test, rights: subscribe
USER3: Password3 topic: example/+/test, rights: publish + subscribe
So accounts get a list of users with passwords, and topic filters with rights if they are allowed to publish/subscribe to a topic.
Now, in the broker, when a connect enters: https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/broker/broker.hpp#L405-L450
Rather than calling 'connect_handler' directly: https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/broker/broker.hpp#L439-L448
You pass the connect request to some authorization class:
This will look up the username/password in the database (possibly a JSON file or some external authenticator), and finally forward the request to the connect handler within the broker with the rights:
The user rights are stored in a subscription map, such that we know for each session which rights apply (first set is list of sessions that are allowed to publish, second is list of sessions which are allowed to subscribe): using sub_rights_map = multiple_subscription_map<buffer, std::pair< std::set, std::set > >;
Now, when a message is published in
Look up the topic_name in sub_rights_map.
- The publisher should have rights: Publish
- The sessions that receive the messages should have rights: Subscribe
The publisher should be somewhere in any of the filters which is allowed to publish to the topic.
You can lookup the set of subscribers by looking up the complete set of sessions which are allowed to subscribe to this topic std::set < session_state_ref> >, and then calculating the intersection with sessions which are actually subscribed to this topic.
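The intersection step could look roughly like the following. This is a sketch only: `session_id` is a hypothetical stand-in for the broker's session reference type, and `may_publish` and `deliverable_sessions` are invented names; how the two input sets are obtained from the rights map and the subscription map is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Hypothetical stand-in for a session reference in the broker.
using session_id = std::string;

// The publisher must be in the set of sessions allowed to publish to the topic.
bool may_publish(const std::set<session_id>& allowed_to_publish,
                 const session_id& publisher) {
    return allowed_to_publish.count(publisher) != 0;
}

// Sessions that should receive the message: those that are both allowed to
// subscribe to the topic and actually subscribed to it.
std::set<session_id> deliverable_sessions(const std::set<session_id>& allowed_to_subscribe,
                                          const std::set<session_id>& subscribed) {
    std::set<session_id> result;
    // std::set iterates in sorted order, which std::set_intersection requires.
    std::set_intersection(allowed_to_subscribe.begin(), allowed_to_subscribe.end(),
                          subscribed.begin(), subscribed.end(),
                          std::inserter(result, result.begin()));
    return result;
}
```

For example, with allowed subscribers {USER2, USER3} and actual subscribers {USER1, USER3}, only USER3 would receive the message.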