redboltz / mqtt_cpp

Boost Software License 1.0

Broker authorization proposal #779

Closed kleunen closed 2 years ago

kleunen commented 3 years ago

I had an idea how to add topic authorization to the broker. I would like to propose this idea. To start with, you need a database of accounts with possible topic filters as follows:

USER1: Password1 topic: example/+/test, rights: publish

USER2: Password2 topic: example/+/test, rights: subscribe

USER3: Password3 topic: example/+/test, rights: publish + subscribe

So accounts get a list of users with passwords, and topic filters with rights if they are allowed to publish/subscribe to a topic.
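The account table above could be held in memory roughly as follows. This is a minimal sketch, not mqtt_cpp code: the names `rights`, `topic_right`, `account`, `has_right`, and `example_accounts` are hypothetical, and passwords are kept as plain strings only to keep the example short.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; none of these exist in mqtt_cpp.
enum class rights : unsigned { none = 0, publish = 1, subscribe = 2, both = 3 };

// True if every bit requested in `wanted` is present in `granted`.
inline bool has_right(rights granted, rights wanted) {
    return (static_cast<unsigned>(granted) & static_cast<unsigned>(wanted))
        == static_cast<unsigned>(wanted);
}

struct topic_right {
    std::string filter;  // MQTT topic filter, e.g. "example/+/test"
    rights allowed;      // what the user may do on topics matching the filter
};

struct account {
    std::string password;             // plain text only to keep the sketch short
    std::vector<topic_right> topics;  // filters with their rights
};

// Builds the three example accounts from the proposal.
inline std::map<std::string, account> example_accounts() {
    std::map<std::string, account> db;
    db["USER1"] = account{"Password1", {topic_right{"example/+/test", rights::publish}}};
    db["USER2"] = account{"Password2", {topic_right{"example/+/test", rights::subscribe}}};
    db["USER3"] = account{"Password3", {topic_right{"example/+/test", rights::both}}};
    return db;
}
```

Storing the rights as a bit mask keeps "publish + subscribe" a single value and makes the check a single mask comparison.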

Now, in the broker, when a connect enters: https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/broker/broker.hpp#L405-L450

Rather than calling 'connect_handler' directly: https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/broker/broker.hpp#L439-L448

You pass the connect request to some authorization class:

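class AuthorizerInterface {
public:
    virtual ~AuthorizerInterface() = default;

    // Checks the credentials, then forwards the request to the broker's connect_handler.
    virtual void authorize_connect(
        con_sp_t spep,
        buffer client_id,
        optional<buffer> /*username*/,
        optional<buffer> /*password*/,
        optional<will> will,
        bool clean_start,
        std::uint16_t /*keep_alive*/,
        v5::properties props
    ) = 0;
};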

This will look up the username/password in the database (possibly a JSON file or some external authenticator) and finally forward the request, together with the user's rights, to the connect handler within the broker:

  bool connect_handler(
        con_sp_t spep,
        buffer client_id,
        optional<buffer> /*username*/,
        optional<buffer> /*password*/,
        optional<will> will,
        bool clean_start,
        std::uint16_t /*keep_alive*/,
        v5::properties props,
        std::vector<MQTT_NS::buffer> user_publish_filters,
        std::vector<MQTT_NS::buffer> user_subscribe_filters
    );

The user rights are stored in a subscription map, so that we know which rights apply to each session. The first set lists the sessions that are allowed to publish, the second the sessions that are allowed to subscribe:

using sub_rights_map = multiple_subscription_map<buffer, std::pair<std::set<session_state_ref>, std::set<session_state_ref>>>;

Now, when a message is published in

bool publish_handler(
        con_sp_t spep,
        optional<packet_id_t> packet_id,
        publish_options pubopts,
        buffer topic_name,
        buffer contents,
        v5::properties props) {

Look up the topic_name in sub_rights_map. The publisher should have the Publish right, and the sessions that receive the message should have the Subscribe right.

The publisher must appear in at least one matching filter that allows publishing to the topic.

You can find the set of subscribers by looking up the complete set of sessions that are allowed to subscribe to this topic (a std::set<session_state_ref>) and then calculating its intersection with the sessions that are actually subscribed to this topic.
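The intersection step can be sketched with the standard library. In this sketch sessions are identified by plain strings, which is an assumption; the broker would use something like a session_state reference. The function names `deliverable_sessions` and `may_publish` are hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Assumption for this sketch: a session is identified by a string.
using session_id = std::string;

// Returns the sessions that are both subscribed to the topic and allowed
// to subscribe to it.
inline std::set<session_id> deliverable_sessions(
    std::set<session_id> const& allowed_to_subscribe,
    std::set<session_id> const& actually_subscribed
) {
    std::set<session_id> result;
    // Both inputs are std::set, so they are already sorted as required.
    std::set_intersection(
        allowed_to_subscribe.begin(), allowed_to_subscribe.end(),
        actually_subscribed.begin(), actually_subscribed.end(),
        std::inserter(result, result.begin())
    );
    return result;
}

// A publish is accepted only if the publisher is in the allowed-publishers set.
inline bool may_publish(
    std::set<session_id> const& allowed_to_publish,
    session_id const& publisher
) {
    return allowed_to_publish.count(publisher) != 0;
}
```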

ineffective commented 3 years ago

First, sorry for not reading everything in this thread. Second - I read your ruminations about reg. certificates. Could you consider adding some way to extract different field than CNAME? We use UID (oid: 0.9.2342.19200300.100.1.1) to store client UUID and this is what we use for authentication - we currently use ActiveMQ with custom plugin, but we consider switching to custom broker (mqtt_cpp based of course) and if this option was there it would be really great.

kleunen commented 3 years ago

I guess that would be good to take into consideration when designing an authentication system. If i understand correctly you have a custom authentication plugin for activemq?

ineffective commented 3 years ago

Yes, we have custom plugin for activemq. For mqtt connections it simply takes remote certificate used for authentication, extracts UID and checks if client_id is set to the same value. If yes then uses data from certificate for authorization, otherwise connection is dropped. It would be nice if same thing could be done using mqtt_cpp broker. On the client side (Debian 9.6 /w openssl-1.1.0l-1~deb9u3) UID extraction by oid is done using this (maybe a little clumsy, but at least well commented IMHO) code:

std::vector<std::byte> get_tag_from_cert(std::string const& file, std::string const& tag_string) {
    // open certificate file
    FILE* fl = fopen(file.c_str(), "rb");
    if (fl != nullptr)
    {
        // if file was opened successfully, create shared pointer that will automatically close it after it goes out of scope
        boost::shared_ptr<FILE> f(fl, ::fclose);
        // generate OID (object id) for requested tag (parse tag_string); freed automatically when it goes out of scope
        boost::shared_ptr<ASN1_OBJECT> obj(OBJ_txt2obj(tag_string.c_str(), 1), ASN1_OBJECT_free);
        // create certificate object from data read from file
        boost::shared_ptr<X509> x509(PEM_read_X509(f.get(), NULL, NULL, NULL), X509_free);
        // stop early if the OID or the certificate could not be parsed
        if (!obj || !x509) {
            throw std::runtime_error("can't parse tag [" + tag_string + "] or cert file [" + file + "]");
        }
        // get subject name from the certificate. no need for separate deletion, this is only a pointer to value stored in x509 object
        auto sn = X509_get_subject_name(x509.get());
        // array for UUID read from subject that we got above
        char arr[128] = { 0 };
        // read text value of requested object (obj) in subject name (sn), store in arr (text buffer)
        auto rv = X509_NAME_get_text_by_OBJ(sn, obj.get(), arr, sizeof(arr));
        // check for any errors
        if (rv < 0 || (rv > 0 && static_cast<size_t>(rv) >= sizeof(arr))) {
            // ... and throw if there are any
            throw std::runtime_error("can't get tag [" + tag_string + "] from file [" + file + "], rv: " + std::to_string(rv));
        } else {
            // prepare return value, namely create preallocated vector
            std::vector<std::byte> ret_val(static_cast<size_t>(rv));
            // copy bytes from arr buffer to return vector (ret_val)
            std::transform(arr, arr + size_t(rv), ret_val.begin(), [](char c) { return std::byte(c); });
            // terminate with NUL byte
            ret_val.push_back(std::byte(0));
            // and return
            return ret_val;
        }
    } else {
        // ... or throw an exception if file couldn't be opened
        throw std::runtime_error("can't open cert file: " + file);
    }
    // automatic storage reclamation will take care of closing file and freeing memory used by the certificate
}

redboltz commented 3 years ago

CNAME is commonly used. I don't know what UID means. Is it a common field of a client certificate?

client_id is different concept, please read all thread. CNAME (or UID) is for Username. And TLS authentication is same effect as password. So we don't compare client_id. It is for multiple sessions.

| Authentication method | Username | Unique Session Identifier |
|---|---|---|
| MQTT Username Password | Username | Username+ClientId (Username is upper) |
| TLS Client cert | CNAME (or other signed field) as Username | Username+ClientId (Username is upper) |

This is a concept model. Based on this model, other CNAME like field can be used (optional). By the way, CNAME is gotten by NID_commonName as https://github.com/redboltz/mqtt_cpp/blob/bed6d658ad3502caebc532c01ac397553f78ef7e/example/tls_both_client_cert.cpp#L362

ineffective commented 3 years ago

CNAME is commonly used. I don't know what UID means. Is it a common field of a client certificate?

No, it is not common, but it is defined (has OID assigned) and is part of subject name. You can see it for example here: http://oid-info.com/get/0.9.2342.19200300.100.1.1 and it is also mentioned here: https://www.cryptosys.net/pki/manpki/pki_distnames.html

client_id is different concept, please read all thread.

I'm trying, but it is very long. As far as I understand client_id can be set to basically any value by the connecting side - this is what we want to prevent by checking if this value is equal to UID, because otherwise there is a collision. Note that this is custom plugin for ActiveMQ, not something we get by default. Problem that it is trying to solve is similar (if I understand it correctly) to what @kleunen wrote in one of previous posts:

Also, what is missing is to limit the pattern of the client id for a user. For example, if user 'roger' logs in, he can only select a client id starting with 'roger_'. This prevents users from having a client_id collision.

In fact, this is exactly what we avoid using this plugin.

CNAME (or UID) is for Username. And TLS authentication is same effect as password.

cname is common name. This is logically different from uid (so user id). In our use-case cname contains completely different set of data (parse-able string that contains information like hostname, device type and other - it could be done differently, but it was done this way a long time ago).

So we don't compare client_id. It is for multiple sessions.

| Authentication method | Username | Unique Session Identifier |
|---|---|---|
| MQTT Username Password | Username | Username+ClientId (Username is upper) |
| TLS Client cert | CNAME (or other signed field) as Username | Username+ClientId (Username is upper) |

I understand that in our case "other signed field" can be uid that I mentioned before. This is exactly what we want. The question is whether this field can be specified using configuration file. If yes, then quite probably NID is not the best choice. As far as I understand NID is numeric identifier which is assigned to specific OID and it is openssl specific. I'm not sure, but I believe there are no guarantees that NID won't change between library releases.

I'm slowly reading this thread and try to decide if this is something we will be able to use out of the box. Quite probably not and quite a lot of programming will be required anyway, so maybe I'm just wasting your precious time. Sorry.

redboltz commented 3 years ago

Also, what is missing is to limit the pattern of the client id for a user. For example, if user 'roger' logs in, he can only select a client id starting with 'roger_'. This prevents users from having a client_id collision.

In fact, this is exactly what we avoid using this plugin.

So we use the Username * ClientID approach. Strictly speaking, it could violate the MQTT specification, but we do it because it is a practical approach.

CNAME (or UID) is for Username. And TLS authentication is same effect as password.

cname is common name. This is logically different from uid (so user id). In our use-case cname contains completely different set of data (parse-able string that contains information like hostname, device type and other - it could be done differently, but it was done this way a long time ago).

So we don't compare client_id. It is for multiple sessions.

| Authentication method | Username | Unique Session Identifier |
|---|---|---|
| MQTT Username Password | Username | Username+ClientId (Username is upper) |
| TLS Client cert | CNAME (or other signed field) as Username | Username+ClientId (Username is upper) |

I understand that in our case "other signed field" can be uid that I mentioned before. This is exactly what we want. The question is whether this field can be specified using configuration file. If yes, then quite probably NID is not the best choice. As far as I understand NID is numeric identifier which is assigned to specific OID and it is openssl specific. I'm not sure, but I believe there are no guarantees that NID won't change between library releases.

I'm slowly reading this thread and try to decide if this is something we will be able to use out of the box. Quite probably not and quite a lot of programming will be required anyway, so maybe I'm just wasting your precious time. Sorry.

I personally think that Username/Password authentication is a good starting point, then authorization, and finally Client Certificate authentication. For now, the mqtt_cpp broker doesn't have any authentication or authorization code. I don't know much about the OpenSSL API (it is very difficult for me to use). I use an NID based field search in the example code, but replacing it with an object (OID) based approach is not an essential problem; we can implement it flexibly. If an object based selector is the best approach, then we can use it. To switch the authentication method, the broker can provide a customization point, but that requires a recompile. Maybe an authentication selector (including the cert field selector) configured via boost program options is better.

kleunen commented 3 years ago

The thread is very long, many different topics are involved, and many possibilities were discussed. To go forward, I think the different topics need to be split into separate proposals that can be discussed individually. I was involved in this discussion, but due to limited time I lost track a bit as well. Possibly I will have some time soon to work on the broker again.

I do have experience with openssl/ssl based authentication.

kleunen commented 3 years ago

I'm slowly reading this thread and try to decide if this is something we will be able to use out of the box. Quite probably not and quite a lot of programming will be required anyway, so maybe I'm just wasting your precious time. Sorry.

At the moment, no authentication or access control is implemented. People work on this project in their spare time. Authentication/access control is a big topic, and PRs in this project are only accepted if they are of very high quality and tested. So be aware that implementing this feature will most likely take significant time. The project's focus is also on building a client, not really a broker.

redboltz commented 3 years ago

Actually the thread is very long. Let me show you guideline.

Authentication

Authentication is permission for connection. It is checked on CONNECT.

| Authentication method | Username | Unique Session Identifier |
|---|---|---|
| MQTT Username Password | Username | Username+ClientId (Username is upper) |
| TLS Client cert | CNAME (or other signed field) as Username | Username+ClientId (Username is upper) |

Important note

The MQTT spec requires that the Client Identifier (ClientId) be unique on one broker. The mqtt_cpp broker intentionally violates this: on mqtt_cpp, Username + ClientId must be unique. The separator is not decided yet. For example, let's say the separator is -.

The following two connections can exist at the same time:

Username: User1, ClientId: Id1
Username: User2, ClientId: Id1

They are treated as User1-Id1 and User2-Id1 internally.

The following two connections share the same authentication. That means multiple connections with the same Username can exist if the ClientId is different:

Username: User1, ClientId: Id1
Username: User1, ClientId: Id2

How to treat the empty ClientId is not completely decided yet. https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901059
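A minimal sketch of the Username + ClientId uniqueness rule. The class `session_registry` is hypothetical and not part of mqtt_cpp. It keys sessions by a pair instead of a joined string, which is one possible way to sidestep the undecided separator: with a joined string and separator -, the combinations (a-b, c) and (a, b-c) would both become a-b-c.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Hypothetical session registry keyed by (Username, ClientId).
class session_registry {
public:
    // Returns true if a new session was registered, false if the same
    // Username + ClientId combination is already present.
    bool add(std::string const& username, std::string const& client_id) {
        return sessions_.emplace(username, client_id).second;
    }

    std::size_t size() const { return sessions_.size(); }

private:
    std::set<std::pair<std::string, std::string>> sessions_;
};
```

With this registry, User1/Id1 and User2/Id1 coexist, User1/Id1 and User1/Id2 coexist, and a second User1/Id1 is detected as the same session.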

Authorization

Authorization is a permission between Username and Topic Filter. That means User1-Id1 and User1-Id2 have the same permission. It is checked on PUBLISH, SUBSCRIBE, and deliver.

The check on delivery is important for handling wildcard Topic Filters correctly.
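The delivery check matters because a client can subscribe with a broad filter such as example/# while only being authorized for example/+/test; whether a concrete message may be delivered is only known once the topic name is compared against the authorized filters. The following is a minimal sketch of MQTT wildcard matching, assuming a well-formed filter (# only as the last level). The function names are hypothetical, and the special rule for topic names starting with $ is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits "a/b/c" into {"a", "b", "c"}; empty levels are preserved.
inline std::vector<std::string> split_levels(std::string const& s) {
    std::vector<std::string> levels;
    std::string::size_type start = 0;
    for (;;) {
        auto pos = s.find('/', start);
        if (pos == std::string::npos) {
            levels.push_back(s.substr(start));
            return levels;
        }
        levels.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// Returns true if the concrete topic name matches the topic filter.
// '+' matches exactly one level; '#' matches all remaining levels,
// including none.
inline bool topic_matches(std::string const& filter, std::string const& topic) {
    auto f = split_levels(filter);
    auto t = split_levels(topic);
    std::size_t i = 0;
    for (; i < f.size(); ++i) {
        if (f[i] == "#") return true;                    // multi-level wildcard
        if (i >= t.size()) return false;                 // filter is longer than topic
        if (f[i] != "+" && f[i] != t[i]) return false;   // literal level differs
    }
    return i == t.size();                                // no topic levels left over
}
```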

Read 10 comments from https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-850970590 to https://github.com/redboltz/mqtt_cpp/issues/779#issuecomment-850982048

kleunen commented 3 years ago

Following example loads the json config using property_tree:

https://wandbox.org/permlink/iJmuEhGC3txDK02v

The group format was changed, because with the old format, you can define multiple groups within the same object:

    "group": [{
        "name": "@g1",
        "members": ["u1", "u2"]
    }],

redboltz commented 3 years ago

Wandbox supports multiple files. The left most tab is file for int main() {}. You can add any files the right of the first file. https://wandbox.org/permlink/smj7eCPjxcYJhVBN This demonstrates data.json.

Could you update your wandbox code?

kleunen commented 3 years ago

https://wandbox.org/permlink/L51ENY8SLD2EF9rD

I think the password should not be stored plaintext, it should be salt+hash. So one should also be able to configure the salt.

redboltz commented 3 years ago

The group format was changed, because with the old format, you can define multiple groups within the same object:

  "group": [{
      "name": "@g1",
      "members": ["u1", "u2"]
  }],

What does it mean? I don't understand the difference.

The new format can define multiple groups as follows:

    "group": [
        {
            "name": "@g1",
            "members": ["u1", "u2"]
        },
        {
            "name": "@g2",
            "members": ["u2", "u3"]
        }
    ],

I'm not sure what the old format is. Could you show me the new and old example?

kleunen commented 3 years ago

New format:

    "group": [
        {
            "name": "@g1",
            "members": ["u1", "u2"]
        },
        {
            "name": "@g2",
            "members": ["u2", "u3"]
        }
    ],

Old format:


    "group": [
        {
            "@g1": ["u1", "u2"],        
            "@g2": ["u2", "u3"]
        },
        {
            "@g3": ["u2", "u3"]
        }
    ],

redboltz commented 3 years ago

I think the password should not be stored plaintext, it should be salt+hash. So one should also be able to configure the salt.

Good point. In my proprietary broker, I use digest, salt, algorithm (for digest), parameter (for algorithm, optional).

algorithm is the name of digesting algorithm. e.g.) sha-256
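A minimal sketch of how such a record could be verified. Everything here is an assumption for illustration: the struct `password_record`, the salt-then-password concatenation order, and the caller-supplied digest function (a real deployment would plug in a SHA-256 wrapper from a crypto library). The optional parameter field is omitted.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical record mirroring the fields named above.
struct password_record {
    std::string digest;     // stored digest of salt + password
    std::string salt;
    std::string algorithm;  // name of the digest algorithm, e.g. "sha-256"
};

// Digest function supplied by the caller; this sketch does not implement one.
using digest_fn = std::function<std::string(std::string const&)>;

// Compares two strings without stopping at the first differing byte, so the
// run time does not reveal how many leading bytes matched.
inline bool equal_constant_time(std::string const& a, std::string const& b) {
    if (a.size() != b.size()) return false;
    unsigned char diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        diff = static_cast<unsigned char>(
            diff | (static_cast<unsigned char>(a[i]) ^ static_cast<unsigned char>(b[i])));
    }
    return diff == 0;
}

// Recomputes the digest for the candidate password and compares it with
// the stored one.
inline bool verify_password(
    password_record const& rec,
    std::string const& candidate,
    digest_fn const& digest
) {
    return equal_constant_time(rec.digest, digest(rec.salt + candidate));
}
```

Passing the digest function in keeps the algorithm configurable, which matches the wish for a customization point in the JSON format.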

redboltz commented 3 years ago

I just glanced at the code. It seems that the structures (Authentication etc.) directly reflect the json format; they exist to store the json.

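What is BrokerSecurity's responsibility? I guess that it would be a member of broker. Let's say broker has a member variable BrokerSecurity bs;. Maybe something like the following?

  • bs.check_user(username, params) for connect.
  • bs.can_publish(username, topic_filter) for publish.
  • bs.can_subscribe(username, topic_name) for subscribe and delivery.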

I think that those member functions would be implemented using authentication, authorization, and groups.

BTW, I think that the code is a PoC, so it does not satisfy https://github.com/redboltz/mqtt_cpp/wiki/Coding-Rules . That's no problem so far. In the actual code (PR), please follow the Coding-Rules.

kleunen commented 3 years ago

Yes. It is just to try out and think about some things. Like do you want to make hash configurable or just sha256 for password. And you want to allow anonymous login? And how? anonymous is a special user?

redboltz commented 3 years ago

Yes. It is just to try out and think about some things. Like do you want to make hash configurable or just sha256 for password. And you want to allow anonymous login? And how? anonymous is a special user?

Configurable hash is good. So far, sha-256 is good enough. But I want to have a customization point in the json format.

For anonymous login, how about this? (Just idea)

anonymous like authentication

    "authentication": [
        {
            // no username and no password accepted
        },
        {
            // empty string username with no password accepted
            "name": ""
        },
        {
            // no password accepted
            "name": "u3" 
        },
        {
            // no username with password accepted
            "method": "password",
            "password": "mypassword"
        },
        {
            // empty string username with password accepted
            "name": "",
            "method": "password",
            "password": "mypassword"
        }
    ]

kleunen commented 3 years ago

Making hash configurable is good I would say, maybe default to sha256 indeed.

I was thinking something like:

  "authentication": [
        {
            "name": "anonymous",
            "method": "anonymous"
        }
    ]

or possibly:

  "config": [
        {
            "hash": "sha-256",
            "salt": "1234", 
            "anonymous":"anonymous"
        }
    ]

You define for a user that the method is anonymous, basically specifying what the anonymous username should be. (Only one anonymous user is allowed.)
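One way to read the second variant is that the config names the user that credential-less connections are mapped to. A minimal sketch under that assumption (requires C++17 for std::optional); `auth_config` and `effective_username` are hypothetical names:

```cpp
#include <optional>
#include <string>

// Hypothetical broker setting: the username assigned to clients that connect
// without a username. An empty optional means anonymous login is disabled.
struct auth_config {
    std::optional<std::string> anonymous_user;
};

// Returns the username to use for authorization, or std::nullopt if the
// connection has no username and anonymous login is disabled.
inline std::optional<std::string> effective_username(
    auth_config const& cfg,
    std::optional<std::string> const& supplied_username
) {
    if (supplied_username) {
        return supplied_username;  // normal login; the password check happens elsewhere
    }
    return cfg.anonymous_user;     // anonymous login, if enabled
}
```

Mapping anonymous clients to an ordinary username means the authorization rules need no special case for them.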

kleunen commented 3 years ago

What is BrokerSecurity's responsibility? I guess that it would be a member of broker. Let's say broker has a member variable BrokerSecurity bs;. Maybe something like the following?

  • bs.check_user(username, params) for connect.
  • bs.can_publish(username, topic_filter) for publish.
  • bs.can_subscribe(username, topic_name) for subscribe and delivery.

I think that those member functions would be implemented using authentication, authorization, and groups.

Yes, something like this. But the authorization filters should be stored in a subscription_map, for fast lookup.
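A minimal sketch of what such a security object could look like, including group expansion. It is not the implementation discussed here: the names `rule` and `broker_security` are hypothetical, passwords are plain strings for brevity, rules are scanned linearly instead of being stored in a subscription_map, and topics are compared for exact equality instead of wildcard matching.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical authorization rule: who may publish or subscribe on a topic.
struct rule {
    std::string topic;
    std::set<std::string> allow_pub;  // usernames or "@group" names
    std::set<std::string> allow_sub;  // usernames or "@group" names
};

class broker_security {
public:
    std::map<std::string, std::string> passwords;         // username -> password
    std::map<std::string, std::set<std::string>> groups;  // "@g1" -> members
    std::vector<rule> rules;

    // For CONNECT.
    bool check_user(std::string const& username, std::string const& password) const {
        auto it = passwords.find(username);
        return it != passwords.end() && it->second == password;
    }
    // For PUBLISH.
    bool can_publish(std::string const& username, std::string const& topic) const {
        return allowed(username, topic, &rule::allow_pub);
    }
    // For SUBSCRIBE and delivery.
    bool can_subscribe(std::string const& username, std::string const& topic) const {
        return allowed(username, topic, &rule::allow_sub);
    }

private:
    // True if `name` is the user itself or a group that contains the user.
    bool is_member(std::string const& name, std::string const& username) const {
        if (name == username) return true;
        auto it = groups.find(name);
        return it != groups.end() && it->second.count(username) != 0;
    }

    bool allowed(
        std::string const& username,
        std::string const& topic,
        std::set<std::string> rule::* list
    ) const {
        for (auto const& r : rules) {
            if (r.topic != topic) continue;  // exact comparison only in this sketch
            for (auto const& name : r.*list) {
                if (is_member(name, username)) return true;
            }
        }
        return false;
    }
};
```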

redboltz commented 3 years ago

Sorry for my late reply. I need to implement automatic client_id generation and the request/response topic. It could be related to this issue.

When the client sends CONNECT with an empty client_id, the broker generates a unique client_id (using uuid, so far). When the client sends CONNECT with the RequestResponseInformation(1) property, the broker generates a unique TopicName (using uuid, so far) and sends CONNACK with the ResponseTopic property.

Maybe the auto generated client_id is not related to authentication and authorization, because authentication is based on UserName, not client_id. The auto generated TopicName for ResponseTopic is related to authorization. I think that the ResponseTopics should be tagged with something like ResponseTopicTag. Only the client that sends RequestResponseInformation should be able to subscribe to the ResponseTopic. All clients should be able to publish to the ResponseTopic.

client(request sender)                             broker                   client(service provider)
       |                                            |                                 |
       |CONNECT request response information(1)     |                                 |
       |------------------------------------------->|                                 |
       |                                            |                                 |
       |CONNACK response topic(XXXX-XXXX-XXXX..)    |                                 |
       |<-------------------------------------------|                                 |
       |                                            | SUBSCRIBE service_topic         |
       |                                            |<--------------------------------|
       |                                            |                                 |
       |PUBLISH service_topic payload(request body) |                                 |
       |response topic(XXXX-XXXX-XXXX..)            |PUBLISH service_topic payload    |
       |------------------------------------------->|response topic(XXXX-XXXX-XXXX..) |
       |                                            |-------------------------------->|
       |                                            |                                 |
       |                                            |                                 |
       |                                            |PUBLISH XXXX-XXXX-XXXX...        |
       |PUBLISH XXXX-XXXX-XXXX...                   |payload(result)                  |
       |payload(result)                             |<--------------------------------|
       |<-------------------------------------------|                                 |
       |                                            |                                 |

kleunen commented 3 years ago

The client id should be unique on the broker.

So, we decided that client id within the broker should be: username + user supplied client_id

If the broker is generating the client id, it can generate a unique client id. The broker can generate a unique topic for request/reply based on unique client_id

redboltz commented 3 years ago

The client id should be unique on the broker.

So, we decided that client id within the broker should be: username + user supplied client_id

Yes. It violates MQTT spec but we choose this intentionally as we discussed.

If the broker is generating the client id, it can generate a unique client id. The broker can generate a unique topic for request/reply based on unique client_id

Do you mean for auto client_id generation, uuid is reasonable choice but for request/response topic, there is a better choice ?

For request/response, username + client_id can be used. (It is not essential, but we could add a response_ prefix.) The client_id can be supplied by the user; if it is empty, the broker generates it using uuid. Either way, username + client_id (user supplied or broker generated) can be used as the response topic.

Am I understanding correctly?

kleunen commented 3 years ago

No, there is no better choice. Only thing is to remind you, there is a small relation between the login (username) and the client id. Because the client id has the username inside to make it unique.

So authorization / authentication and req / resp are unrelated, only except for the possible username that makes the user_id unique.

redboltz commented 3 years ago

I don't understand what you mean, not yet. I mean

  1. auto generated client_id is not related to authentication (and authorization).
  2. auto generated response topic (for MQTT's Request/Response mechanism) is related to authorization.

Maybe we agree about 1.

Let me explain why 2 is related to authorization. The auto generated topic name (UUID) for Request/Response is unpredictable. We need to integrate it into our authorization mechanism.

The only subscriber of an auto generated topic is the request sender. The publisher, on the other hand, can be any service provider, so allowing all clients to publish is the practical choice. However, there is no way to write an auto generated topic name in the json file. That is why I said we need some tagging mechanism.
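A minimal sketch of such a tagging mechanism, kept outside the json-based rules because the topic names are generated at run time. The class `response_topic_registry` is hypothetical. It tags each generated topic with the requester's Username, on the assumption that authorization stays Username-based as described in the guideline earlier in this thread.

```cpp
#include <map>
#include <string>

// Hypothetical registry of broker-generated response topics. Each topic is
// tagged with the Username of the client that requested it.
class response_topic_registry {
public:
    // Called when the broker generates a response topic for a CONNECT.
    void add(std::string const& topic, std::string const& owner_username) {
        owners_[topic] = owner_username;
    }

    bool is_response_topic(std::string const& topic) const {
        return owners_.count(topic) != 0;
    }

    // Only the owning Username may subscribe to a response topic.
    bool can_subscribe(std::string const& username, std::string const& topic) const {
        auto it = owners_.find(topic);
        return it != owners_.end() && it->second == username;
    }

    // Any client may publish to a response topic, because any service
    // provider may have to answer on it.
    bool can_publish(std::string const& /*username*/, std::string const& topic) const {
        return is_response_topic(topic);
    }

private:
    std::map<std::string, std::string> owners_;  // topic -> owning Username
};
```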

kleunen commented 3 years ago

However, there is no way to write auto generated topic name in the json file. So I said we need some tagging mechanism.

Why ? The broker can just generate a unique topic name, right? It does not have to be related to the logged in user ?

It is part of the login procedure, but the broker might generate an unique reply topic for each connect attempt as follows: reply-topic-1 (connect attempt 1) reply-topic-2 (connect attempt 2) reply-topic-3 (connect attempt 3) ...

It is related to the connection, or do I not understand correctly ?

redboltz commented 3 years ago

However, there is no way to write auto generated topic name in the json file. So I said we need some tagging mechanism.

Why ? The broker can just generate a unique topic name, right?

Yes.

It does not have to be related to the logged in user ?

What does logged in user mean? I assume that it is a client. It needs to be related to the client that sent Request Response Information. The client should only be able to subscribe the topic. Because it is a special topic to receive response.

It is part of the login procedure, but the broker might generate an unique reply topic for each connect attempt as follows: reply-topic-1 (connect attempt 1) reply-topic-2 (connect attempt 2) reply-topic-3 (connect attempt 3) ...

It is related to the connection, or do I not understand correctly ?

I don't understand that part yet. Could you elaborate this?

kleunen commented 3 years ago

I think I understand now, you only want to give the specific connection access to the special topic. Other users are not allowed to subscribe?

kleunen commented 3 years ago

First connection:

client(request sender)                             broker                   client(service provider)
       |                                            |                                 |
       |CONNECT request response information(1)     |                                 |
       |------------------------------------------->|                                 |
       |                                            |                                 |
       |CONNACK response topic(reply-topic-1)       |                                 |
       |<-------------------------------------------|                                 |

Second connection:

client(request sender)                             broker                   client(service provider)
       |                                            |                                 |
       |CONNECT request response information(1)     |                                 |
       |------------------------------------------->|                                 |
       |                                            |                                 |
       |CONNACK response topic(reply-topic-2)       |                                 |
       |<-------------------------------------------|                                 |

Third connection:

client(request sender)                             broker                   client(service provider)
       |                                            |                                 |
       |CONNECT request response information(1)     |                                 |
       |------------------------------------------->|                                 |
       |                                            |                                 |
       |CONNACK response topic(reply-topic-3)       |                                 |
       |<-------------------------------------------|                                 |

redboltz commented 3 years ago

Let me explain a concrete scenario. There are four clients (connections).

Request sender

Precondition

client3 and client4 are connected to the broker.
client3 subscribes to the service3 topic.
client4 subscribes to the service4 topic.

Scenario

Connect phase

The client1 CONNECT with RequestResponseInformation(1) to the broker.
The broker CONNACK with ResponseTopic (uuid1) to the client1.
The client1 SUBSCRIBE uuid1.

Request response phase

client1 PUBLISHes to service3 with ResponseTopic (uuid1). client3 receives it, does something, and then PUBLISHes the response back using topic uuid1.

client1 PUBLISHes to service4 with ResponseTopic (uuid1). client4 receives it, does something, and then PUBLISHes the response back using topic uuid1.

Analysis

client1 needs to SUBSCRIBE to topic uuid1.
client3 and client4 need to PUBLISH to uuid1.

For security, clients other than client1 shouldn't subscribe to the topic uuid1. Our authorization is based on Username, and both client1 and client2 have the same UserName u1. I think that is a reasonable point of compromise: we can assume that client1 and client2 know each other, so we don't need to reject client2's SUBSCRIBE to uuid1.

client1 can request any other client with ResponseTopic uuid1. So all clients (all UserNames) can publish to uuid1.

NOTE

The property ResponseTopic is used for two purposes. One is sent by the broker (CONNACK) to inform the client of the generated topic. The other is sent by the client (PUBLISH) to tell the service provider which topic to respond on ("Please use this topic for the response").

redboltz commented 3 years ago

First connection:

client(request sender)                             broker                   client(service provider)
       |                                            |                                 |
       |CONNECT request response information(1)     |                                 |
       |------------------------------------------->|                                 |
       |                                            |                                 |
       |CONNACK response topic(reply-topic-1)       |                                 |
       |<-------------------------------------------|                                 |

Second connection:

client(request sender)                             broker                   client(service provider)
       |                                            |                                 |
       |CONNECT request response information(1)     |                                 |
       |------------------------------------------->|                                 |
       |                                            |                                 |
       |CONNACK response topic(reply-topic-2)       |                                 |
       |<-------------------------------------------|                                 |

Third connection:

client(request sender)                             broker                   client(service provider)
       |                                            |                                 |
       |CONNECT request response information(1)     |                                 |
       |------------------------------------------->|                                 |
       |                                            |                                 |
       |CONNACK response topic(reply-topic-3)       |                                 |
       |<-------------------------------------------|                                 |

Thanks. I use a uuid so far. And as you described, it generates a different value for every connection.

kleunen commented 3 years ago

For security I would say the following:

Connect phase

The client1 CONNECT with RequestResponseInformation(1) to the broker.
The broker CONNACK with ResponseTopic (uuid1) to the client1.
The broker gives client1 SUBSCRIBE access to uuid1.
The client1 SUBSCRIBE uuid1.

The client2 CONNECT with RequestResponseInformation(1) to the broker.
The broker CONNACK with ResponseTopic (uuid2) to the client2.
The broker gives client2 SUBSCRIBE access to uuid2.
The client2 SUBSCRIBE uuid2.

Access rights

If the other access is configured correctly:

client1 can only subscribe to uuid1
client1 not allowed to subscribe to uuid2

client2 can only subscribe to uuid2
client2 not allowed to subscribe to uuid1

NOTE: This is valid until connection terminates. Subscribe access is removed when connection (client1 or client2) terminates.
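
A minimal sketch of the per-connection access described above, assuming a hypothetical `connection_id` and a hypothetical `response_topic_access` class (neither exists in mqtt_cpp). Access is granted when the broker hands out the response topic and dropped when the connection terminates:

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical connection identifier; a real broker would more likely key
// on its connection (session) object.
using connection_id = int;

// Tracks which response topics each live connection may subscribe to.
class response_topic_access {
public:
    // Called once the generated topic has been sent to the client in CONNACK.
    void grant(connection_id con, std::string const& topic) {
        allowed_[con].insert(topic);
    }

    // Called when the connection terminates: every grant for it is dropped.
    void revoke_all(connection_id con) {
        allowed_.erase(con);
    }

    // True only if this connection was granted exactly this topic.
    bool may_subscribe(connection_id con, std::string const& topic) const {
        auto it = allowed_.find(con);
        return it != allowed_.end() && it->second.count(topic) != 0;
    }

private:
    std::unordered_map<connection_id, std::unordered_set<std::string>> allowed_;
};
```

With this shape, client1 asking for uuid2 is rejected regardless of which UserName it logged in with, because the check only looks at the connection.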

redboltz commented 3 years ago

I understand. If it can be achieved with practical effort and complexity, it is good. My worry is that our authentication mechanism is based on UserName, not UserName+ClientId. I think that is a good and practical decision. In my scenario, client1 and client2 have the same UserName u1. So I guess it is difficult to apply different rules to client1 and client2.

I think that the combination of an (unpredictable) UUID and UserName-based authentication is a practical compromise.

One user process has two connections, client1 and client2. The process gets uuid1 from client1's CONNACK packet. The process tells uuid1 to client2. Then client2 subscribes to uuid1. It is not rejected. Client1 and client2 share UserName and Password, so it is good enough.

What do you think?

kleunen commented 3 years ago

One user process has two connections, client1 and client2. The process gets uuid1 from client1's CONNACK packet. The process tells uuid1 to client2. Then client2 subscribes to uuid1. It is not rejected. Client1 and client2 share UserName and Password, so it is good enough.

I would not recommend this. I think it is easier to give a connection or client access to a specific topic for as long as the connection lasts. The authentication system grants access to specific topics based on which UserName is logged in. On top of this, the response topic is added. Once the connection terminates, the access table for this connection is cleared.

This way, you could also generate the response topic based on the connection number. Different connections cannot subscribe to each other's response topics, even if they are logged in with the same UserName.

I would say there is no point in one connection subscribing to the response topic of another connection. Because if you would like to do this, you should just use the normal pub/sub pattern, not the req/resp communication pattern.

redboltz commented 3 years ago

I guess that I understand.

So far we've discussed the semantics of authentication and authorization, and their json syntax. My scope was how to implement request/response topic authorization using those semantics.

I said that we need to treat the request/response topic as a special case. I used the word "tagging".

I think that you already have an implementation outline. And you think that connection (client) based authentication is not so difficult to implement.

I guess that your implementation could have connection-based authentication. If multiple connections have the same UserName, they get the same authentication result for normal topics. For the request/response topic, connection-based authentication is applied: even if the connections have the same UserName, the authentication result can be different. So the logic is different from normal topics. That is what I meant by a special case.

Am I understanding correctly?

| connection | UserName | topic | result |
| --- | --- | --- | --- |
| c1 | u1 | normal topics | judged by u1 and topic |
| c2 | u1 | normal topics | always same as above |
| c3 | u2 | request/response topic | judged by u2+c3 (or the connection (session) object on the broker, it's the same) and topic |
| c4 | u2 | request/response topic | different from above |

kleunen commented 3 years ago

So the logic is different from normal topics. That is what I meant by a special case.

Yes, that is correct.

I think that you already have an implementation outline. And you think that connection (client) based authentication is not so difficult to implement.

Yes, I already have a view on how the authentication will be implemented.

The broker has a global subscription_map, which maps a subscription -> [ client ] (list of clients) https://github.com/redboltz/mqtt_cpp/blob/master/include/mqtt/broker/broker.hpp#L2157

The broker needs an additional authentication map, which maps subscription -> [ client ] (list of clients) (auth_con_map)

Once a message is published to a topic, only clients that are in both the subscription map and the authentication map will receive it.
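
To make the delivery rule concrete, here is a rough sketch of the intersection. It is not the broker's actual code: `client_map` and `receivers` are hypothetical names, and the maps are keyed by exact topic name, whereas the real subscription_map does wildcard matching on topic filters.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using client_id = std::string;
// Simplification: keyed by exact topic name, no '+' or '#' matching.
using client_map = std::map<std::string, std::set<client_id>>;

// Returns the clients that are both subscribed to `topic` and authorized
// for it. A client found in only one of the two maps receives nothing.
std::vector<client_id> receivers(
    client_map const& subscription_map,
    client_map const& auth_con_map,
    std::string const& topic
) {
    std::vector<client_id> result;
    auto sub = subscription_map.find(topic);
    auto auth = auth_con_map.find(topic);
    if (sub == subscription_map.end() || auth == auth_con_map.end()) {
        return result;
    }
    for (auto const& client : sub->second) {
        if (auth->second.count(client) != 0) {
            result.push_back(client);
        }
    }
    return result;
}
```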

redboltz commented 3 years ago

I was just worried about implementation complexity. But I think that your design doesn't need a big modification to support connection-based authorization for the request/response topic. It's nice!

Let's move on this way.

kleunen commented 3 years ago

But what do you think of an additional method ('anonymous'), next to 'password' and 'client_cert', to specify an anonymous user?


  "authentication": [
        {
            "name": "anonymous",
            "method": "anonymous"
        }
    ]
redboltz commented 3 years ago

We need to define what an anonymous user/connection means.

  1. CONNECT packet UserName: "anonymous"
  2. CONNECT without UserName field
  3. CONNECT with UserName: "" (empty string)
kleunen commented 3 years ago

I would say only 2

redboltz commented 3 years ago

Thanks. Do you treat case 3 ("") the same as a normal user like "u1"?

redboltz commented 3 years ago

Making hash configurable is good I would say, maybe default to sha256 indeed.

I was thinking something like:

  "authentication": [
        {
            "name": "anonymous",
            "method": "anonymous"
        }
    ]

I understand "name" : "anonymous" means no UserName field. What does "method" : "anonymous" mean ? No authentication?

Are the following combinations allowed?

  "authentication": [
        {
            "name": "anonymous",
            "method": "password",
            "password":"mypassword"
        }
    ]

  "authentication": [
        {
            "name": "user1",
            "method": "anonymous"
        }
    ]

or possibly:

  "config": [
        {
            "hash": "sha-256",
            "salt": "1234", 
            "anonymous":"anonymous"
        }
    ]

You define for a user that the method is anonymous, basically specifying what the anonymous username should be. (Only one anonymous user is allowed.)

What does "anonymous":"anonymous" mean? The same as follows?

  "authentication": [
        {
            "name": "anonymous",
            "method": "anonymous"
        }
    ]
kleunen commented 3 years ago

Thanks. Do you treat case 3 ("") the same as a normal user like "u1"?

Yes

kleunen commented 3 years ago

What does "method" : "anonymous" mean ? No authentication?

Yes, exactly. No username/password specified on connect.

I understand "name" : "anonymous" means no UserName field.

Well, I would say: every user in the json file needs a username. Otherwise you cannot add them to a group, for example. This is the name of this user.

Are the following combinations allowed?

  "authentication": [
        {
            "name": "anonymous",
            "method": "password",
            "password":"mypassword"
        }
    ]

  "authentication": [
        {
            "name": "user1",
            "method": "anonymous"
        }
    ]

Yes, that is also possible. Method "anonymous" just means: the specified user can only log in by not specifying a username/password on connect.
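
As a small sketch of that rule (the function name `resolve_account` and its parameters are made up for illustration, and password checking is left out): only a CONNECT without a UserName field maps to the configured anonymous account, while "" and even "anonymous" sent as a UserName are looked up like any other name.

```cpp
#include <optional>
#include <string>

// Maps the CONNECT UserName field to the account name used for lookup.
// `anonymous_account` is the name of the single user configured with
// "method": "anonymous", or std::nullopt if no such user is configured.
std::optional<std::string> resolve_account(
    std::optional<std::string> const& username,
    std::optional<std::string> const& anonymous_account
) {
    if (!username) {
        // Case 2: no UserName field at all. Use the anonymous account,
        // or std::nullopt (reject) when none is configured.
        return anonymous_account;
    }
    // Cases 1 and 3: "anonymous" and "" are ordinary user names.
    return *username;
}
```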

redboltz commented 3 years ago

Thanks. I think it is good.

But what do you think of an additional method ('anonymous'), next to 'password' and 'client_cert', to specify an anonymous user?

You mentioned client_cert. I think that client_cert is different.

I think that MQTT's UserName and Password are meaningless for client certificate authentication. UserName and Password should be ignored or, if they are set, authentication should fail, because if someone gets a valid client cert, they could spoof a different UserName. ClientId, on the other hand, can be used the same way as with password authentication: it just distinguishes sessions and is not related to authentication.

UserName should be extracted from the signed body of the client certificate. At first, I said CNAME is good. But someone said the field should be customizable.

So the entry should be as follows:

 "authentication": [
        {
            "name": "user1", //  you can write "" or  "anonymous" here
            "method": "client_cert",
            "field":"CNAME" // extraction source field that contains "user1"
        }
    ]

What do you think?

kleunen commented 3 years ago

So the entry should be as follows:

 "authentication": [
        {
            "name": "user1", //  you can write "" or  "anonymous" here
            "method": "client_cert",
            "field":"CNAME" // extraction source field that contains "user1"
        }
    ]

Yes, I agree. And indeed, I think the login should fail if a username / password is specified together with a certificate.
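
A rough sketch of that check, assuming the TLS layer has already verified the certificate and handed over its subject fields as a plain map. The names `cert_fields` and `authenticate_client_cert` are hypothetical, and no real certificate parsing is shown:

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical: subject fields taken from an already verified client cert.
using cert_fields = std::map<std::string, std::string>;

// Returns the user name extracted from the configured certificate field,
// or std::nullopt when authentication fails.
std::optional<std::string> authenticate_client_cert(
    cert_fields const& verified_subject,
    std::string const& field,  // the "field" entry from the json config
    std::optional<std::string> const& username,
    std::optional<std::string> const& password
) {
    // A CONNECT that also carries UserName or Password is rejected, so a
    // stolen certificate cannot be combined with a different UserName.
    if (username || password) {
        return std::nullopt;
    }
    auto it = verified_subject.find(field);
    if (it == verified_subject.end()) {
        return std::nullopt;
    }
    return it->second;
}
```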

redboltz commented 3 years ago

I think that now we are ready to implement. Please let me know if I have overlooked something.

kleunen commented 3 years ago

Yes, I think so too.

Here you can try some of the configuration options: https://wandbox.org/permlink/vGHDg4IBv4XLJz1f

redboltz commented 3 years ago

Additional information about Request/Response. https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901252

reply-topic-1 (connect attempt 1)
reply-topic-2 (connect attempt 2)
reply-topic-3 (connect attempt 3)

The behavior is up to the broker's implementation, but I think the following implementation is the most user friendly.

When the broker receives CONNECT with RequestResponseInformation(1), then applies the following logic.

if (clean_start) {
    allocate a new uuid for response_topic
    store the topic in the session_state
}
else {
    if a session_state corresponding to the client_id (it will become UserName+ClientId) exists
    and that session_state has a response_topic, then return it
    otherwise allocate a new uuid for response_topic and store it in the session_state
}

So the response_topic can exist as long as the session_state exists. As with a normal topic, messages on the response topic can be stored after disconnection and delivered on reconnection.
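
The logic above could look roughly like this in C++. This is only a sketch: `session_state` here is a stand-in struct, not the broker's class, `session_key` is whatever ends up identifying the session (UserName+ClientId), and `generate` stands in for the uuid generator. On clean_start the real broker would discard the whole session state, not only the topic.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>

// Stand-in for the broker's session state; only the relevant member.
struct session_state {
    std::optional<std::string> response_topic;
};

// Returns the response topic to put into CONNACK when the client sent
// RequestResponseInformation(1).
std::string response_topic_for(
    std::map<std::string, session_state>& sessions,
    std::string const& session_key,
    bool clean_start,
    std::function<std::string()> const& generate
) {
    // operator[] creates an empty session_state if none exists yet.
    session_state& state = sessions[session_key];
    if (clean_start || !state.response_topic) {
        // New session, or an inherited one without a topic: allocate.
        state.response_topic = generate();
    }
    // Otherwise the stored topic of the inherited session is reused.
    return *state.response_topic;
}
```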

It is implemented by #894

kleunen commented 3 years ago

Yes, seems to make sense.

This connect_handler and some other methods within broker are getting very big. So it might be wise to split up some of these methods.