[mqtt] Allow complex topic names in the MQTT Endpoint

JulianFeinauer commented 2 years ago

Especially for "legacy" applications migrating from an MQTT Broker to Drogue it could be hard to go from a complex topic structure to the "simple" topic structure there currently is.

If we would just allow these topics to still contain their complex structure ("a/b/c/d") but transform them as IS to the subject we could at least make the Produer Side transparent in this regard.

Of course one needs to take special care on the Consumer side as e.g. in case of the MQTT Integration / Consumer users may expect to be able to use patterns on topics (but as far as I have seen you currently can only subscribe on devices and get the full CloudEvents JSON Encoded as payload so it also would not matter in that regard).

ctron commented 2 years ago

I think that is again a valid use.

Currently we have <channel> (for directly connected devices) and <channel>/<device> (for devices using a gateway), which might make this hard.

However, that doesn't mean we shouldn't change it! We just need to find a clever way to deal with devices and proxied devices.

JulianFeinauer commented 2 years ago

Ah, yes, the proxy makes this a tiny bit more challenging. But in the proxy case we could also reverse it and say <device>/<channel>. This would be a major change of course but would make it easy to allow channels to have also multiple hierarchies. Or am I missing something? (it would also work the other way round but I think this way its a bit cleaner probably).

ctron commented 2 years ago

No I think that would be a way to deal with this. @garyedwards what do you think?

At the moment, I am not so much afraid of a major change like that. But that we corner ourselves by sticking too much to history.

How would that look then, in case a device would publish for itself? Something like /<channel> (leading slash)?

JulianFeinauer commented 2 years ago

I just checked the way its currently handled. Currently, if a "/" is found (split yields 2 entries) we check if the latter is a valid and authenticated device (as far as I understand the code).

So we could also extend the same logic and say if the first part of a message <potential_device>/<rest containing more slashes>, i.e. the potential_device can be identified as a device (by a lookup in the device cache) then we could decide that this is a proxy with device target.

On the other hand.. isnt there any way to detect if the auth information comes from a gateway? Then we could simply always assume that the first part is a device (and use the same logic as currently)?

garyedwards commented 2 years ago

I think the proposed <device>/<channel> does make sense visually for namespacing sub topics under the devices themselves.

I am not a fan of the leading slash as this implies an absolute path. ROS would handle this with ~channel which implies a private topic and may look familiar to people who use the command line / shell.

However both of these I feel are error prone as forgetting the leading character will result in an error. Having said that the example @JulianFeinauer gave is also an error with the current behaviour. I think <channel>/<device> may give a slightly better new user experience.

<channel>/<device> is the same order as how Hono handles gateway device topics, which is not that important, but aligns with the sister project for consistency.

I don't think either one is perfect and could be argued either way as we have done in the past.

Could the current <channel>/<device> support escaping slashes so that paths could be provided and included in the resulting subject attribute? e.g. a\/b\/c\/d or support using html codes a&#47b&#47c&#47d etc.

JulianFeinauer commented 2 years ago

Thanks for your thoughts @garyedwards. I can not comment well on the gateway use case as I never had one basically so my personal opinion is that <device>/<topic> is more intuitive (like e.g. in scp where you say <host>:<path>) but you all may have more experience and opinion on that.

My major point is that I would like to make Drogue behave transparent for regular (non-gateway) MQTT Senders which would need to allow more than on slasch for a topic.

But another alternative could be to use another character instead of the slash. It seems like the MQTT 3.1 spec allows quite a range of charachters (https://iot.stackexchange.com/questions/966/should-i-use-special-characters-in-mqtt-topics) so we could also do <device>:<topic> or even <device>@<topic> (or use + or # as dividers as no sane developer would ever use these as part of topic names).

ctron commented 2 years ago

I think that all arguments are valid. It is a tough decision to get it "right", that is why I really appreciate all your input!

It would be cool if we could add some kind of indicator to the MQTT endpoint, and let the endpoint know what variant the device favors. Like a "dialect". We want to keep Hono. We want to evolve the API. We want to allow different topic structures.

With MQTT 5, we could use some property when connecting. You configure this once during the setup process of the device, and that is it.

However, MQTT 3.1.1 is still the dominant version I guess. So that wouldn't help us much. We need an alternate way to identify which MQTT dialect the device wants to use.

Topic structure

One way to deal with this could be to add a topic prefix:

v1/publish/<channel>/<device>
v2/publish/<device>/<all-the-segments-for-channel>

But taking a look at Hono, we run into issues again. As that prefix is unknown. Same for other systems.

Connection fields

Another way could be to re-use one of the fields we have when connecting. e.g. the username being <app>@<device> for standard "current" behavior. And v1@<app>@<device> for dialect "v1".

Of course that won't work when using X.509 client certificates. Maybe there is an alternative here too.

Publish settings

A more complex alternative could be to let the device select the dialect, by publishing a specific "setup" message. However again, the conflicts with variants like Hono which don't really know about that.

Configuration

We could also put this into the configuration of either the application, or the device. Something like this:

spec:
  mqttEndpoint:
    dialect: drogueV1 # or hono

Conclusion

To me it feels like using the username approach is the best so far. As that is something the device can/must choose and is configurable in most cases anyway. And it aligns with the MQTT 5 idea of properties during the connection phase.

garyedwards commented 2 years ago

I suspect in most cases the dialect would be consistent for an application if not the whole Drogue instance. As such making it selectable in configuration might be the cleanest and most consistent from an end user's perspective.

I think the username is a nice idea but could add confusion when implementing at a device level as this is often done by 3rd parties in my experience.

ctron commented 2 years ago

Sounds like a good approach too.

I think, in order to make some progress, here is way that should work:

Create a new enum (e.g.) Dialect, which allows us to distinguish this
Implement a new Session (alongside the current one) with the new behavior
Switch in the Application to either the old or new one, decided using the new enum above
hardcode the enum value for now

Next, we can:

Configure the enum using env-vars (so that we can start up different pods with different configurations)
Optionally make to configurable from the application and/or device spec section

The other options should still be possible. Although they have some drawbacks.

JulianFeinauer commented 2 years ago

@ctron after having a more detailed look at the code I am not sure if its really necessary to implement a new Session. I think only the Session::eval_device function is affected. So we could just have two different static functions on a SessionHandler which are selected based on the Dialect.

In another language I would even say, we could point the Dialect to the appropriate implementing function but I am not sure how this would be done in Rust. One possibility would be to pass a closure or function to the struct as Handler. Then we could just call the selected dialect.

Does that make sense?

drogue-iot / drogue-cloud