eclipse-cyclonedds / cyclonedds

Eclipse Cyclone DDS project
https://projects.eclipse.org/projects/iot.cyclonedds
Other
846 stars 350 forks source link

security profile #102

Closed PradeepKiruvale closed 4 years ago

PradeepKiruvale commented 5 years ago

Hi All,

I am new to cyclonedds and looking for security profile related things. I would like to know the kind of security features that are supported by cyclonedds. Please let me know also if there are any example, how to use them.

Thanks & Regards, Pradeep

eboasson commented 5 years ago

Hi Pradeep,

Cyclone doesn’t yet implement DDS Security, so the only options you have today are TCP+TLS for authenticated and encrypted (point-to-point) connections, and running over a VPN if you trust all nodes that are part of the VPN. A VPN-based solution requires no changes to the Cyclone configuration, the use of TCP+TLS does. If you’re interested I could whip up a minimal example — I don’t think it is documented anywhere at the moment (other than in the code).

For the longer term, DDS Security is obviously the goal. As implementing DDS Security is a pretty significant amount of work and I moreover have reason to believe that ADLINK intends to donate the OpenSplice implementation to Cyclone in the future (there is a fair bit of similarity between the DDSI stack in OpenSplice and the one in Cyclone), I prefer to focus on improving other aspects of Cyclone first.

PradeepKiruvale commented 5 years ago

Thanks for the clarification. There is some code in src/core/ddsi/src/q_security.c, any idea about this?

It would be very helpful if you can get me some example, to show how to use TCP+TLS in cyclonedds.

Also its interesting to know what are the other aspects you think the cyclonedds lacks and needs improvement, anything in specific?

eboasson commented 5 years ago

It would be very helpful if you can get me some example, to show how to use TCP+TLS in cyclonedds.

I won’t be able to do that until next week, for the simple reason that I don’t have the things I need for that at hand this weekend. In these regards, Cyclone configuration is still pretty much the same as OpenSplice configuration, so if you are in a hurry, you might be able to find some information there as well.

Thanks for the clarification. There is some code in src/core/ddsi/src/q_security.c, any idea about this?

The q_security.c file is a bit of an old long-since abandoned thing, going back many years and that somehow managed to survive. There are a number of problems with it ... Firstly, it should properly be named “encryption” rather than security, as it does nothing beyond that. Secondly, it is only encryption with a blockcipher using a pre-shared key, and each packet is encrypted individually (a very bad practice). Thirdly, it only encrypts application data, but not discovery information; and to add insult to injury, it will happily accept unencrypted data.

So the only thing it is good for is protecting the data from casual eavesdroppers. (To be fair to it, that was actually the only thing it was meant to do ... but it does suggest something far grander and that’s not fair, in my opinion.)

Any effort put into attempts at improving it would be wasted effort as it would simply migrate in the direction of the DDS Security spec while remaining incompatible. My advice is to ignore it, and my intent is to remove it.

Also its interesting to know what are the other aspects you think the cyclonedds lacks and needs improvement, anything in specific?

A number of things have been discussed in other issues, but certainly not all. For some use cases it is already sufficient, but the goal is to end up with a pretty full-featured DDS implementation. That means a lot of things, in no particular order and no doubt with a omission of a several other things: x-types (work is being done on that), DDS security, other language bindings (“native” C++, Python, Java, C#, Haskell, Rust, Ada ... well, maybe not all at the same time!), ability to shrink it for use in small embedded systems, support for using multiple network interface simultaneously, reasonably complete QoS support, transient- and persistent data support.

Probably there is more, but it should be clear that there is plenty of work to be done — though not all of it is equally invasive. For example, language bindings will basically sit on top of the C API; the core library code deals in abstract samples, so x-types does not affect the core very much; at the moment I think DDS security will be donated in a reasonable time frame; a lot of the QoS settings are pretty much local affairs that can be added reasonably easily.

Naturally, as this is a true open source project, I am really hoping that over time other people will start investing in it and contributing towards this goal. Especially something like a fully featured Python binding would be very welcome and greatly help with the ease of use, testing, experimentation, ... while not requiring much detailed knowledge of the inner workings. Completing the ROS2 RMW layer (or writing a new one) is also one on that list.

Anyway, please don’t be scared that even a partial list is so long already. What is there works, apart from some bugs (like with #87) that are being worked on.

PradeepKiruvale commented 5 years ago

Thanks for all the clarifications!. I am not in a hurry for the example, take your time.

eboasson commented 5 years ago

I tried to enable it and discovered to my dismay that the SSL support in the code was not actually enabled at build time — and of course that meant it had accrued some compilation issues. On going over it I spotted another few things, such as not disabling old SSL/TLS versions and not enabling certificate checks by default ... That led to PR #107 that at least brings it to a working state. I'm holding off on merging that PR for a while, perhaps someone knowledgeable will comment. Nothing worse than incorrectly suggesting something can safely be used in a hostile environment.

As I am no expert in using openssl but did feel my example configuration should go a bit further than simple self-signed certificates, I followed this guide to setting up a CA and generating client and server certificates. From my limited testing it appears that it indeed accepts a connection from a peer with a certificate that was signed by the CA and refuses to establish a connection with peers that do not. That's good, but not proof of it really doing everything correctly. So please think twice before deploying to a public-facing machine ...

In the linked zip file you will find a server and client configuration file (DDS may be peer-to-peer, but TCP and TLS are not). These rely on "key stores" that include a private/public key for the process and a CA (unless you choose to accept self-signed certificates, in which case you are only getting protection against eavesdroppers), and the redtape script automates the process I referenced above for generating those files.

cyclonedds-ssl-config.zip

PradeepKiruvale commented 5 years ago

Thanks a lot for the example. I will have a look. As of now I am looking how to implement Authorization and Encryption part of security in cycloneDDS.

eboasson commented 5 years ago

As of now I am looking how to implement Authorization and Encryption part of security in cycloneDDS.

That is great. Any contribution towards improving the security of Cyclone DDS is appreciated very much.

I've merged the PR, so as of today, Cyclone DDS can be built (it can be completely disabled, but is done by default) with openssl 1.0.x or 1.1.x. It defaults to requiring TLS 1.3 for the connections but can be configured to accept TLS 1.2. The reason for pushing the default minimal TLS version all the way up is that there is no installed base yet, and that it therefore makes sense to default to the (presumably) most attack-resistant protocol version.

PradeepKiruvale commented 5 years ago

Thanks for the clarification. I would also like to know that which level its better to encrypt the messages. Is it good to do at DDS level or RTPS(ddsi) level?

eboasson commented 5 years ago

Hi @PRADEEPKJ, I'm not sure exactly what you mean. So I will guess a bit and give you my view on the overall picture.

Firstly, it depends on what you want to protect against. The short answer is that it is generally better to have enough protection in the lower layers of the stack to efficiently ignore all non-genuine data from (authenticated) peers. So that pleads for at least some protections at the RTPS level. There are also reasonable cases where encryption should already occur at the DDS level, with the RTPS level merely carrying encrypted blobs. In some cases, this even requires encrypting data twice.

You really cannot discuss the trade-offs involved without the exercise of modelling the threats you want to protect against, because that's the key thing in weighing the options and their consequences. The DDS Security spec contains a discussion of threats and the available options and is a good place to start when considering a question such as this one. [*]

Secondly, there are some observations that are always relevant. The first, obvious, one is the "meta data" vs "data" discussion that became very public with Snowden and hasn't gone away since. There are various levels of meta data that you might care about: there is the low-level networking part, which allows you to determine the data flows between machines. You could deliberately build a system that hides this by adding traffic, but I've never yet heard of a anyone wanting to go down that path for a DDS system.

One level up, there is information on which participants, readers and writers exist. You may or may not care about eavesdroppers being able to discern the structure of your system in terms of participants, topics, readers and writers — and if you don't care, you can leave this information in the clear. However, a related matter is whether you are willing to perform discovery with anyone — quite often, one doesn't want that (how bad it is depends in part on your choice of QoS), and so you will probably want the ability of authenticating the participants and validating that the other discovery data you receive hasn't been spoofed.

Similarly, there is the meta data of the actual user traffic — perhaps you don't care that someone can determine which readers and writers exist, but you do care that someone can analyse the patterns in the traffic without having access to the contents of the application data. You therefore have to consider whether the message headers need to be encrypted (that would Data/DataFrag, AckNack/NackFrag, Hearbeat and HeartbeatFrag). Again you probably also want to protect against spoofing and replay attacks.

And so on and so forth ...

For actually protecting the contents of the data, the ultimate is of course that the application does everything itself while running in a secure partition and only passes encrypted blobs through a secure channel to the middleware running in another partition. These blobs can then contain everything you need to verify the sender is authorised, that the data hasn't been tampered with, that it isn't a replay, &c. If this is all you do, you'd be leaking tons of meta data and you open yourself up to denial of service attacks — especially if you use deep histories, resource limits or variable sized data — because the lower-level networking stack would have to run its reliable protocol and enforce resource limits without having any way of determining whether a remote node is trustworthy or not. But the contents of your data would be as safe as your application chooses to make it.

Another downside is that DDS would not see the contents anymore, and so all the content filtering and other things go overboard. You can probably do some really interesting things by using homomorphic encryption. Performance would probably be terrible, but it'd definitely be interesting 😁.

Did that help you a little bit?

[*] In its current version, the DDS Security spec has a really interesting shortcoming that I think directly relates to your question. The specification envisions a situation where the application does not fully trust the middleware, and so allows for the existence of middleware services that can store and forward data without being able to decrypt it (so true end-to-end encryption). This is a particularly nice attribute for transient/persistent data, but (and this is the flaw) the key used for encrypting the data disappears with the disappearance of the writer, and so a reader created afterward may be authorised to read the data stored by the middleware for specific purpose of providing such a reader with the data, but it cannot decrypt it either because the key was thrown away ... Oops.

Just to be clear: all the mechanisms work fine for volatile and transient-local data, and all available mechanisms work fine for transient/persistent data if you are willing to trust the middleware with the keys to the data. And I do believe that the plug-in based system for authentication, access control, encryption, key management and what not, allows building a really complicated plug-in that would store those keys in separate tamper resistent locations ...

And so it is not a fatal flaw, just an unfortunate interaction between features that prevent a particularly attractive combination from working.

PradeepKiruvale commented 5 years ago

Thanks for the detailed explanation. Yes, these information are very helpful for my future work.

PradeepKiruvale commented 5 years ago

Any idea how to enable the SSL module while compiling?. I do no see any switches in cmake file.

eboasson commented 5 years ago

The default behaviour is to require openssl and include support, but you can manually disable it. The CMake bit is buried in the src/core/CMakeLists.txt because the setting only affects the core library. See: https://github.com/eclipse-cyclonedds/cyclonedds/blob/master/src/core/CMakeLists.txt#L36

PradeepKiruvale commented 5 years ago

Thanks, it worked. I ran helloworld example with ssl enabled, it works fine.

PradeepKiruvale commented 5 years ago

One more question, the existing ssl support is only for the tcp ( reliable transport ) right?. What about when we use the Best_effort QoS?

eboasson commented 5 years ago

Thanks, it worked. I ran helloworld example with ssl enabled, it works fine.

☺️

One more question, the existing ssl support is only for the tcp ( reliable transport ) right?. What about when we use the Best_effort QoS?

When you use TCP, it pushes everything over the TCP connection (and if you enable TLS, everything goes over TCP+TLS) — so that includes the best-effort data. Evidently, that is less than ideal, because it means you get, e.g., head-of-line blocking behaviour on best-effort data, but there is not that much choice either.

Besides that, you might find it interesting to know that it runs the DDSI reliable protocol on top of TCP. In part, that's because it is by far the easiest way to deal with it, but in part it is also because the behaviour of and timeouts in DDS for detecting disconnections and handling reconnections is different from those of TCP, and so one has to deal with cases where a TCP connection drops but the connection is re-established before the peer's lease expired.

Now, in Cyclone, there are a number of known issues with the TCP support, and while addressing those won't change the fundamentals, it will provide a much better experience. Right now you can run into connection timeouts in the data path (whereas they should occur outside the data path, while doing discovery), and it doesn't gracefully deal with a blocked TCP connection, as the data will stall, rather than continuing unhindered on all connections other than the one that is stalled. These are issues that will be fixed in the future.

PradeepKiruvale commented 5 years ago

@eboasson I successfully enabled and build the security code. Using some clues from security config example from the Adlink security document I tried running helloworld example. But facing the below issue. My config file looks as below. Getting a error like " Not a multicast address". Please let me know what could be the reason and how can I resolve this.

eboasson commented 5 years ago

Hi @PRADEEPKJ, I expect it is complaining about the addresses you specified in the "address" attributes of the NetworkPartition elements. It requires those to be multicast addresses, as the main purpose of the network partitions is to allow mapping traffic to different multicast addresses depending on topic/partition. The address to use for unicast traffic is simply the taken from the selected interface.

Readers matching a network partition advertise the address from the network partition as the multicast address on which they will be listening for data, forcing the writers to use that multicast address instead of some other address. If different processes use different configurations with different addresses, the data will simply be sent multiple times.

It seems to me that the combination of network partitions and unicast-only networking (General/AllowMulticast = false) might cause some trouble. That case is not really relevant when network partitions are used only for mapping traffic to multicast addresses, but that changes when you add this feature. It probably ignores the multicast addresses somewhere down the line and so it will probably work, but if not, the fix will be trivial.

PradeepKiruvale commented 5 years ago

Hi @PRADEEPKJ, I expect it is complaining about the addresses you specified in the "address" attributes of the NetworkPartition elements. It requires those to be multicast addresses, as the main purpose of the network partitions is to allow mapping traffic to different multicast addresses depending on topic/partition. The address to use for unicast traffic is simply the taken from the selected interface.

Readers matching a network partition advertise the address from the network partition as the multicast address on which they will be listening for data, forcing the writers to use that multicast address instead of some other address. If different processes use different configurations with different addresses, the data will simply be sent multiple times.

It seems to me that the combination of network partitions and unicast-only networking (General/AllowMulticast = false) might cause some trouble. That case is not really relevant when network partitions are used only for mapping traffic to multicast addresses, but that changes when you add this feature. It probably ignores the multicast addresses somewhere down the line and so it will probably work, but if not, the fix will be trivial.

PradeepKiruvale commented 5 years ago

Does it mean that the existing security code (With network partition) works only for the multicast, i.e only UDP, not TCP right?

eboasson commented 5 years ago

I think no one has given any thought to it — I certainly haven't! — but from the way this DDSI stack works, there should be no issue. For TCP there'll probably be an issue defining the network partitions simply because it requires a multicast address.

Make it possible to define a network partition without using a special address (perfectly reasonable in the light of this feature) and I think it will work over TCP and over UDP, because it seems that during creation of a reader, the configured addresses of matching network partitions are collected in an "address set", and that this address set may be empty. It's worth looking into in more detail.

PradeepKiruvale commented 5 years ago

@eboasson Thanks for the clarification. Yes, that is right the address set is empty so I have an issue. One more question. Once the TLS/DTLS is enabled the communication channel is secure and all the messages will be encrypted by default. Why there is a separate encryption mechanism?. I fail to understand this.

eboasson commented 5 years ago

Hi @PRADEEPKJ ,

Once the TLS/DTLS is enabled the communication channel is secure and all the messages will be encrypted by default. Why there is a separate encryption mechanism?. I fail to understand this.

I don't think the additional encryption would be valuable. What I am afraid happened is that I have unintentionally created some confusion by trying to explain that this mechanism is independent of the choice of transport, except for a few details at configuration level, thereby unintentionally leaving a suggesting that it might be of value.

And once more: the encryption code in q_security.c does not make it a secure DDS. It only provides for encryption of individual data samples with a fixed, pre-shared key and does not protect against tampering, replay attacks, spoofing, or any other bit of fun.

I fully intend to remove it sooner than later and might have done so already had you not shown a significant interest in it — perhaps you saw something in it that I overlooked, after all. Also keep in mind that support for the real DDS Security specification really is coming to Cyclone (see #132), and that one eliminates whatever small value there might once have been in this ancient bit of code.

PradeepKiruvale commented 5 years ago

Hi @PRADEEPKJ ,

Once the TLS/DTLS is enabled the communication channel is secure and all the messages will be encrypted by default. Why there is a separate encryption mechanism?. I fail to understand this.

I don't think the additional encryption would be valuable. What I am afraid happened is that I have unintentionally created some confusion by trying to explain that this mechanism is independent of the choice of transport, except for a few details at configuration level, thereby unintentionally leaving a suggesting that it might be of value.

OK

And once more: the encryption code in q_security.c does not make it a secure DDS. It only provides for encryption of individual data samples with a fixed, pre-shared key and does not protect against tampering, replay attacks, spoofing, or any other bit of fun.

OK

I fully intend to remove it sooner than later and might have done so already had you not shown a significant interest in it — perhaps you saw something in it that I overlooked, after all. Also keep in mind that support for the real DDS Security specification really is coming to Cyclone (see #132), and that one eliminates whatever small value there might once have been in this ancient bit of code.

Thanks for sharing the link to the security work. It is really helpful.