Open victomteng1997 opened 3 years ago
Thanks for reporting these two flaws @victomteng1997. I volunteered to help triage them and to try to push mitigations forward. Let me comment on both:
Using the expired access control policy files.
During the implementation and testing of SROS2 features, I noticed that permission files are protected by a signature linked to the certs of the corresponding nodes. When a new permission file is generated, the administrator just replaces the old one with it to apply the new access control policies. In fact, the old permission files are still usable, since they also contain a valid signature. This is a big security concern in multi-robot systems, as a robot can use the "expired" permission files to bypass the current policies.
The only mitigation now is to assign new cert/key pairs to nodes whenever their policies are updated. This is quite tedious and difficult to manage. I saw that issue #21 mentions the revocation of keys, but it seems this feature is not in the current implementation.
This seems about right. So that we can confirm this experimentally, can you @victomteng1997 provide a reproducible PoC of the flaw? Please provide a Dockerfile and the corresponding exploit. See this example (and the corresponding exploits) for inspiration. Can you please provide this in a simple computational graph (a simple setup with talkers/listeners should do) so that we can move forward?
Once you provide it, we'll triage this together and I'll help escalate this ticket. If appropriate, we'll get you a valid CVE ID so that this research is credited appropriately (to you).
Bypass access control policies through continuous connection.
The idea is that the current SROS2 access control implementation can only restrict new connections; it cannot restrict connections that have already been established. If a node has publish access to a topic, once it starts publishing, setting new permission files does not stop it from publishing to that topic. An adversarial node can keep persistent access to a topic (publish or subscribe) once the service has started.
This problem can be resolved through a forced restart of nodes, which is applicable to systems in closed environments. However, I do see potential security issues if SROS2 is deployed in cloud-based multi-robot systems (MRS). Since MRS are definitely part of the ROS 2 ecosystem, some protections should be implemented, or at least declared in the threat model.
The flaw described here is unclear to me. What's the entry point[^0]? Are you suggesting we should consider as an entry point a rogue node (that's assumed compromised)? If so, what's the attack vector[^1]? Simply accessing the computational graph from the rogue node (which has permissions to interact with a given topic/abstraction)?
If so, this is a common pattern in security (though not the lowest hanging fruit for us security-wise). The security model could be updated to mitigate this issue, but what's your specific ask? To implement a mechanism to selectively remove nodes from the access control policies on the go, giving defenders the capability to hot-patch things?
[^0]: Entry points are specific areas in your robot compute architecture from where an actor could initiate attacks.
[^1]: An attack vector is a path that an attacker could follow to perform an attack on the system, typically involving an entry point.
hey @victomteng1997, happy new year! Any chance you have an update for us on this? Would be great to go together through your PoCs and further triage this.
@vmayoral Sure. I was busy with some other paper work recently. I'll wrap up everything and update it here. Sorry for the delay.
No problem @victomteng1997, I'll stay tuned but ping me once you land the contributions in here and we'll have a closer look at it together.
SROS2_PoC_Docker.zip README.md
Sorry for the late response. The README.md file contains the PoC steps and some simple analysis. The zip file contains the Dockerfile and two simple access control permissions for the demonstration of the attack. Please kindly check if you can reproduce the issues.
@victomteng1997, to give you an update, I'm looking at this as we speak, expect feedback from my side soon-ish.
I would like to discuss two potential vulnerabilities due to the lack of a certificate revocation process in SROS2. I've done some basic research and experiments, but please do correct me if I'm wrong.
Using the expired access control policy files.
During the implementation and testing of SROS2 features, I noticed that permission files are protected by a signature linked to the certs of the corresponding nodes. When a new permission file is generated, the administrator just replaces the old one with it to apply the new access control policies. In fact, the old permission files are still usable, since they also contain a valid signature. This is a big security concern in multi-robot systems, as a robot can use the "expired" permission files to bypass the current policies.
The only mitigation now is to assign new cert/key pairs to nodes whenever their policies are updated. This is quite tedious and difficult to manage. I saw that issue #21 mentions the revocation of keys, but it seems this feature is not in the current implementation.
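To make the first flaw concrete, here is a minimal toy sketch in Python. It is not the SROS2 or DDS Security API: the HMAC below is only a stand-in for the real signature on the permissions document, and the key, the policy strings, and the `sign`/`verify` helpers are all hypothetical names made up for illustration. It only shows that a check based on signature validity alone cannot tell a superseded policy from the current one.

```python
import hashlib
import hmac

# Hypothetical stand-in for the signing key; the real system uses
# certificate-based signatures, not a shared HMAC key.
SIGNING_KEY = b"hypothetical-signing-key"


def sign(document: bytes) -> bytes:
    """Toy signature over a permissions document."""
    return hmac.new(SIGNING_KEY, document, hashlib.sha256).digest()


def verify(document: bytes, signature: bytes) -> bool:
    """Accept any document whose signature matches, regardless of its age."""
    return hmac.compare_digest(sign(document), signature)


# The administrator first issues a broad policy, then replaces it with a
# narrower one. Both were signed with the same key.
old_policy = b"subject=talker; allow publish: chatter, cmd_vel"
new_policy = b"subject=talker; allow publish: chatter"
old_signature = sign(old_policy)
new_signature = sign(new_policy)

# A node that kept a copy of the old file can still present it.
old_accepted = verify(old_policy, old_signature)
new_accepted = verify(new_policy, new_signature)
print("old policy accepted:", old_accepted)
print("new policy accepted:", new_accepted)
```

Nothing in the signature itself says "superseded", which is why rotating the cert/key pair is currently the only way to invalidate the old file.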
Bypass access control policies through continuous connection.
Another interesting vulnerability is also about access control. This problem was also discussed in a recent IoT security paper: https://homes.luddy.indiana.edu/luyixing/bib/oakland20-mqtt.pdf (the "Non-updated session subscription state" problem in this paper). The idea is that the current SROS2 access control implementation can only restrict new connections; it cannot restrict connections that have already been established. If a node has publish access to a topic, once it starts publishing, setting new permission files does not stop it from publishing to that topic. An adversarial node can keep persistent access to a topic (publish or subscribe) once the service has started.
This problem can be resolved through a forced restart of nodes, which is applicable to systems in closed environments. However, I do see potential security issues if SROS2 is deployed in cloud-based multi-robot systems (MRS). Since MRS are definitely part of the ROS 2 ecosystem, some protections should be implemented, or at least declared in the threat model.
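The second flaw can be sketched with a similar toy model. Again, this is not how SROS2 or any DDS vendor is implemented: `AccessPolicy`, `Publisher`, and the `recheck` flag are hypothetical names used only to contrast a permission check done once at creation time with one repeated on every publish.

```python
class AccessPolicy:
    """Hypothetical in-memory policy: a set of (node, topic) publish grants."""

    def __init__(self, grants):
        self.grants = set(grants)

    def replace(self, grants):
        # Models the administrator installing a new permission file.
        self.grants = set(grants)

    def allows(self, node, topic):
        return (node, topic) in self.grants


class Publisher:
    """Toy publisher that is always checked when it is created."""

    def __init__(self, policy, node, topic, recheck=False):
        if not policy.allows(node, topic):
            raise PermissionError(f"{node} may not publish to {topic}")
        self.policy = policy
        self.node = node
        self.topic = topic
        self.recheck = recheck

    def publish(self, message):
        # Only the re-checking variant consults the policy again.
        if self.recheck and not self.policy.allows(self.node, self.topic):
            return False
        return True


policy = AccessPolicy({("talker", "chatter")})
established = Publisher(policy, "talker", "chatter")
rechecking = Publisher(policy, "talker", "chatter", recheck=True)

# The administrator revokes the grant while both publishers are running.
policy.replace(set())

still_publishing = established.publish("hello")
blocked = rechecking.publish("hello")
try:
    Publisher(policy, "talker", "chatter")
    new_connection_rejected = False
except PermissionError:
    new_connection_rejected = True

print(still_publishing, blocked, new_connection_rejected)
```

In this model the established publisher keeps working after the policy change, while a new connection is rejected. Forcing a restart corresponds to discarding `established` and creating it again under the new policy.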