gwu-cs-iot / collaboration

Spring '20 IoT - systems and security class. This is the collaborative half of the class.

https://www2.seas.gwu.edu/~gparmer/classes/2020-01-01-Internet-of-Things-Systems-Security.html

MIT License


Paper Discussion: 5b. LogSafe: Secure and Scalable Data Logger for IoT Devices #43

Open rachellkm opened 4 years ago

rachellkm commented 4 years ago

Please add your feedback and reviews below.

huachuan commented 4 years ago

Reviewer: Huachuan Wang Review Type: Critique

Overview

The large amounts of personal information generated by IoT devices must be protected from misuse. This paper proposes LogSafe, which stores this information securely so that forensic analysis can be performed on it. LogSafe is a scalable and fault-tolerant logger that can run on an untrusted cloud infrastructure and satisfies the Confidentiality, Integrity, and Availability (CIA) security properties.

Contribution

This paper presents LogSafe, which uses cloud infrastructure while retaining control of the system’s security. LogSafe can perform online computation on logged data while preserving privacy. It can accommodate a large number of IoT devices and uses a scalable snapshot algorithm to defend against attacks without compromising the logger’s performance. The paper also includes a performance evaluation of the logger implementation to test its scalability properties.

Questions

The author states that the setup time is significantly higher if the Logger needs to be provisioned, and that Logger provisioning is a one-time cost. What about Device Negotiate without provisioning? Is it also a one-time cost? What is the benefit of provisioning?
In the TLS session resumption part, the author mentions that an existing device can reuse the previous session to reduce overhead. However, to ensure forward secrecy, the session cannot be used forever. TLS configuration happens once every hour. Does that mean that forward secrecy cannot be ensured within this hour?

Critique

It might be better to let the IoT device reuse a session up to N times instead of setting a time limit.
Table II provides the average time it takes to perform the logging sub-procedures. It would be better for the author to plot a curve to show the linearity.
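To make the first critique concrete, here is a minimal sketch of a resumption policy that combines the one-hour limit mentioned above with an optional cap of N resumptions. `SessionTicket`, `may_resume`, and `resume` are hypothetical names used only for illustration; none of this is taken from the paper or from a real TLS library.

```python
from dataclasses import dataclass


@dataclass
class SessionTicket:
    """Hypothetical record of a cached session kept by the device."""
    issued_at: float  # time the full handshake completed, in seconds
    uses: int = 0     # number of resumptions so far


def may_resume(ticket, now, max_age_s=3600.0, max_uses=None):
    """Return True if the cached session may be resumed once more."""
    if now - ticket.issued_at >= max_age_s:
        return False  # time limit reached: a full handshake is required
    if max_uses is not None and ticket.uses >= max_uses:
        return False  # count limit reached: a full handshake is required
    return True


def resume(ticket, now, **policy):
    """Resume if the policy allows it and count the use."""
    if not may_resume(ticket, now, **policy):
        return False
    ticket.uses += 1
    return True
```

With `max_uses=None` this reduces to the time-only policy; setting `max_uses` bounds how much traffic a single set of session keys can protect even for a very chatty device.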

nikorev commented 4 years ago

Reviewer: Niko Reveliotis Review Type: Comprehension

Problem Being Solved

LogSafe attempts to create an environment which would allow for easier cyber-forensic analysis of data attacks on IoT systems using a secure data logger. Many major services (such as AWS) don't disclose the details of their infrastructure for proprietary reasons; this paper discusses how evaluating their security systems is difficult. LogSafe is an additional cloud-based option for IoT developers to use to handle data logging/collection.

Three Questions

1) This paper heavily leverages Intel's Software Guard Extensions as one of its security measures. Unfortunately there were no critical review slots left, but I'd consider the limitation to Intel to be a critique. Later in the paper they also state that Linux does not have full support for SGX (making this limitation to SGX worse). Is there an equivalent to SGX for AMD? Recent benchmarks of consumer-grade processors have begun to tip the scale of performance towards AMD, and if users have to choose between speed and security for management of these data logging services I feel the speed will be selected more often than not.

2) This paper refers to the lack of a foolproof solution to a DDoS attack. They say they can alleviate the issue by "requiring attackers to compromise many more machines in order to disrupt the functionality of LogSafe". Is there any sort of model that can determine the number or percentage of machines that need to be taken offline to disrupt a system of linked nodes sharing computation? For example, say you have 10 machines that offload workloads to each other. How many of these machines need to be taken down before performance is significantly reduced? Does this vary by implementation of cloud-computing solutions? Or even by the location of the systems relative to each other?

3) Looking at the average performance time in figure 5, the cloud logger using SGX performs significantly worse than LogSafe. Going back to the lack of compatibility of SGX, are there other secure solutions on the processor level that have been examined? LogSafe seems to try and implement SGX in a more efficient way.
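As a rough starting point for question 2, the sketch below uses a deliberately simple capacity model: every node can carry the same load, and the load of a failed node is spread over the survivors. The function name and all numbers are assumptions for illustration and do not come from the paper; a real deployment would also depend on topology, placement, and how work is redistributed.

```python
def nodes_to_overload(n_nodes, capacity_per_node, total_load):
    """Smallest number of nodes to take offline so that the survivors
    can no longer carry total_load. Returns 0 if already overloaded.

    Assumes identical nodes and perfect load redistribution.
    """
    if capacity_per_node <= 0 or total_load <= 0:
        raise ValueError("capacity and load must be positive")
    # Minimum number of surviving nodes needed to carry the load
    # (integer ceiling division).
    needed = -(-total_load // capacity_per_node)
    return max(0, n_nodes - needed + 1)
```

For the 10-machine example in the question, with a hypothetical capacity of 100 requests per second per machine and a total load of 600, the model says 5 machines must go down before the rest are overloaded.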

samfrey99 commented 4 years ago

Reviewer: Sam Frey Review Type: Comprehension

Problem: IoT devices generate massive amounts of data about the personal lives of their users, but not all of these devices are designed with security in mind. Because IoT devices are prone to far more cyber-physical attacks than the average personal computer, it must be ensured that communication and storage of this data are managed in a safe, secure way. Loggers that satisfy this condition exist for those running their own servers, but not for those using a cloud-based or distributed infrastructure.

Contribution: The authors propose LogSafe as a solution to the problem presented above. LogSafe uses Intel's Software Guard Extensions to ensure that a cloud application storing user data can remain protected even if a hacker has access to a physical device communicating with the server.

Questions:

The paper very briefly mentions that SGX doesn't support I/O operations. If enough nodes received sufficient I/O requests to impact their performance, could this potentially compromise the system as a whole?
Because LogSafe is distributed, could moving LogSafe nodes to the Edge further improve the performance of data logging for cloud based systems?
Niko brought up the project's reliance on Intel's SGX, but the paper also only mentions testing logging performance on one machine (Dell Latitude 5480, a laptop). Could this have impacted the performance results? I am certainly not an expert in the area of distributed systems, but this seems limiting. They mention that 3 LogSafe nodes can handle 5000 devices. What about millions?

gkahl commented 4 years ago

Reviewer: Greg Kahl

Review Type: Comprehensive

Problem

Many IoT devices record large amounts of data about people's personal lives which, when analyzed, can reveal sensitive information about individuals, such as their address and the times that they are home or away from home. Although this information might not be explicitly recorded, by analyzing logs from a home IoT device unauthorized people can gain a lot of sensitive information about people.

Contribution

The paper proposes the design and implementation of LogSafe, a distributed and secure cloud-based logger that uses SGX to ensure confidentiality, integrity, and availability. By developing a secure logger for IoT devices, the logs of potentially sensitive information harvested from various IoT devices can remain secure from other devices or users. They use SGX enclaves to run code securely on untrusted hardware.

Questions

1 - How exactly do the SGX enclaves work? They briefly discussed how they use the EENTER and EEXIT commands, but I didn't understand how exactly this prevents untrusted users in possession of the hardware from accessing enclave memory.
2 - When establishing a connection between the IoT node and the Logger node, if there is already sealed device metadata, it is unsealed and put into system memory. Is this a security threat? Will other processes be able to read this unsealed metadata to gain access to the secure logger with the encryption key?
3 - Would a malicious node be able to disable the system by sending requests just to increase the monotonically increasing sequence number? Or maybe another malicious process on the untrusted hardware?

chandaweia commented 4 years ago

Reviewer: Cuidi Wei Review Type: Critique

Problem being solved

This paper addresses the problem of how to design and implement a distributed cloud-based logger for IoT devices using SGX. The logger must satisfy the CIA properties in the presence of eavesdropping, injection, and replay attacks.

Main contributions

This paper proposes a design and implementation of LogSafe, a cloud-based secure, scalable, and fault-tolerant logger that can accommodate a large number of IoT devices and a scalable snapshot algorithm to defend against attacks without compromising the logger’s performance. Moreover, this paper evaluates the performance of the logger implementation to test its scalability properties.

Critiques about the paper

How can the storage be kept efficient and its overhead kept under control? Permanent storage may be effective for authentication, but if there is too much data, it will incur much more overhead.
The safety of the encryption is doubtful because once the encryption method is leaked, the communication will no longer be secret.
What about the latency of LogSafe?

reesealanj commented 4 years ago

Reviewer: Reese Jones Review Type: Critical Review

Problem Being Solved:

This paper discusses the problems and obstacles facing the development of loggers for cataloging the storage of information generated by IoT Devices. The paper describes a method for implementing such a logger which supports the usage of Intel Software Guard Extensions and follows the guidelines of Confidentiality, Integrity and Availability.

Main Contributions:

The authors present a solution to the IoT device logger problem called LogSafe. They claim that LogSafe adheres to both SGX and CIA standards and is a faster, more reliable (fault-safe) way to efficiently and effectively log traffic across the Internet of Things.

Questions:

It is mentioned that around 5,000 devices can be serviced with 3 LogSafe nodes. Would that number scale linearly as you expand to larger and larger networks of devices?
Following the first question, is there a point wherein there is an overhead cost that makes the LogSafe implementation no longer the best option for logging? (Although if I understand correctly, LogSafe is the only practical option so this may be irrelevant)
Is there a prohibitive processing latency for logging all of the data and storing it in the cloud? I know that this seems to be the best way to perform this technique but does it have any sort of impact on the function of the individual IoT device and its functional performance?

Critiques:

In the system overview section the paper references a deployment for a number of devices in the billions range but as I said before the example deployment they listed was 3 LogSafe nodes to 5,000 devices.
Although the paper addresses that there could be a situation wherein a LogSafe node is DDoS'ed and loses connection to a number of devices I was not satisfied with their explanation of how they protect from having that point of failure.
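To put the gap between the 5,000-device deployment and the billions-of-devices claim in perspective, here is a back-of-the-envelope extrapolation. It assumes capacity scales perfectly linearly from the 3 nodes per 5,000 devices quoted above, which is exactly the assumption the first question asks about and which the paper is not confirmed to support. The function name is hypothetical.

```python
def nodes_needed(devices, devices_per_batch=5000, nodes_per_batch=3):
    """Linear extrapolation from '3 nodes serve 5,000 devices'.

    Integer ceiling division, so a partial batch still needs whole nodes.
    """
    return -(-devices * nodes_per_batch // devices_per_batch)
```

Under this assumption one million devices would need 600 nodes and one billion devices would need 600,000 nodes, so any super-linear coordination cost would matter a great deal at that scale.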

albero94 commented 4 years ago

Reviewer: Alvaro Albero Review Type: Critical

Problem being solved

With the increased use and penetration of IoT, the amount of private data collected is rapidly increasing. Many of the IoT devices available are generally not secured, and the data they work with is exposed to many different threats.

Main Contributions

The paper proposes LogSafe as a solution to this problem. Assuming that making the devices secure enough is not feasible, LogSafe is a logger system that makes it possible to perform forensic operations after an attack has occurred. LogSafe uses SGX trusted software for confidentiality, integrity, and availability. Because it is cloud based, it offers high scalability and fault tolerance without losing control of the system.

Questions

Is the manager a single point of failure? Could attacks focus on this device as it is considered a trusted device?
The scenarios where snapshots are created are limited due to the overhead of the monotonic counter. Is the number of snapshots created enough to maintain integrity, or does reducing them to avoid overhead limit the integrity of the system?
Do all IoT devices use a public and private key? Is this assigned from the manufacturer or the users of the devices?

Critiques

The system explained in this paper and the results obtained are very interesting, but I find the abstract misleading. One of the points they emphasize in the abstract, introduction, and conclusion is the ability to perform forensic analysis on attacks targeting the data. What I see afterwards is a really good explanation of why the system is secure and how it satisfies the CIA properties, but I do not see an explanation of those forensic capabilities, their benefits, or how attacks are detected and addressed.
Section 4G talks about how the system information is secured once it gets to the cloud, but what if the IoT devices have been compromised? The paper mentions how the data is protected against malicious nodes trying to access it, but it only gives one reference and no explanation of how an attack on an IoT device that can send corrupted information is detected and addressed. Again, this capability is highlighted in the abstract, introduction, and conclusion and not really explained throughout the paper.
LogSafe uses Hashchain, which takes an approach to integrity similar to the one Bitcoin and blockchain developed a few years later. However, the Hashchain reference paper admits that their solution is not complete and that some events could be hidden from the timelines. They mention that they plan to alleviate that problem by using an approach to the Byzantine Generals Problem. Bitcoin and blockchain were able to solve this problem, so maybe they could have used a more recent reference to tackle integrity.

bushidocodes commented 4 years ago

Reviewer: Sean McBride

Review Types: Critique

Problem:

Security logs traditionally play an important role in understanding what has recently occurred on network systems and performing cyber forensics after the fact to understand the nature of an attack. IOT is a fast growing area, but it has traditionally under-invested in security and now is a large attack target. Additionally, there is little visibility into this area because of poor logging, so there is little understanding of the nature of these attacks and how IOT might act as the soft underbelly of larger attacks.

Contribution

The authors create LogSafe, a logging system that can run on top of untrusted cloud computing infrastructure, by using SGX. It uses a hashchain and only uses the Intel Monotonic Counter in contexts where the performance penalty isn't too expensive (Tracker versus Logger).

Questions

In Figure 4, why does performance seem to improve as more IOT devices push logs? This seems opposite. Additionally, wouldn't performance have to degrade at some point under heavy load?
Why is it that Windows can interact with physical monotonic counters, but Linux cannot? Has this changed since publication? Why are physical monotonic counters so slow? Is this something that might improve in the future?
How does a hashchain work?
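As a generic answer to the last question, here is a minimal sketch of a hash chain, in which each digest covers the current log entry and the previous digest. This illustrates the general technique only, not LogSafe's exact construction; the all-zero `GENESIS` value and the function names are assumptions.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed fixed starting value for the chain


def link(prev_digest, entry):
    """Digest of one entry, bound to everything that came before it."""
    return hashlib.sha256(prev_digest + entry).digest()


def build_chain(entries):
    """Return the list of chained digests, one per log entry."""
    digests = []
    prev = GENESIS
    for entry in entries:
        prev = link(prev, entry)
        digests.append(prev)
    return digests


def verify_chain(entries, digests):
    """Recompute the chain and compare it against the stored digests."""
    if len(entries) != len(digests):
        return False
    prev = GENESIS
    for entry, digest in zip(entries, digests):
        prev = link(prev, entry)
        if prev != digest:
            return False
    return True
```

Changing, reordering, or removing an entry in the middle changes every later digest, so it is enough to protect the most recent digest in a trusted place. An attacker who can rewrite all stored digests and that final value would not be caught by the chain alone.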

Critiques

The paper seems very disingenuous. LogSafe appears to be a general-purpose logging service designed to run in an untrusted cloud. This is the primary value add, and there is no obvious connection between the system architecture and IOT. I strongly suspect the researchers built the system and then rebranded it as IOT for grant or publication purposes. The authors handwave around fitness trackers and other such devices, but they assume Intel Edison devices running Linux (or better yet, Windows IOT) plugged into a wall. I doubt this scales down to real-world IOT devices. The evaluation then seems surprised by the poor performance of their system on even the Intel Edison, a relatively beefy IOT device. This smells like reuse of an existing codebase with little to no systems thinking for the domain.
It is also unclear to me what the situations are where a customer would use untrusted cloud services. Per NIST, cloud is a shared-responsibility model built on trust. Why use cloud at all then if you are unwilling to accept this? Additionally, the architecture of having a Manager outside of the cloud in an Equinix colo data center with a DirectConnect to the cloud is EXTREMELY EXPENSIVE, especially accounting for the fact that CSPs bill you for all outgoing network traffic. This approach also seems to make the Manager a large single point of failure.
The use of a DHT algorithm like Chord seems odd if only studying 1..3 nodes. Distributed algorithms like this are usually highly dynamic and scale up really high. I would have expected the evaluation to study throughput while scaling up the logging nodes to 50 nodes+. Additionally, when a chord node joins, ownership for parts of the namespace would change. I would assume that this would force the IOT device to perform another complex handshake. This should have been considered in the eval section.
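The point about ownership changing on a join can be illustrated with a simplified Chord-style ring, where each key is owned by its successor, the first node at or after the key going clockwise. This sketch shows ownership only and leaves out finger tables, stabilization, and everything else real Chord does; the 16-bit ring size and the function names are assumptions.

```python
import bisect
import hashlib

RING_BITS = 16  # small ring for readability; an assumption


def ring_id(name):
    """Map a device or node name onto the ring."""
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % (1 << RING_BITS)


def owner(node_ids, key_id):
    """Successor rule: first node id >= key_id, wrapping around the ring."""
    ids = sorted(node_ids)
    i = bisect.bisect_left(ids, key_id)
    return ids[i % len(ids)]


def moved_keys(key_ids, old_nodes, new_node):
    """Keys whose owner changes when new_node joins the ring."""
    new_nodes = list(old_nodes) + [new_node]
    return [k for k in key_ids if owner(old_nodes, k) != owner(new_nodes, k)]
```

With nodes at 100, 200, and 300 and a new node joining at 130, only keys in the range 101 to 130 change owner. Under this model, only devices hashed into that range would need to renegotiate with a new logger node, which is the handshake cost the critique suggests the evaluation should have measured.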

zacharied commented 4 years ago

Reviewer: Zach Day Review type: Comprehension

Problem

In the IoT sphere, attacks on personal data have become more and more frequent as the number of devices has increased. While preventing the exposure of and tampering with data remains an active research field, there has been significantly less work done on storing data in logs for post-mortem analysis of attacks.

Contribution

LogSafe proposes a secure, reliable, verifiable system for IoT devices to record connection statistics and other network heuristics. They utilize Intel's SGX, a new set of instructions that enable creating a confidential & trusted environment within the system, protecting applications from tampering even if the attacker has physical access to the machine. LogSafe provides a number of reliability features that ensure logs are up to date and do not get overwritten by backups desyncing from the main log, guarding against replay attacks.

Questions

The authors do not mention the programming overhead that would be incurred for applications developed to use LogSafe. I assume special considerations need to be taken to make the code on the IoT device communicate with LogSafe nodes, so I'd like to see the workload of that on programmers.
The author mentions that I/O is impossible from within an SGX enclave. The paper claims that LogSafe is secure even when attackers have access to host nodes. My concern is with how these two factors interact with each other -- even if the hashchaining algorithm allows for verification of data integrity, couldn't an attacker just interfere with the output of the log to disk, preventing the log from being written at all?
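One generic way to at least detect the dropped writes raised in the second question is to number entries and compare what is on disk against a count kept somewhere the attacker cannot roll back, such as a trusted monotonic counter. The sketch below assumes such a count is available and is not a description of how LogSafe itself handles this case; the function name is hypothetical.

```python
def missing_sequence_numbers(stored_seqs, trusted_count):
    """Return the sequence numbers in 1..trusted_count with no stored entry.

    stored_seqs: sequence numbers found in the log on untrusted storage.
    trusted_count: number of entries the trusted side knows it emitted.
    """
    present = set(stored_seqs)
    return [s for s in range(1, trusted_count + 1) if s not in present]
```

This detects withheld or truncated entries after the fact but cannot prevent them: a host that controls the disk can always refuse to write.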

s-hanna15 commented 4 years ago

Reviewer: Sam Hanna Review Type: Critical

Problem Being Solved: This paper talked about LogSafe, a secure logging software to log data for audits and forensics in case of vulnerabilities. The number of IoT devices is increasing and they are not being built for security, so they are bound to be exploited. LogSafe is designed with the key CIA (Confidentiality, Integrity, Availability) security triad in mind to find devices that have been exploited and help with forensics.

Important Areas: LogSafe focuses on providing a safe and scalable logger for forensic uses. They do this by utilizing the SGX architecture and the cloud. They focus on security and ensure that there is confidentiality and integrity throughout, with encryption and hashing schemes in place. The logger uses multiple nodes to decentralize the system, which provides some assurance against availability attacks and lets several nodes serve devices at the same time.

Questions:

What would LogSafe cost for the IoT devices using them? Would it really be feasible for a lot of IoT devices to be using this? What incentive do they have to pay the cost of a logger, when they don’t have any incentive to provide security as is?
Does the system actually provide anything for forensics, or does it just log and store all of the data?
Are there any privacy concerns with using LogSafe? If an IoT device used this software, is it possible that LogSafe would share the stored data with investigators or others without the consent of the device, per the third-party doctrine?

Critiques:

They generally do a good job talking about security, and they do describe basically how the SGX works, but I think it is still kind of hard to get a solid understanding of what it is and how it works within the system. I also think the differences between the figure 1 system description and the figure 2 system description are vast and confusing.
They talk in-depth about security and scalability and those are both vital features. They don’t talk about the uses of this nearly enough, I am still unsure what type of devices would be using this. I think talking about the use cases would help to round out the paper and give a better high-level overview of what it can do.
They talk about how it is impossible to make all devices secure, so they should log the data instead. They do a good job of thinking through security concerns, and it is true that attackers will always find a way in. To me, though, having all the data in one place seems like a security risk: if there is one vulnerability in some part, it could lead to all of the devices using it being compromised.

Others commented 4 years ago

Reviewer: Gregor Peach

Review Type: Critical

Problem

There are a multitude of IoT devices running on various networks. These devices tend not to be the most secure things. In order to provide better security, it'd be good to have accurate logs so we can analyze attacks after the fact.

Contribution

They suggest LogSafe, a system for logging. It's cool because it provides the following three properties:
1) Confidentiality: Unauthorized access to the logs is impossible
2) Integrity: The logs cannot be modified or deleted by an unauthorized individual
3) Availability: The logs are always available for introspection by authorized individuals

This is especially impressive, since this is all running on the cloud -- and they assume that that system is not fully trusted either.

Questions

1) Does hashchain = blockchain?
2) Can multiple tenants securely share the LogSafe infrastructure?
3) Is the integration of "blockchain" and "SGX" more generally useful?

Critique

1) They mention side channel attacks on SGX in brief -- how serious is this in a Spectre world? Feels like this is a flaw in their security model (maybe not at time of writing)
2) This seems like a super complex infrastructure, and a solution in search of a problem. In today's world, can't most applications just send their logs to a trusted third party? (Who then just has a simple authentication system?)
3) Maybe I'm missing something, but couldn't you just put a public key on each device, then just encrypt and send logs to some sort of reliable block storage? (assuming you even need to run your own log infrastructure)

rachellkm commented 4 years ago

@huachuan, Huachuan Wang, Critical: Exactly how large is the overhead of Device Negotiate?
@nikorev, Niko Reveliotis, Comprehension: Is there an equivalent to SGX for AMD?
@nikorev, Niko Reveliotis, Comprehension: Is there a model that can determine the percentage of machines needed to be taken offline to disrupt a system of linked nodes sharing computation?
@samfrey99, Sam Frey, Comprehension: If enough nodes received sufficient I/O requests, could this potentially compromise the system as a whole?
@samfrey99, Sam Frey, Comprehension: Could moving LogSafe nodes to the Edge further improve the performance of data logging for cloud based systems?
@gkahl, Greg Kahl, Comprehension: How exactly do the SGX enclaves work?
@gkahl, Greg Kahl, Comprehension: If there is already sealed device meta data, it is unsealed and put into system memory. Will other processes be able to read this unsealed meta data?
@chandaweia, Cuidi Wei, Critical: Is there a method to efficiently store logged data?
@reesealanj, Reese Jones, Critical: Is there a prohibitive processing latency for logging all of the data and storing it in the cloud?
@albero94, Alvaro Albero, Critical: Is the manager a single point of failure?
@albero94, Alvaro Albero, Critical: Is the number of snapshots created enough to maintain integrity?
@bushidocodes, Sean McBride, Critical: In Figure 4, why does performance seem to improve as more IoT devices push logs?
@bushidocodes, Sean McBride, Critical: Why is it that Windows can interact with physical monotonic counters, but Linux cannot?
@bushidocodes, Sean McBride, Critical: How does a hashchain work?
@zacharied, Zach Day, Comprehension: Even if the hashchaining algorithm allows for verification of data integrity, couldn't an attacker interfere with the output of the log to disk?
@s-hanna15, Sam Hanna, Critical: Does the system actually provide anything for forensics, or does it just log and store all of the data?
@Others, Gregor Peach, Critical: Can multiple tenants securely share the LogSafe infrastructure?

Shared concerns/questions:

How will LogSafe handle abundant I/O requests and how does this affect overall system performance?
The paper claims that 3 logger nodes can service 5000 devices. How well would this scale?
People mentioned the lack of tools provided directly by LogSafe to perform forensic analysis or lack of an explanation of how LogSafe would handle detecting faulty data sent by a malicious device.