SWI-Prolog / roadmap

Discuss future development
21 stars 3 forks source link

Spectre attacks mitigation #54

Closed triska closed 4 years ago

triska commented 6 years ago

Spectre attacks exploit speculative execution to leak secrets via side-channels.

These attacks are a serious security concern for SWISH, since untrusted code may gain knowledge about secret values (example: private keys) of the running process.

In particular, what is also known as Spectre "Variant 1" potentially affects all applications in which sensitive data may be speculatively accessed, especially those that also run untrusted code.

There are many indirect branches and indirect jumps in the SWI-Prolog source code that can be trained to access secret information upon speculative execution. Observing the cache state by timing certain operations may leak the values that have been tentatively loaded into the cache.

One way to mitigate such attacks is to disable speculative execution. See for example the lfence instruction.

An ad hoc and flawed countermeasure is to make high-resolution timers that can be accessed from untrusted code more coarse, or adding jitter to them. This is flawed because there may be other ways to gain accurate timings, and also because stochastic resonance may still allow conclusions to be drawn in the midst of noise.

See also https://github.com/v8/v8/wiki/Untrusted-code-mitigations.

This is related to https://github.com/SWI-Prolog/packages-http/issues/98.

JanWielemaker commented 6 years ago

If I understand the V8 link correctly (and also my (not very detailed) mental model), Spectre does not affect process isolation. Meltdown does, but is handled at OS level. This would mean we are only vulnerable to data leaking between threads, right? Next is the question how serious this is for (SWI-)Prolog? I doubt there is speculative execution accessing the stacks of other threads as there are no pointers from the shared program area to any of the stacks and no pointers from the thread local data to the stacks of other threads. It should be possible to snoop in the program space and examine clauses that contain sensitive information. Clauses are not at fixed addresses though and you need quite a few steps to find them (get the global module table, find the module, get the module predicate table, find the predicate, follow the clause list). That is a fair number of steps where you need to find a speculative execution path that you can realistically time for each of these involved addresses. Is that realistic? Do I miss something that could lead to a more direct attack?

Reducing time resolution for SWISH would be fairly simple and as SWISH only has access to white listed predicates it should not be possible to work around that unless you break the SWISH sandbox. In that case you can access any predicate you want way easier ...

triska commented 6 years ago

It should be possible to snoop in the program space and examine clauses that contain sensitive information.

I think this is the most critical issue. Sample use case: Read bytes of the server's private key, which resides in the same process.

If an attacker finds a suitable gadget in the core engine, essentially any part of the process's memory can be read. For example, as one tiny building block of a potential attack, consider the following fragment:

https://github.com/SWI-Prolog/swipl-devel/blob/5acbe770402197c1f4e2da55bc7eeff6bc18ea8c/src/pl-prims.c#L2216

If a branch predictor learns that idx is typically smaller than or equal to arity, then the next instructions may be speculatively executed. From this, you may force any memory location (or rather: cache line) of the current process to be cached:

  1. use a sequence of arg/3 calls where the branch predictor is trained to execute the desired branch
  2. finally, use a single arg/3 invocation where the index is out of bounds.

This already leads to a micro-architectural side effect which could lead to measurable timing differences of further operations.

This is only from a cursory glance at the source code, and this particular instance may not even be exploitable. However, there are many similar situations in the source code where attacker-controlled data may lead to measurable micro-architectural side effects if the code is speculatively executed.

Reducing timing resolution is an ad hoc measure that does not count as a true solution. For example, in the JavaScript-based Spectre attack, a simple worker thread that incremented an integer provided sufficiently high resolution to perform the attack.

As one mitigation of the particular case above, you can use serialization instructions to prevent speculative execution of the branch before the condition is fully evaluated.

JanWielemaker commented 6 years ago

Good catch. It is kind of unlikely we are able to clean the entire code base from cases like this though. I guess the timer may be ad-hoc, but is otherwise the most realistic countermeasure. Note that SWISH doesn't allow users to run additional threads. In the long run I'd hope there will be automated techniques to either fully mitigate this or at least point out potential attack points. If not, I fear that the idea of keeping secrets inside a process is doomed ...

triska commented 6 years ago

A new article from June 2018 outlines a few adapted attacks:

Overcoming (some) Spectre browser mitigations

Conclusions:

This research shows that while the timing mitigations implemented in different browsers are effective at dramatically slowing down Spectre-like attacks, they are not effective at preventing them. This means that more robust solutions are required, such as site-isolation and index masking. These timing mitigations are hurting performance and functionality for some web applications, and taking into account their limited effectiveness, reverting them should be considered.