google / safeside

Understand and mitigate software-observable side-channels
BSD 3-Clause "New" or "Revised" License
498 stars 54 forks source link

Renamed file and updated naming scheme to also include names used in academia. #12

Open cc0x1f opened 5 years ago

cc0x1f commented 5 years ago

Renamed spectre_v1_indirect_jumps.cc to spectre_v2.cc as this is actually an implementation of Spectre-V2/Spectre-BTB, not Spectre-V1/Spectre-PHT. This can be verified by adding an lfence before line 137. If it was indeed a Spectre-PHT attack, speculation would end here but instead the data is still leaked. By adding the lfence before line 51 we can actually prevent leakage. Therefore, we mistrain the Branch Target Buffer, making it a v2/BTB attack.

Also updated the names used in the demos/README.md file to also include the names used by academia. This change is mostly to keep names across industry and academia consistent.

googlebot commented 5 years ago

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

:memo: Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

cc0x1f commented 5 years ago

@googlebot I signed it!

googlebot commented 5 years ago

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

cc0x1f commented 5 years ago

@googlebot I fixed it.

googlebot commented 5 years ago

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

chandlerc commented 5 years ago

I would need to re-read the paper to discuss what BTB vs PHT mean, but I want to specifically mention why I believe that the current demo should be classified with variant 1 and not variant 2 of Spectre.

Variant 2 of Spectre has an important distinct property from variant 1: it involves injecting a target into the predictor that no correct execution of a branch could possibly reach. This is extremely different from simply training the predictor to go to one valid target, and then leveraging this trained target to trigger a misprediction.

The original naming from Intel also specifically highlights this: Branch Target Injection (BTI).

This is an extremely important distinction for a number of reasons:

1) It is impossible to conclude through static analysis of the program whether or not there is a variant 2 findable leak gadget because the injected target may be to an instruction stream not in the program source (similar to ROP-gadgets). 2) Mitigating variant 2 in software requires blocking misspeculation of the branch (retpoline, even LFENCE isn't perfect due to a race). Whereas mitigating a mistrained (but not injected) target for an indirect branch can be done in many other ways (and in fact SLH implements one of these other mitigations). 3) The Intel microcode mitigations for variant 2 only mitigate against injection, not against mistraining.

This is why I think it is really important to classify this as variant 1. Hope that explanation helps.

misc0110 commented 5 years ago

After checking the code, I have to agree with @cc0x1f here. This is an instance of Spectre-BTB (aka Spectre variant 2). I disagree with your distinction. The distinction is based on which predictor is exploited (see [1], [2], [3]). Spectre-PHT (aka Spectre variant 1) targets the Pattern History Table which predicts the outcome of a conditional branch [2]. Spectre-BTB targets the Branch Target Buffer which predicts the location of an indirect call or jump. Similarly, Spectre-STL (aka Spectre variant 4) targets the memory disambiguation for predicting Store To Load (STL) data dependencies, and Spectre-RSB (aka Spectre variant 5) targets the Return Stack Buffer predicting the return address.

In all Spectre variants, the branch predictor is manipulated (which you call injecting). For both Spectre-PHT and Spectre-BTB, there are 4 different variants of injecting a different prediction (see [2], Figure 3). In this specific case here, the BTB is manipulated in-place in the same address space (specifically Spectre-BTB-SA-IP in the academic naming convention). This version of Spectre-BTB is especially nice, as it is so easy to understand and reproduce as PoC. This is why we always use it in slides to illustrate Spectre-BTB (e.g., [4], slide 26 (page 184)), and we also have it in our repository ([5]).

Regarding your reasons:

  1. I agree that Spectre-BTB is much more powerful in the sense that you can use it ROP style. And in this PoC, you can also do that. The PoC contains an indirect jump/call which can also be influenced from a different application, jumping to an arbitrary destination in the PoC.
  2. In-place (what you call mistrained) and out-of-place (what you call injected) require different mitigations. Still, the underlying mechanism is the same, only the way to exploit it differs, and the current mitigations mainly prevent the exploitation, not the underlying mechanism.
  3. That is not entirely correct. They only prevent injection from a different privilege level. One user-space application can still manipulate the branch prediction for a different user-space application. It also works inside the same application (think of JS sandboxes), this is not prevented by the microcode updates. Preventing in-place exploitation is a lot more difficult.

To conclude, this PoC is, as @cc0x1f said, Spectre-BTB not Spectre-PHT.

[1] https://spectreattack.com/spectre.pdf [2] https://misc0110.net/web/files/transient_execution_attacks.pdf [3] https://msrc-blog.microsoft.com/2018/03/15/mitigating-speculative-execution-side-channel-hardware-vulnerabilities/ [4] https://misc0110.net/web/files/beyond_belief.pdf [5] https://github.com/IAIK/transientfail/tree/master/pocs/spectre/BTB/sa_ip

chandlerc commented 5 years ago

After checking the code, I have to agree with @cc0x1f here. This is an instance of Spectre-BTB (aka Spectre variant 2).

Spectre-BTB is much more broad than "Spectre variant 2" as it is used pervasively in the rest of the industry. "variant 2" is a subset of "Spectre-BTB" which is all I've been suggesting. I'm not saying anything about "Spectre-PHT" here...

And jumping a head a bit:

To conclude, this PoC is, as @cc0x1f said, Spectre-BTB not Spectre-PHT.

I'm not arguing about this classification as I've tried to say several times. I think in this regard, we are talking past each other a bit.

My firm position from working with all of the HW vendors and other parties as part of the cross-industry mitigation effort is that "variant 2" refers very specifically to the core issue of CVE-2017-5715 which Intel describes as "Branch Target Injection". The injection combined with the BTB are the two critical components for something to be "variant 2". Using the academic classification you have proposed, this is Spectre-BTB-CA-OP.

I do not think that "variant 2" should be used to refer to any other issues. I am very happy to provide a mapping from the classification you use in the research paper, but I is essential that "variant 2" only refer to the injection into the BTB.

The reason why this is so crucial is that there are mitigation techniques that are specifically documented as "mitigating variant 2" which only mitigate the injection into the BTB. They do not mitigate the issue in this demo! We cannot call this "variant 2" without creating serious and harmful confusion among those deploying mitigations. And we cannot go change that documentation to use a different classification scheme.

The specific mitigation strategy I refer to here is the use of IBRS and/or STIBP in conjunction with operating system changes to prevent cross-address space BTB injection. We have strong reasons to believe this is the most critical aspect to mitigate, and we have strong mitigations widely available and well documented as handling "variant 2".

I hope this makes clear why I am insisting on this narrow definition of the term "variant 2".

I want to again note that I'm not disagreeing that this demo is classified in a related space exactly as you indicate:

In this specific case here, the BTB is manipulated in-place in the same address space (specifically Spectre-BTB-SA-IP in the academic naming convention).

I 100% agree with this. I simply disagree that we can call this "variant 2". That will create active and harmful confusion as it is not mitigated by STIBP or similar hardware techniques.

I also wanted to respond to some of the other comments here:

Regarding your reasons:

  1. I agree that Spectre-BTB is much more powerful in the sense that you can use it ROP style. And in this PoC, you can also do that. The PoC contains an indirect jump/call which can also be influenced from a different application, jumping to an arbitrary destination in the PoC.

The demo isn't about what unrelated vulnerabilities might exist, it is about which specific one is demonstrated. The existence of an indirect branch doesn't make this a meaningful demo of any other variation on Spectre-BTB, it is just the one you identified: Spectre-BTB-SA-IP. If we want to demo the others, we should craft narrow and specific demos of them.

  1. In-place (what you call mistrained) and out-of-place (what you call injected) require different mitigations. Still, the underlying mechanism is the same, only the way to exploit it differs, and the current mitigations mainly prevent the exploitation, not the underlying mechanism.

Yes, exactly. But see above: the mitigations are documented to cover "variant 2", and due their narrow nature, it is essential we don't put other things into that set.

  1. That is not entirely correct. They only prevent injection from a different privilege level. One user-space application can still manipulate the branch prediction for a different user-space application. It also works inside the same application (think of JS sandboxes), this is not prevented by the microcode updates.

This is not accurate.

First, the mitigations I'm referring to are not a single flag, but a set of changes to HW and system software. It is the conjunction of system software changes and availability of IBPB/IBRS/STIBP (in different combinations based on HW generation) for Intel parts for example.

Second, while some have elected to not fully deploy these mitigations for user-space applications due to the severe performance impact on older HW parts, newer HW parts from Intel and other vendors do not have that problem. There, even user-space applications have "variant 2" (as I described it above) fully mitigated when combined with the system software changes.

Preventing in-place exploitation is a lot more difficult.

Agreed. This is specifically why I do not want it classified in any way as "variant 2" which is widely believed to be thoroughly and easily mitigated.

dgruss commented 5 years ago

I think the main argument was that this is not Spectre v1, which is clearly described as "bounds check bypass". Also see https://software.intel.com/security-software-guidance/insights/deep-dive-analyzing-potential-bounds-check-bypass-vulnerabilities Spectre v1 is clearly only Spectre-PHT. Also see https://software.intel.com/security-software-guidance/api-app/sites/default/files/336983-Intel-Analysis-of-Speculative-Execution-Side-Channels-White-Paper.pdf

So, calling anything that is not Spectre-PHT "Spectre v1" is misleading.

Better, because the v1,v2,... naming scheme is unclear to many, we should altogether avoid it.

chandlerc commented 5 years ago

I think the main argument was that this is not Spectre v1, which is clearly described as "bounds check bypass". Also see https://software.intel.com/security-software-guidance/insights/deep-dive-analyzing-potential-bounds-check-bypass-vulnerabilities

The original discussions with Intel and everyone else around this made it abundantly clear it wasn't just about bounds checks. No one is arguing "bounds check bypass" is good terminology.

I don't think "variant 1" vs. "Spectre-PHT" is more or less clear. I don't find "PHT" especially useful (there are too many forms of branch predictor for me to like naming after each one of them) and so I somewhat prefer the perhaps overly generic "variant 1".

Spectre v1 is clearly only Spectre-PHT.

Only on some microarchitectures which use a pattern history table? There are other branch predictors that are still susceptible to the same core idea...

Also see https://software.intel.com/security-software-guidance/api-app/sites/default/files/336983-Intel-Analysis-of-Speculative-Execution-Side-Channels-White-Paper.pdf

So, calling anything that is not Spectre-PHT "Spectre v1" is misleading.

Some find "variant 1" misleading. Some find the "PHT" taxonomy misleading....

Better, because the v1,v2,... naming scheme is unclear to many, we should altogether avoid it.

I don't think we will find a perfect naming scheme.

But I think we might simply have a nice table that tracks all of the variations here and explains how they relate to each other.

dgruss commented 5 years ago

The original discussions with Intel and everyone else around this made it abundantly clear it wasn't just about bounds checks. No one is arguing "bounds check bypass" is good terminology.

But that's what variant 1 was about.

I don't think "variant 1" vs. "Spectre-PHT" is more or less clear. I don't find "PHT" especially useful (there are too many forms of branch predictor for me to like naming after each one of them) and so I somewhat prefer the perhaps overly generic "variant 1".

But these branch predictors might need different kinds of mitigations.

Only on some microarchitectures which use a pattern history table? There are other branch predictors that are still susceptible to the same core idea...

Yes, other architectures might have a microarchitectural element with a different name serving the same purpose.

Better, because the v1,v2,... naming scheme is unclear to many, we should altogether avoid it.

I don't think we will find a perfect naming scheme.

But I think we might simply have a nice table that tracks all of the variations here and explains how they relate to each other.

My impression is that what you actually want is something like this: Screenshot_2019-09-10_10-25-34 You care about the in-place same-address-space variants, regardless which branch predictor that is (as shown on the slide). Neither v1 nor v2 express that, both mean other things. The only naming scheme (that I'm aware of) that allows to make this distinction is the one you call the academic one (though there is no alternative non-academic one), and there it would be Spectre-*-SA-IP.

ssbr commented 5 years ago

Hey sorry about the huge delay getting back to you, it was a really busy time of year. (Cppcon, performance review, ...) I wanted to sit on this reply to give others a chance to respond, but they didn't and I feel bad that I've left you hanging.

I really appreciate this issue being raised, since I also was super confused about the exact boundaries being drawn around Spectre v1 vs v2, and had a couple iterations of this kind of discussion in private without gaining as much understanding of the issue as this thread has brought me. So, thank you!

But that's what variant 1 was about.

This seems like a major part of the disagreement worth elaborating on. Calling it "bounds check bypass" is IMO definitely overspecifying it, and that term/concept is worth abandoning forever. The specific code pattern of a direct branch based on a comparison of index against array length, as opposed to any other direct branch, does not change how the vulnerability works or how it can be removed. The same vulnerability/mitigations will work whenever a memory address can be made to point to a secret by mistraining the direct branch predictor. (Much of the discussion here revolves around indirect branches, but at least direct branches are all the same microarchitecturally, right?) SLH works on more than just bounds checks, for example!

(We get into messier territory if we want to talk about predicted branches that lead to a secret, but not via a speculative buffer overflow. IDK, what do you call that?)

As an example, many months ago I modified our spectre v1 example in a local copy to instead never use (or, well, rely on the use of) an out-of-range index, not even speculatively. Instead I wanted to execute a speculative bypass of the condition below, the "short string optimization":

class MyShortStringOptimization {
  size_t size_;
  union Data {char* p; char s[sizeof(p)];};
  Data data_;

 public:
  size_t size() const { return size_; }
  const char& operator[](size_t n) const {
    if (size_ >= sizeof(data_.s)) {
      // is this so big we need to store in the heap instead?
      return data_.p[n];
    }
    return data_.s[n];
  }
};

It worked perfectly, and was prevented perfectly by SLH!

In this example, we can train the branch predictor to think that the string is short, and then pass it a string (and index) that is larger than sizeof(s), and get a speculative buffer overflow. This is true even if the index is always within bounds. This shows that from a certain point of view, it doesn't matter what the condition is, what matters is that the address is user-controlled so that they can manipulate it to point to the secret that they want to extract -- and they can do this without ever doing a bounds check bypass. SLH works on that generalized basis, and it's that generalized problem that we are interested in addressing.

But these branch predictors might need different kinds of mitigations.

This is a good point. I hope we do that as little as possible, though.

Even though indirect and direct branches are different on the CPU, they are not different in the C++ programming language (or any other high level programming language). Any if statement can become an indirect branch, and any virtual function call can become a direct branch -- not just the direct/indirect branches that we expect, respectively. As a result, we likely don't want to have separate mitigations for the two cases, even if we do need separate test cases. (Which are necessarily flawed for the same reason, in some way we might need to fix.)

For what it's worth, I think that the original naming scheme had a not-unreasonable basis, and mapped cleanly to our intent. The new naming scheme doesn't completely capture the right conceptual boundaries, and also is a lot of opaque letters. Also, it will be hecking merge conflicts. So I am -1 on renaming the file unless we have a really solid name everyone can agree is perfect.

The README change I am more sympathetic to, but I think we need a dedicated index rather than putting it in the README. There's too many overlapping names and this would get big, fast.

dgruss commented 4 years ago

I'm aware that SLH works on more than just direct branches (depending on the specific case). But a systematization of attacks should not be tailored to a specific defense, right? Otherwise we could tailor a systematization also to retpoline and it might not look more useful then either.

I agree that it doesn't make sense to group it by the bounds check bypass, but that's also not what we propose. But I would argue that it would be difficult to find a proper definition for Spectre v1 and Spectre v2 that captures all the things you want to mitigate under the umbrella term Spectre v1 and Spectre v2 respectively.

The same vulnerability/mitigations will work whenever a memory address can be made to point to a secret by mistraining the direct branch predictor. There are direct-branch Spectre gadgets that can be exploited that don't fulfill that property. For instance, leaking data from registers. Leaking data from a static memory address that cannot be influenced by the attacker.

And I agree, branches are not always directly exposed to the programming language. But the systematization of attacks should not be based on programming languages as that would mean that we need different systematization for every programming language!? The systematization on predictor and mistraining location is better as it is oblivious to the programming language and specific attack examples.