riscv / riscv-cheri

This repository contains the CHERI extension specification, adding hardware capabilities to RISC-V ISA to enable fine-grained memory protection and scalable compartmentalization.
https://jira.riscv.org/browse/RVG-148
Creative Commons Attribution 4.0 International
47 stars 28 forks source link

clarify malformed cap handling #285

Closed tariqkurd-repo closed 3 months ago

tariqkurd-repo commented 3 months ago

More clarification on malformed capability handling, following the discussion in https://github.com/riscv/riscv-cheri/pull/279

cmv, lc, sc, amoswap.c propagate malformed caps caddi, cadd, scaddr, scbnds* de-tag gchi, gcperm ignore gcbase, gclen return 0

also if the permission can't be created by ACPERM then clear all permissions, which affect load/store/cbo/amo dereferencing and also the target capability for JALR

jamie-melling commented 3 months ago

Will we change SCSS to return false if either cs1 or cs2 is malformed? Also, I think we need to include a malformed check on cs1 for CBLD as then you could use that to get infinite bounds.

tariqkurd-repo commented 3 months ago

Will we change SCSS to return false if either cs1 or cs2 is malformed? Also, I think we need to include a malformed check on cs1 for CBLD as then you could use that to get infinite bounds.

yes - I've fixed both of these

jamie-melling commented 3 months ago

Did we also mention that if you try to dereference a capability with malformed bounds, it would give a length exception? I think it is implied by the sentence that malformed bounds are treated as 0, but it would be good to have it explicitly stated.

tariqkurd-repo commented 3 months ago

Did we also mention that if you try to dereference a capability with malformed bounds, it would give a length exception? I think it is implied by the sentence that malformed bounds are treated as 0, but it would be good to have it explicitly stated.

I've added that statement into all load/stores/atomics

PRugg-Cap commented 3 months ago

The other thing we haven't talked about is CSRs! May require some thought.

tariqkurd-repo commented 3 months ago

The other thing we haven't talked about is CSRs! May require some thought.

so - what happens if e.g. Xtvec is malformed and we take an exception? maybe we can say that we detag on CSR write if malformed, reserved bits are set or permissions are bad

PRugg-Cap commented 3 months ago

The other thing we haven't talked about is CSRs! May require some thought.

so - what happens if e.g. Xtvec is malformed and we take an exception? maybe we can say that we detag on CSR write if malformed, reserved bits are set or permissions are bad

That would be my instinct... Presumably it's not too expensive, and should avoid us having to think about lots of confusing cases.

tariqkurd-repo commented 3 months ago

The other thing we haven't talked about is CSRs! May require some thought.

so - what happens if e.g. Xtvec is malformed and we take an exception? maybe we can say that we detag on CSR write if malformed, reserved bits are set or permissions are bad

That would be my instinct... Presumably it's not too expensive, and should avoid us having to think about lots of confusing cases.

The other thing we haven't talked about is CSRs! May require some thought.

so - what happens if e.g. Xtvec is malformed and we take an exception? maybe we can say that we detag on CSR write if malformed, reserved bits are set or permissions are bad

That would be my instinct... Presumably it's not too expensive, and should avoid us having to think about lots of confusing cases.

The other thing we haven't talked about is CSRs! May require some thought.

so - what happens if e.g. Xtvec is malformed and we take an exception? maybe we can say that we detag on CSR write if malformed, reserved bits are set or permissions are bad

That would be my instinct... Presumably it's not too expensive, and should avoid us having to think about lots of confusing cases.

done - try this!

PRugg-Cap commented 3 months ago

Looks good now: certainly clears up any ambiguity! I see you're going for checking the reserved bits are clear, which is fine but I don't think it's necessary. I guess it's cheap at least.

jrtc27 commented 3 months ago

Why are we putting so much effort into specifying what happens if the impossible happens? If you observe memory corruption, all bets are off anyway, your system is no longer working. Specify that it should be impossible to create a malformed cap; if you then observe one due to an implementation or vendor extension bug, it’s in violation of the spec, and if you observe one due to a random uncorrected bitflip then your system has failed. I am concerned that by putting all kinds of specifications on what happens when the impossible happens we’re complicating implementations by adding logic to deal with cases that are, in the formal specification, unreachable. As an example, RISC-V has all the WARL fields in CSRs, and doesn’t go on to say what happens if a bit flip turns the field into an illegal value. It just assumes that won’t happen, and leaves it completely unspecified if it does.

PRugg-Cap commented 3 months ago

@jrtc27 (See discussion here). I agree we could probably have left it unspecified, though a bitflip or reading uninitialised memory seem significantly more likely than a CSR bit flip. I don't think anywhere else is there an invariant the spec tries to maintain over all of memory.

cases that are, in the formal specification, unreachable.

This does have to be addressed in the formal specification, at least as an assert. One practical issue others (e.g. Dapeng) have run into is that it's difficult to prove equivalence if an implementation differs from the spec on this. Since the implementation does have to do something, we may as well standardise what that something is. We could go less strict though.

tariqkurd-repo commented 3 months ago

Why are we putting so much effort into specifying what happens if the impossible happens? If you observe memory corruption, all bets are off anyway, your system is no longer working. Specify that it should be impossible to create a malformed cap; if you then observe one due to an implementation or vendor extension bug, it’s in violation of the spec, and if you observe one due to a random uncorrected bitflip then your system has failed. I am concerned that by putting all kinds of specifications on what happens when the impossible happens we’re complicating implementations by adding logic to deal with cases that are, in the formal specification, unreachable. As an example, RISC-V has all the WARL fields in CSRs, and doesn’t go on to say what happens if a bit flip turns the field into an illegal value. It just assumes that won’t happen, and leaves it completely unspecified if it does.

there's a big difference between the RISC-V spec which has a lot of unspecified behaviour, and this spec. What we could do is say that malformed bounds etc. cause implementation defined results in the open source spec, and then all these extra details go into a company internal spec doc, but we've chosen to specify the cases in the open source.

it's also very useful - because a faulty SoC could return a tag with incorrect data in a weird error case - and we're tieing down what would actually happen in that case.

it's also true that alpha particle strikes causing bit flips, or voltage dropout when the vector unit gets going and draws a lot of current are more common than you might think, which is why all the server chips have RAS, ECC etc. features. When these errors do occur, it means we've defined what he behaviour should be.

of course if an error causes a tagged cap to have different bounds then we still wouldn't know - at that point we'd need to start adding ECC to the buses to detect it, which is definitely out of scope, but yuou might that that for RV64 you could use some of the reserved fields for ECC checking the bounds and address. That could be a future extension.

tariqkurd-repo commented 3 months ago

@jrtc27 (See discussion here). I agree we could probably have left it unspecified, though a bitflip or reading uninitialised memory seem significantly more likely than a CSR bit flip. I don't think anywhere else is there an invariant the spec tries to maintain over all of memory.

cases that are, in the formal specification, unreachable.

This does have to be addressed in the formal specification, at least as an assert. One practical issue others (e.g. Dapeng) have run into is that it's difficult to prove equivalence if an implementation differs from the spec on this. Since the implementation does have to do something, we may as well standardise what that something is. We could go less strict though.

this is the other useful aspect - for formal environments it's very hard to constrain the input to be valid tagged caps only. With this spec we can simply have any input and a deterministic output.

PRugg-Cap commented 3 months ago

@jrtc27 The other sneaky advantage of this approach is that we maximise the chance of being forwards compatible (security-wise) with another extended CHERI implementation on the same SoC.