CHERIoT-Platform / cheriot-sail

Sail code model of the CHERIoT ISA
Other
34 stars 9 forks source link

Create backward sentries for function returns and add more checks in cjalr #54

Closed vmurali closed 6 months ago

vmurali commented 6 months ago

The following is based on discussions with @davidchisnall (portions of our email exchange have been included as is with perhaps slight modifications).

Background: A typical (non-malicious) function call is as follows:

Even if the callee code is bug-free, the caller can violate callee’s CFI in two ways:

How CHERI prevents CFI:

CHERIoT inherits all the above benefits. However, because CHERIoT allows manipulating the status of the interrupt through a function call (and function return), by encoding the interrupt type of the callee in the otype of the calling sentry and the interrupt type of the caller in the otype of the return-to-caller sentry, the following attack can occur: A caller calling an interrupt-disabling callee X can set the return address of the callee to be the same interrupt-disabling sentry to X. This means, the callee, on return will call itself all the while operating with interrupts disabled. This will lead to infinite repeated calls to X with interrupts disabled, violating availability.

This attack can be prevented in CHERIoT by adding two new "backwards-edge" sentries and adding more checks on cjalr:

otype 4: Backwards-edge, interrupts-disabled sentry otype 5: Backwards-edge, interrupts-enabled sentry.

cjalr with a non-cnull link register always creates otype 4 or 5, i.e. backward sentries. (Currently it always creates otype 2 or 3 (forward explicit interrupt control sentries), so this is just a slightly different constant, but in the same place).

cs1 cd Used for Valid otypes
cra cnull Function return Return sentries only (4, 5)
not cra cnull Tail call to sentry Unsealed or interrupt inheriting forward sentry (0, 1)
any not cnull Function call Unsealed or interrupt inheriting forward sentry (0, 1)
any cra Function call Unsealed or forward sentries (0,1,2,3)

(Table derived from @rmn30 's)

Note that this disallows tail calls optimization of functions that change interrupt state. (They’re currently disabled and there’s a FIXME in the compiler to enable them. The compiler change is now to remove the FIXME, and then document this for programmers.)

vmurali commented 6 months ago

@davidchisnall

rmn30 commented 6 months ago

It would be useful to add the following table to document valid arguments to cjalr for sentry calls:

cs1 cd Used for Valid sentry types
cra cnull Function return Return sentries
not cra cnull Tail call to sentry Interrupt inheriting forward sentry only
any cra Function call Forward sentries
any not cnull, not cra Invalid None

Unsealed destinations are not subject to these restrictions.

Writing this out has made me realise that at the moment your code permits use of backwards interrupt inheriting sentries in the case where cd is not null and not ra. @davidchisnall is that required or should we disallow it?

vmurali commented 6 months ago

It would be useful to add the following table to document valid arguments to cjalr for sentry calls:

cs1 cd Used for Valid sentry types cra cnull Function return Return sentries not cra cnull Tail call to sentry Interrupt inheriting forward sentry only any cra Function call Forward sentries any not cnull, not cra Invalid None Unsealed destinations are not subject to these restrictions.

Very nice!

Writing this out has made me realise that at the moment your code permits use of backwards sentries in the case where cd is not null and not ra. @davidchisnall is that required or should we disallow it?

Hmm, if cs1 is a backward sentry, then cd must be $cnull (the fourth check). Maybe your table doesn't capture it correctly?

davidchisnall commented 6 months ago

Writing this out has made me realise that at the moment your code permits use of backwards sentries in the case where cd is not null and not ra

I don't have very strong opinions here, but I'd lean towards disallowing it. The only use case for cjal[r] with a non-zero and non-link-register cd is in the code-compression things (e.g. spill lots of registers with a function call that uses ct0 as the link register). We don't currently use them, but we potentially could. If we did, we'd want to restrict everything that uses any link register to forward sentries or not sentries.

At some point, it would be good to have variants that must use sentries. For CHERIoT, we could use the unused register bit for this, but official RVI extensions cannot.

Given that sentries cannot use the offset field, it would be nice to use a 16-bit encoding for this.

rmn30 commented 6 months ago

Hmm, if cs1 is a backward sentry, then cd must be $cnull (the fourth check). Maybe your table doesn't capture it correctly?

Yes, I just realised that and updated the comment but there is still one case that slipped through: interrupt inheriting sentry with not null, not ra cd. Not sure whether this should be allowed or not.

rmn30 commented 6 months ago

At some point, it would be good to have variants that must use sentries. For CHERIoT, we could use the unused register bit for this, but official RVI extensions cannot.

Could require that cret (cs1=ra, cd = null) must use a return sentry and is not valid for unsealed cs1?

davidchisnall commented 6 months ago

Could require that cret (cs1=ra, cd = null) must use a return sentry and is not valid for unsealed cs1?

I don't believe there are any use cases that require cret to work with an unsealed thing, so I would be happy with that requirement. It prevents substituting an on-stack return address with an unsealed thing.

vmurali commented 6 months ago

Yes, I just realised that and updated the comment but there is still one case that slipped through: interrupt inheriting sentry with not null, not ra cd. Not sure whether this should be allowed or not.

Hmm, interrupt inheriting sentry should be like any forward sentry (i.e. has to have ra) or it can be null. We can force that.

vmurali commented 6 months ago

Could require that cret (cs1=ra, cd = null) must use a return sentry and is not valid for unsealed cs1?

I don't believe there are any use cases that require cret to work with an unsealed thing, so I would be happy with that requirement. It prevents substituting an on-stack return address with an unsealed thing.

👍

vmurali commented 6 months ago

Writing this out has made me realise that at the moment your code permits use of backwards sentries in the case where cd is not null and not ra

I don't have very strong opinions here, but I'd lean towards disallowing it. The only use case for cjal[r] with a non-zero and non-link-register cd is in the code-compression things (e.g. spill lots of registers with a function call that uses ct0 as the link register). We don't currently use them, but we potentially could. If we did, we'd want to restrict everything that uses any link register to forward sentries or not sentries.

At some point, it would be good to have variants that must use sentries. For CHERIoT, we could use the unused register bit for this, but official RVI extensions cannot.

Given that sentries cannot use the offset field, it would be nice to use a 16-bit encoding for this.

I want to say this discussion is moot because of an error in Robert's table. It's been fixed to say backward sentry requires $cnull link and $cra target.

vmurali commented 6 months ago

I really like the table that @rmn30 made; I will change the checks to match the table exactly (with appropriate negations in the beginning) and fix the other typos/issues.

rmn30 commented 6 months ago

I really like the table that @rmn30 made; I will change the checks to match the table exactly (with appropriate negations in the beginning) and fix the other typos/issues.

I think it is easier (and safer) to think about the allowed cases. Summarising the above conversation and extending the table to include unsealed we have:

cs1 cd Used for Valid otypes
cra cnull Function return Return sentries only (4, 5)
not cra cnull Tail call to sentry Unsealed or interrupt inheriting forward sentry (0, 1)
any cra Function call Unsealed or forward sentries (0,1,2,3)
any not cnull, not cra Invalid None

Note that the first row does not allow unsealed. I am not sure whether we should allow unsealed in the last row? Probably not.

cjal should produce a return sentry in link register.

vmurali commented 6 months ago

I really like the table that @rmn30 made; I will change the checks to match the table exactly (with appropriate negations in the beginning) and fix the other typos/issues.

I think it is easier (and safer) to think about the allowed cases. Summarising the above conversation and extending the table to include unsealed we have:

cs1 cd Used for Valid otypes cra cnull Function return Return sentries only (4, 5) not cra cnull Tail call to sentry Unsealed or interrupt inheriting forward sentry (0, 1) any cra Function call Unsealed or forward sentries (0,1,2,3) any not cnull, not cra Invalid None Note that the first row does not allow unsealed. I am not sure whether we should allow unsealed in the last row? Probably not.

cjal should produce a return sentry in link register.

This is exactly how I implemented what I just pushed. I didn't allow unsealed in the last row. cjal produces are return sentry. I am pasting this table in the commit message.

vmurali commented 6 months ago

@rmn30 I added explanations in the spec PDF

davidchisnall commented 6 months ago

This prevents us from doing the spill / reload optimisations as library functions with a custom link register. We aren’t currently doing that, but:

It would be nice if we could have a way of requiring that the target is a valid forward sentry (no unsealed). Currently, you can do auipcc and then arbitrary arithmetic to generate a thing that can be used as a function pointer and points to any code (or read-only data that happens to be valid code) in your compartment.

I’m not sure how much this matters since you can get arbitrary code execution as long as you start with arbitrary code execution, but this may let you go from a constrained set of gadgets to a rich set. I suspect anyone who can do this can also build a Turing-complete weird machine out of real function pointers, but I’m not sure.

Having a way of encoding this would let us move to an ABI where all function pointers came from the import table, without changing the ISA.

rmn30 commented 6 months ago

@davidchisnall are you suggesting that rows 3 and 4 of the table should be the same, permitting cjal to any unsealed cap or forward sentry with any link register? In that case we might as well drop the restriction on row 2 as well, because using the null link register is not much different than trashing one. In this case the only restriction would be that cret (aka cjalr $cnull, $cra or cjr $cra) only works with a return sentry and return sentries only work with cret. I think that would still prevent the attack by turning it into a trap on cret. It's a bit unfortunate that it gives an easy way to cause a trap in a library function but it shouldn't make any difference as it was about to return anyway.

vmurali commented 6 months ago

David's suggestion will not merge rows 3 and 4. It will require that Any forward sentry should have some non-null link address, rather than just $cra.

Row 2 is necessary to allow null link addresses generated only through inherited sentries, otherwise the semantics of structured interrupts becomes messy (the caller and return-target could be in different interrupt states).

davidchisnall commented 6 months ago

Row 2 is necessary to allow null link addresses generated only through inherited sentries, otherwise the semantics of structured interrupts becomes messy (the caller and return-target could be in different interrupt states).

It doesn't fully guarantee structured programming but it does at least mean any interrupt-disabled function can trust its immediate return, which I agree is a very valuable property. Interrupt-disabled leaf functions are guarantee to return to their caller, which means you can then reason about call sites one layer up, inductively.

rmn30 commented 6 months ago

it does at least mean any interrupt-disabled function can trust its immediate return

That makes sense. In that case we can't merge rows 3 and 4. Maybe we should make row 4 the same as 2: unsealed or interrupt inheriting only?

vmurali commented 6 months ago

On Wed, May 22, 2024, 10:20 Robert Norton @.***> wrote:

it does at least mean any interrupt-disabled function can trust its immediate return

That makes sense. In that case we can't merge rows 3 and 4. Maybe we should make row 4 the same as 2: unsealed or interrupt inheriting only?

My suggestion is to change CD of row 3 to non cnull and remove row 4 (anything not in the first three rows are invalid).

Reply to this email directly, view it on GitHub https://github.com/microsoft/cheriot-sail/pull/54#issuecomment-2125369924, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABZURN6KUTTBP3E4UU3E73ZDTHV3AVCNFSM6AAAAABIAEOC7CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMRVGM3DSOJSGQ . You are receiving this because you were mentioned.Message ID: @.***>

vmurali commented 6 months ago

it does at least mean any interrupt-disabled function can trust its immediate return

That makes sense. In that case we can't merge rows 3 and 4. Maybe we should make row 4 the same as 2: unsealed or interrupt inheriting only?

Why allow interrupt inheriting (or unsealed) for Row 4? I would prefer to fail for any case other than when the first three rows are not met. Is there a use case for row 4 with unsealed or interrupt inheriting?

davidchisnall commented 6 months ago

The kind of code size reduction things I mentioned above.

vmurali commented 6 months ago

The kind of code size reduction things I mentioned above.

Sorry I don't get it. The new table I am proposing is as follows:

cs1 cd Used for Valid otypes
cra cnull Function return Return sentries only (4, 5)
not cra cnull Tail call to sentry Unsealed or interrupt inheriting forward sentry (0, 1)
any not cnull Function call Unsealed or forward sentries (0,1,2,3)

Any other case is invalid.

davidchisnall commented 6 months ago

Currently, RISC-V uses JALR with a non-ra link register to do things like outlined prolog sequences. We don’t do this currently, but it would be nice to not prevent it. For this to work, we’d need cjalr with a non-cra link register to work with forward sentries.

vmurali commented 6 months ago

My new table allows that @davidchisnall

davidchisnall commented 6 months ago

I think that change loses us the guarantee that interrupt-disabled functions have a valid car to return to. I think we want only interrupt-inheriting sentries for link to not-cra.

vmurali commented 6 months ago

It still requires some register with a backwards sentry to return from. If you think there's no use case for a non cra returning function to change interrupt status, we can add that constraint

vmurali commented 6 months ago

Done. Updated the tables appropriately @davidchisnall @rmn30

vmurali commented 6 months ago

@microsoft-github-policy-service agree company="Google"

vmurali commented 6 months ago

I see that you did a merge commit. Isn't it better to do a squash merge. This gives a not helpful trail of commit messages

davidchisnall commented 6 months ago

Sorry, I should have checked the commit history.