Try to fix attestation and measurements bugs

L0czek commented 2 months ago

This PR tries to resolve issues with tests regarding measurements and attestation. Currently, the results are as follows:

	ACS TEST SUIT	PASS/FAIL
1	measurement_immutable_rim	PASS
2	measurement_initial_rem_is_zero	PASS
3	measurement_rim_order	PASS
4	attestation_token_verify	PASS
5	attestation_rpv_value	PASS
6	attestation_challenge_data_verification	PASS
7	attestation_token_init	PASS
8	attestation_realm_measurement_type	PASS
9	attestation_platform_challenge_size	PASS
10	attestation_rem_extend_check	PASS
11	attestation_rem_extend_check_realm_token	PASS
12	attestation_rec_exit_irq	!@???##&?

Legend:

!@???##&? - This test passes only when I run it without others (using the --selected-tests option). As of now, I wasn't able to debug why the interrupt doesn't trigger when the entire test suite is running. Unfortunately, there are more problems with this test, which will be discussed later.

Encountered issues

Some simple mistakes (these were fixed and doesn't require further discussion)

In ATTEST_TOKEN_INIT the challenge was read from wrong registers.
During the attestation token generation, the Realm Personalization Value was stubbed.

Using improperly initialized memory

Upon Realm's REC creation in the rmi REC_CREATE handler Islet does the following:

        let mut rec_granule = get_granule_if!(rec, GranuleState::Delegated)?;
        #[cfg(not(kani))]
        // `page_table` is currently not reachable in model checking harnesses
        rmm.page_table.map(rec, true);
        let mut rec = rec_granule.content_mut::<Rec<'_>>()?;

It uses the SafetyAssured and SafetyAssumed traits to basically convert a raw pointer to a Rust reference. Later, the rec object is initialized by a safe function Rec::init:

        match prepare_args(&mut rd, params.mpidr) {
            Ok((vcpuid, vttbr, vmpidr)) => {
                ret[1] = vcpuid;
                rec.init(owner, vcpuid, params.flags, vttbr, vmpidr)?;
            }
            Err(_) => return Err(Error::RmiErrorInput),
        }

Inside this Rec::init function, someone forgot to initialize the attest_state field. This is of course a simple mistake, but I think that it was caused by a bigger issue. I understand that Rust doesn't yet have any syntax allowing to construct an object in-place by providing some pointer. The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor. I think we should discuss better ways of handling this issue, as either the Rec::init function should be made unsafe as this initialization is semantically unsafe, or we should design a better mechanism. Naturally, marking the init function as unsafe will not solve this problem, but it will be a warning that here Rust cannot ensure complete safety. The other solution might be to implement a mechanism similar to how the Box pointer can be created using a custom memory allocator. On the high level, this ensures that the created object will be placed in a memory region controlled by the provided allocator. In this scenario, the object is first created on the stack using the Rust typical struct creation syntax and is the moved to the allocated memory region. To minimize performance impact, Rust compiles relies on the LLVM framework to perform Return Value optimization. This should initialize the object directly in the destination, instead of copying it. In a nutshell, we could implement something like this:

    /// Allocates memory in the given allocator then places `x` into it.
    ///
    /// This doesn't actually allocate if `T` is zero-sized.
    ///
    /// # Examples
    ///
    /// ```
    /// #![feature(allocator_api)]
    ///
    /// use std::alloc::System;
    ///
    /// let five = Box::new_in(5, System);
    /// ```
    #[cfg(not(no_global_oom_handling))]
    #[unstable(feature = "allocator_api", issue = "32838")]
    #[must_use]
    #[inline]
    pub fn new_in(x: T, alloc: A) -> Self
    where
        A: Allocator,
    {
        let mut boxed = Self::new_uninit_in(alloc);
        unsafe {
            boxed.as_mut_ptr().write(x);
            boxed.assume_init()
        }
    }

I quickly patched this issue by initializing the attest_state field inside the Rec::init. Nevertheless, I still think that we should discuss how to rework the Recs, and Rds initialization as this defeats the point of using Rust and really starts looking like C or C++.

The `!@???##&?` test

The attestation_rec_exit_irq attempts to test whether reading the attestation token from rmm can be interrupted by an IRQ. Since, the IRQ is a lower level exception, it cannot interrupt the control flow of Islet. As a result, it needs to be actively checked during RSI handling. For example, the TF-rmm does this as follows:

while (attest_data->token_sign_ctx.state == ATTEST_SIGN_IN_PROGRESS) {
    attest_token_continue_sign_state(attest_data, res);
    if (check_pending_irq()) {
        res->action = UPDATE_REC_EXIT_TO_HOST;
        rec_exit->exit_reason = RMI_EXIT_IRQ;
        return;
    }
}

In Islet this is currently impossible to recreate.

Issue 1

pub fn get_token(challenge: &[u8], measurements: &[Measurement], hash_algo: u8) -> Vec<u8> {
    // TODO: consider storing attestation object somewhere,
    // as RAK and token do not change during rmm lifetime.
    Attestation::new(&plat_token(), &realm_attest_key()).create_attestation_token(
        challenge,
        measurements,
        hash_algo,
    )
}

... and ..


fn get_token_part(
    rd: &Rd,
    context: &mut Rec<'_>,
    size: usize,
) -> core::result::Result<(Vec<u8>, usize), Error> {
    let hash_algo = rd.hash_algo();
    let measurements = rd.measurements;

    // FIXME: This should be stored instead of generating it for every call.
    let token =
        crate::rsi::attestation::get_token(context.attest_challenge(), &measurements, hash_algo);

To make this test work as ARM has intended, we should fix these two TODOs, as these issues are a performance nightmare. When the creation of the realm token will be interruptible by IRQs we cannot regenerate the token every time the Realm asks for it.

Issue 2

    fn sign(secret_key: p384::SecretKey, data: &[u8]) -> Vec<u8> {
        let signing_key = p384::ecdsa::SigningKey::from_bytes(&secret_key.to_bytes())
            .expect("Failed to generate signing key");

        let signature: p384::ecdsa::Signature = signing_key
            .try_sign(data)
            .expect("Failed to create P384 signature");
        signature.to_vec()
    }

This code is responsible for signing the attestation token. We must rewrite it to a "init, update, finish" type API to implement an IRQ checking loop similar to how TF-rmm does it.

DUCT TAPE solution

To make this test working, I applied the following makeshift:

        {
            let (token_part, token_left) = get_token_part(&rd, rec, buffer_size)?;

            if is_irq_pending() {
                error!("IRQ is pending while fetching token");
                set_reg(rec, 0, INCOMPLETE)?;
                set_reg(rec, 1, 0)?;
                ret[0] = rmi::SUCCESS_REC_ENTER;
                return Ok(());
            }

            unsafe {
                let pa_ptr = attest_pa as *mut u8;
                core::ptr::copy(token_part.as_ptr(), pa_ptr.add(pa_offset), token_part.len());
            }
            if token_left == 0 {
                set_reg(rec, 0, SUCCESS)?;
                rec.set_attest_state(RmmRecAttestState::NoAttestInProgress);
            } else {
                set_reg(rec, 0, INCOMPLETE)?;
            }
            set_reg(rec, 1, token_part.len())?;
        }

The helper function:

pub fn is_irq_pending() -> bool {
    let val: u64;

    unsafe {
        core::arch::asm!(
            "mrs {}, ISR_EL1",
            out(reg) val
        )
    }

    return val != 0;
}

Thanks to this, the test passes, but for some reason when I run the entire test suite it fails. Besides that, this solution is a performance nightmare as it generated the token and then checks for IRQ and discards the token if an IRQ is pending.

Summary

This PR in its current state provides fixes and some makeshifts solutions that are necessary for the application provisioning to work. Naturally, the last test is not essential, as it only tests the performance and interrupt handling responsiveness. Nevertheless, we should fix it as will improve performance even on fvp emulator.

bitboom commented 2 months ago

Inside this Rec::init function, someone forgot to initialize the attest_state field. This is of course a simple mistake, but I think that it was caused by a bigger issue. I understand that Rust doesn't yet have any syntax allowing to construct an object in-place by providing some pointer. The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor. I think we should discuss better ways of handling this issue, as either the Rec::init function should be made unsafe as this initialization is semantically unsafe, or we should design a better mechanism. Naturally, marking the init function as unsafe will not solve this problem, but it will be a warning that here Rust cannot ensure complete safety.

First, thank you for raising this issue. 👍

Overall, I very much agree with your opinion. I also agree with making Islet a safer Rust project rather than using C or C++.

I agree with making the Rec::init function unsafe. However, is it enough to make only the init function unsafe? How can we prevent mistakenly calling rec without initializing it through init? Perhaps we should mark all cases where we receive a pointer from the host and cast it to an object as unsafe.

In Islet v1.0, Based on expressions, the ratio of unsafe code is 21%. (refer to here). While it is important to precisely mark unsafe keywords in the current code and decide whether to mark them based on the appropriate call paths, I also believe that unsafe should not lower development performance in a fast-paced development environment.

Therefore, I prefer the latter part of your suggestion, which involves design improvements.

Additionally, I believe that we should actively use unsafe analysis tools. Using unsafe can cause other side effects.

In the above example, the developer can set the state appropriately within the unsafe fn init function, but can also set an invalid value. Since it is unsafe, it allows that.

In this case, using MIRI, which I have been applying, can detect such issues.

Below is an example mocked and verified using MIRI.

#[repr(u8)]
#[derive(Debug)]
enum State {
    Run = 1,
    Wait = 2,
}

#[derive(Debug)]
struct Rec {
    state: State,
}

fn main() {
    let mut a = Rec { state: State::Run };
    println!("Valid? => {:?}", a);

    let raw = std::ptr::addr_of_mut!(a.state);
    unsafe {
        let raw = raw as *mut u8;
        *raw = 0xff;
    }

    println!("Really? => {:?}", a);
}

When we run the above, the State ends up as Wait, but this is not the result we intended.

$ cargo run
Valid? => Rec { state: Run }
Really? => Rec { state: Wait }

MIRI can detect this.

$ cargo miri run
Valid? => Rec { state: Run }
error: Undefined Behavior: enum value has invalid tag: 0xff
  --> src/main.rs:2:10

Currently, I am integrating MIRI into Islet https://github.com/islet-project/islet/pull/341. I plan to verify areas where unsafe is used. Also adding it to the CI so that other developers can easily verify it.

zpzigi754 commented 2 months ago

Really nice to bring up all these issues in detail! I think that it would be good to post these on an issue page as well in addition to the current PR description. I've also encountered that non-initialized fields in REC such as psci_pending and host_call_pending, which are missed in rec init, could cause a problem.

jinbpark commented 2 months ago

Great description-!

The other solution might be to implement a mechanism similar to how the Box pointer can be created using a custom memory allocator.

I like this idea. I'm not sure if there is another solution (I think there might be) that can achieve the same goal, making a safe init function for Rec, Rd, or whatever having similar strict memory requirements.

@bitboom , can we have "Safety Trait" handle this initialization stuff? (there might be some difficulty, e.g., deciding when to initialize and when to not) It sounds like "Safety Trait" might be a proper place to do that. But, we might need to spend some time checking if it technically makes sense-

bitboom commented 2 months ago

@L0czek , @jinbpark

I wrote a draft PR https://github.com/islet-project/islet/pull/344 to discuss this.

Previously, safe-abstraction crates provides to parse an instance from a given address assume_safe. This draft adds assume_safe_uninit_with to initialize an instance at a given address.

This approach has been applied to create instances from granules. Let me know if you have any feedback and suggestions.

nook1208 commented 2 months ago

Thank you for the good description !

The current solution implemented in Islet is to cast the raw pointer to Rust reference and then call a some init function which assumes that it is operating on a valid object. That's why it is possible to forget about some fields in a constructor.

I agree that it's easy to forget some fields when intializing it. Because most of memory like Rec are allocated from the host, it's hard to force that all of the fields should be non-null when they're created.

Another way to handle this is to have the init function return a structure with all fields set like @bitboom did in #344

    pub fn new() -> Self {
        Self {
            context: Context::new(),
            attest_state: RmmRecAttestState::NoAttestInProgress,
            attest_challenge: [0; MAX_CHALLENGE_SIZE],
            attest_token_offset: 0,
            owner: OnceCell::new(),
            vcpuid: 0,
            runnable: false,
            psci_pending: false,
            state: State::Ready,
            ripas: Ripas {
                start: 0,
                end: 0,
                addr: 0,
                state: 0,
                flags: 0,
            },
            vtcr: 0,
            host_call_pending: false,
        }
    }
...
let mut rec = rec_granule.new_uninit_with::<Rec<'_>>(Rec::new())?;

It would be nice if there is a way to always guarantee this rule, but unfortunately I can't think of any right now.. :(

L0czek commented 2 months ago

I repushed the commits after formatting with cargo in case you want to merge it. On our side, we can start implementing the application provisioning on host. Most likely, we will need the entire stack with RMM somewhere in August / September to start integrating the developed components. As a result, you can take your time fixing stuff the proper way. Nevertheless, if you want to merge this PR feel free to do so. I such case I will create an issue and copy the description there so we can continue the discussion.

@bitboom I saw your draft, it looks promising. I will review it when you're ready (ping me somewhere).

islet-project / islet