parallaxsecond / rust-tss-esapi

TSS 2.0 Enhanced System API (ESAPI) Rust wrapper
https://docs.rs/tss-esapi/
Apache License 2.0
85 stars 51 forks source link

Add certify example #521

Closed Firstyear closed 3 weeks ago

Firstyear commented 4 months ago

This shows how to make an AIK that has been endorsed and is used to certify other objects.

Firstyear commented 4 months ago

Okay, I've updated this with many thanks to @ionut-arm for teaching me about how this works. It seems to be working BUT I think this is highlighting a bug in this library.

I think we aren't cleaning up sessions properly.

This example fails with:

WARNING:esys:src/tss2-esys/api/Esys_StartAuthSession.c:391:Esys_StartAuthSession_Finish() Received TPM Error
ERROR:esys:src/tss2-esys/api/Esys_StartAuthSession.c:136:Esys_StartAuthSession() Esys Finish ErrorCode (0x00000903)
[2024-04-12T05:58:48Z ERROR tss_esapi::context::tpm_commands::session_commands] Error when creating a session: 0x00000903
thread 'main' panicked at tss-esapi/examples/certify.rs:348:10:
Invalid session attributes.: TssError(Tpm(FormatZero(Warning(TpmFormatZeroWarningResponseCode { error_number: SessionMemory }))))

Thing is, if we look at those lines:

https://github.com/parallaxsecond/rust-tss-esapi/pull/521/files#diff-d5e5d81654fd33f9a61e28fad7ac81325147e0e488198cc2b7ae84babc000e21R308

We clear sessions just before, then we run out of session memory.

I have recently been noticing a lot of issues with x0901 errors in the kernel due to session memory, and so I am led to think we have a bug somewhere in how we flush and handle sessions.

ionut-arm commented 4 months ago

Indeed, you seem to be calling clear_sessions and expecting them to be wiped from the TPM, but what actually happens is that the Context simply forgets about those sessions. At the end they're still in the TPM, and we've lost the handles for them.

Given that AuthSession is Copy and Clone, I don't think it's wise for us to implement Drop on them (and tbh that would require the sessions to carry an extra reference to Context). My suggestion would be: improve the docs around clear_session to make it clear that the sessions remain in the TPM; add a new method (maybe purge_sessions? or flush_sessions?) that flushes them from the TPM and then clears them from the context memory. Does that sound sensible?

Firstyear commented 4 months ago

You could give the AuthSession an inner Arc that has Drop instead?

But yes alternately we need flush_sessions or similar. :)

ionut-arm commented 4 months ago

You could give the AuthSession an inner Arc that has Drop instead?

The problem with that is the cascade effects to the interface: since you're flushing the session whenever the object gets destoyed, you can't have it implement Copy or Clone anymore. That means that every function which expects an AuthSession must be switched to &AuthSession. You also need to update all conversions, for example from AuthSession to PolicySession, because destructuring a Drop-able object is not allowed. It turns into a pretty big effort.

If you'd like to do this please go ahead! My suggestion was primarily based on expected change size. :)

Firstyear commented 4 months ago

You could give the AuthSession an inner Arc that has Drop instead?

The problem with that is the cascade effects to the interface: since you're flushing the session whenever the object gets destoyed, you can't have it implement Copy or Clone anymore.

Well you can have Clone because you can use Arc/Rc. Probably Rc in this case, because I think these are not really thread safe objects at all.

That means that every function which expects an AuthSession must be switched to &AuthSession. You also need to update all conversions, for example from AuthSession to PolicySession, because destructuring a Drop-able object is not allowed. It turns into a pretty big effort.

If you'd like to do this please go ahead! My suggestion was primarily based on expected change size. :)

Well, changing to have an inner that handles things with an Rc is fine, but the question here is do we actually have the apis exposed to flush a session so we can write the drop handler?

ionut-arm commented 4 months ago

Sorry, I should've been clearer! What I meant about Clone and Copy is that if you have multiple clones of the same session and you have "flush on Drop" for them, then the moment your first clone goes out of scope it will make the others unusable. So you might as well not allow cloning, as it'll lead to weird errors that seem to come out of nowhere.

do we actually have the apis exposed to flush a session so we can write the drop handler?

Of course, here's an example.

Firstyear commented 4 months ago

To help explain I'm suggesting we have:

AuthSession {
    inner: Rc<AuthSessionInner>
}

AuthSessionInner {
    handle: ....
}

impl Drop for AuthSessionInner {
    fn drop (&mut self) {
        // whatever we need to do to flush the session.
    }
}

That way the outher AuthSession is still copy/clone, but then when it's dropped we can still auto-flush.

ionut-arm commented 4 months ago

Ah, I see! Yes, that'll work, though Arc is probably preferable since multithreading shouldn't be discounted. I think sessions can be seen as immutable, so you don't have to worry about that either.

There might be some other issues that I'm not foreseeing, but hopefully implementable if you want to give it a go.

Firstyear commented 4 months ago

Ah, I see! Yes, that'll work, though Arc is probably preferable since multithreading shouldn't be discounted. I think sessions can be seen as immutable, so you don't have to worry about that either.

There might be some other issues that I'm not foreseeing, but hopefully implementable if you want to give it a go.

I think the main issue I'd forsee is the whole API currently still relies on a fair bit of manual memory management anyway. So having some types that auto-free and some that don't could be confusing.

Anyway, for now I updated the example with a manual drop of the session and it passes!

But eventually if you run these in a loop, duplication, duplication secret and certify all end up leaking memory and causing the TPM RM to lockup. I'm not 100% sure why, I can't find what I'm "not freeing".

These are the first examples that use Policy Sessions, so could it by that I need to free those also?

Firstyear commented 3 months ago

@ionut-arm Anyway, separate to me fixing session handling, I think a pre-lim review of this would be great then I'll get it sorted for merge by squashing and fixing the commit messages. Is that okay/

ionut-arm commented 3 months ago

@ionut-arm Anyway, separate to me fixing session handling, I think a pre-lim review of this would be great then I'll get it sorted for merge by squashing and fixing the commit messages. Is that okay/

Yes, apologies for the delay. Staying on top of things has been challenging as of late, will add this to my TODO stack, I'll hopefully get to it today or tomorrow!

Firstyear commented 3 months ago

No problem, we all get busy mate. Thank you!

Firstyear commented 3 months ago

Updated based on your comments :)

ionut-arm commented 2 months ago

Updated based on your comments :)

Will have another look when I get a bit of time 👀

Firstyear commented 2 months ago

Thank you :)

Firstyear commented 2 months ago

Spellung mistakes fuxed!

Firstyear commented 1 month ago

@ionut-arm Need your glowing stamp of approval :)

ionut-arm commented 4 weeks ago

I'm good to go, but your commit set seems a bit weird, I wonder if some rebasing happened. Looks like your PR is re-adding two commits from #530

Firstyear commented 3 weeks ago

I'll rebase and fix it :)

Firstyear commented 3 weeks ago

@ionut-arm @Superhepper rebased :)