tpm2-software / tpm2-pytss

Python bindings for TSS
https://tpm2-pytss.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License
60 stars 44 forks source link

Add destructor for ESYS_TR handles #372

Open whooo opened 1 year ago

whooo commented 1 year ago

It would be useful if returned handles would be flushed when there is no reference left for them

williamcroberts commented 1 year ago

We would have to bind an ESAPI context to each esys tr and:

  1. When the ESAPI context is deleted or closed flush all associated transient ESYS_TRs
  2. When the ESYS_TR del is called flush on the active bound ESAPI context

This could affect things like:

  1. audit sessions could get unexpected commands
  2. Scripts not using a resource manager may want the transient objects resident within the TPM for subsequent actions.
whooo commented 1 year ago

I did some light testing and flushing a context doesn't seem to affect an audit session, while I couldn't find anything while having a fast look at the specification I guess the context calls don't affect audit sessions as resource managers could easily break them otherwise.

An explicit command might be a solution, so something like:

ectx = ESAPI()
handle, _, _, _, _ = ectx.create_primary(None, "ecc")
ectx.flush_on_unref(handle)

And simply create weak references between the ESAPI context and the handle.

williamcroberts commented 1 year ago

I did some light testing and flushing a context doesn't seem to affect an audit session, while I couldn't find anything while having a fast look at the specification I guess the context calls don't affect audit sessions as resource managers could easily break them otherwise.

Good point, I didn't even consider that. Here is the spec reason why:

Unless a command description indicates that no sessions are allowed, an audit session may be used with
any command. A command may have only one audit session.

So TPM2_FlushContext is tagged no sessions: TPM_ST_NO_SESSIONS.

An explicit command might be a solution, so something like:

ectx = ESAPI()
handle, _, _, _, _ = ectx.create_primary(None, "ecc")
ectx.flush_on_unref(handle)

And simply create weak references between the ESAPI context and the handle.

I'd like to just add to ESYS_TR a handler for __del__ and set the weak reference up within the function that created to the associated ESAPI context. This means all functions that return ESYS_TR's would have to be updated to add this mapping. A bit more work, but t'd like to make the programmer not have to do this. We could add an option to the ESAPI() constructor to enable or disable this behavior, I would enable it by default but I'm open to ideas on why to make it opt in.

whooo commented 1 year ago

Good point, I didn't even consider that. Here is the spec reason why:

Unless a command description indicates that no sessions are allowed, an audit session may be used with
any command. A command may have only one audit session.

So TPM2_FlushContext is tagged no sessions: TPM_ST_NO_SESSIONS.

Right, I was thinking of the audit exclusive flag, skipping sessions for other commands does seem to affect that.

And simply create weak references between the ESAPI context and the handle.

I'd like to just add to ESYS_TR a handler for __del__ and set the weak reference up within the function that created to the associated ESAPI context. This means all functions that return ESYS_TR's would have to be updated to add this mapping. A bit more work, but t'd like to make the programmer not have to do this. We could add an option to the ESAPI() constructor to enable or disable this behavior, I would enable it by default but I'm open to ideas on why to make it opt in.

I prefer it by default as well, specially since I created a bug that was leaking object handles in code running before the resource manager was online...

williamcroberts commented 1 year ago

I prefer it by default as well, specially since I created a bug that was leaking object handles in code running before the resource manager was online...

Seems like we agree on the approach then?

whooo commented 1 year ago

I have implemented this more or less but I hit a snag, it doesn't seem easy to control when the garbage collector calls __del__ on the ESYS_TR instance, which makes the behavior a bit inconsistent.

So I'm unable to write any useful tests, do you have any idea on how this could be solved?

williamcroberts commented 1 year ago

@whooo I don't think I full grok what the issue is, but could you disable the garbage collector for a certain portion of the test using gc.disable() and then call gc.collect() to force a collection when you need it to occur and then re-enable gc via gc.enable(). See:

whooo commented 1 year ago

The issue is the reverse, I can't get the garbage collector to finalize the instance during a test, I have tried some combinations of del instance, gc.collect() and time.sleep(), but it is always called after the ESAPI instance is finalized.

williamcroberts commented 1 year ago

The issue is the reverse, I can't get the garbage collector to finalize the instance during a test, I have tried some combinations of del instance, gc.collect() and time.sleep(), but it is always called after the ESAPI instance is finalized.

I thought we had a weak ref to ESAPI context in the ESYS_TR so that makes sense right?

whooo commented 1 year ago

Right, that works as expected, but I'm unable to test the __del__ flow, might be survivable tho. I'll upload a draft PR

williamcroberts commented 1 year ago

Right, that works as expected, but I'm unable to test the __del__ flow, might be survivable tho. I'll upload a draft PR

Could you just explicitly call delete on it, will that do it?

del(object)
>>> class FOO(object):
...     pass
... 
>>> f = FOO()
>>> del(f)
>>> class FOO(object):
...     def __del__(self):
...         print("BILL")
... 
>>> f = FOO()
>>> del(f)
BILL
williamcroberts commented 1 year ago

Maybe del the esapi context to get it collected and then del the handle or vice versa... Or perhaps make the week reference on an object that gets del'd in ESAPI.close() rather than the class object instance itself? Just spit balling, you probably tried stuff like this.

whooo commented 1 year ago

So in one the tests I have done something like:

        sym = TPMT_SYM_DEF(algorithm=TPM2_ALG.NULL,)                                                                                                                                                                                   

        session = self.ectx.start_auth_session(                                                                                                                                                                                        
            tpm_key=ESYS_TR.NONE,                                                                                                                                                                                                      
            bind=ESYS_TR.NONE,                                                                                                                                                                                                         
            session_type=TPM2_SE.HMAC,                                                                                                                                                                                                 
            symmetric=sym,                                                                                                                                                                                                             
            auth_hash=TPM2_ALG.SHA256,                                                                                                                                                                                                 
        )                                                                                                                                                                                                                              

        handles = list()                                                                                                                                                                                                               
        more = True                                                                                                                                                                                                                    
        while more:                                                                                                                                                                                                                    
            more, data = self.ectx.get_capability(                                                                                                                                                                                     
                TPM2_CAP.HANDLES, TPM2_HC.HMAC_SESSION_FIRST                                                                                                                                                                           
            )                                                                                                                                                                                                                          
            handles += list(data.data.handles)                                                                                                                                                                                         
            self.assertEqual(len(handles), 1)                                                                                                                                                                                          

        del session
        gc.collect()
        time.sleep(60)

        handles = list()                                                                                                                                                                                                               
        more = True                                                                                                                                                                                                                    
        while more:                                                                                                                                                                                                                    
            more, data = self.ectx.get_capability(                                                                                                                                                                                     
                TPM2_CAP.HANDLES, TPM2_HC.HMAC_SESSION_FIRST                                                                                                                                                                           
            )                                                                                                                                                                                                                          
            handles += list(data.data.handles)                                                                                                                                                                                         
        self.assertEqual(len(handles), 0)  

And fails on the last assertEqual. Pushed the draft to https://github.com/tpm2-software/tpm2-pytss/pull/398/ so you can see the underlying design, still have some tests and comments to add.