Leak in gssapi_client_mech_step

BlakeIsBlake commented 5 months ago

If HAVE_GSS_KRB5_CRED_NO_CI_FLAGS_X is true, we will acquire credentials from kerberos. There is no check if &text->client_creds already has a value, and so it's possible to overwrite the existing value, and leak the previous value of &text->client_creds.

On the server side, we check if server_creds already exists so that we release the memory before reassigning. We could do something similar on the client side to avoid a leak.

Note that those using this library are currently affected by this issue can use a local compile with the HAVE_GSS_KRB5_CRED_NO_CI_FLAGS_X flag disabled to avoid the leak until the underlying issue is resolved.

hyc commented 5 months ago

text->client_creds comes from params->gss_creds

https://github.com/cyrusimap/cyrus-sasl/blob/bd9146c25e3a6de45112b06b9f6ba8c81d0f9bd9/plugins/gssapi.c#L2041

The code you reference only executes if params->gss_creds == GSS_C_NO_CREDENTIAL

https://github.com/cyrusimap/cyrus-sasl/blob/bd9146c25e3a6de45112b06b9f6ba8c81d0f9bd9/plugins/gssapi.c#L2170

So how can there already be an existing value in text->client_creds?

BlakeIsBlake commented 5 months ago

I am not 100% sure, but it does happen -- I have an in-house ASAN repro that triggers a leak every time. Applying the fix I mentioned makes the leak disappear.

I'll see if I can't get you an answer to your question today, or genericize the repro to post here.

BlakeIsBlake commented 5 months ago

text_client_creds actually comes from gss_acquire_cred. We assign to a local variable client_cred but never back to params, so client_cred doesn't outlive gssapi_client_mech_step. The next time we call gssapi_client_mech_step, params will still not have client_cred, but text->client_creds will already exist. Am I missing something else here?

BlakeIsBlake commented 5 months ago

These last two assertions fail, but the first succeeds:

client_creds = text->client_creds;
assert(client_creds != GSS_C_NO_CREDENTIAL);
assert(client_creds == (gss_cred_id_t)params->gss_creds);
assert((gss_cred_id_t)params->gss_creds != GSS_C_NO_CREDENTIAL);

hyc commented 5 months ago

Since this is getting a Kerberos ticket within a single authentication attempt, it makes no sense for acquire_cred to be called more than once. I think the actual fix is to prevent that, not a simple leak fix.

hyc commented 5 months ago

Something like this:

diff --git a/plugins/gssapi.c b/plugins/gssapi.c
index c54a13c3..630ad691 100644
--- a/plugins/gssapi.c
+++ b/plugins/gssapi.c
@@ -2176,15 +2176,17 @@ static int gssapi_client_mech_step(void *conn_context,
                     desired_mechs = &mechs;
                 }

-                maj_stat = gss_acquire_cred(&min_stat, GSS_C_NO_NAME,
-                                            GSS_C_INDEFINITE, desired_mechs,
-                                            GSS_C_INITIATE,
-                                            &text->client_creds, NULL, NULL);
-                if (GSS_ERROR(maj_stat)) {
-                    sasl_gss_seterror(text->utils, maj_stat, min_stat);
-                    sasl_gss_free_context_contents(text);
-                    return SASL_FAIL;
-                }
+               if (text->client_creds == GSS_C_NO_CREDENTIAL) {
+                   maj_stat = gss_acquire_cred(&min_stat, GSS_C_NO_NAME,
+                                               GSS_C_INDEFINITE, desired_mechs,
+                                               GSS_C_INITIATE,
+                                               &text->client_creds, NULL, NULL);
+                   if (GSS_ERROR(maj_stat)) {
+                       sasl_gss_seterror(text->utils, maj_stat, min_stat);
+                       sasl_gss_free_context_contents(text);
+                       return SASL_FAIL;
+                   }
+               }
                 client_creds = text->client_creds;
             }

hyc commented 5 months ago

But again, I don't see how this would get called twice, since text->state should have progressed to the next state after the first time through.

BlakeIsBlake commented 5 months ago

I messed around with the python test file and created a case that has 0 maxssf but doesn't specify SPNEGO. We have this order of operations:

Client calls sasl_client_start.
Call into gssapi_client_mech_step.
text->state is SASL_GSSAPI_STATE_AUTHNEG.
Acquire client credentials using gss_acquire_cred.
Call into gss_init_sec_context. Set return to maj_stat.
maj_stat & GSS_CONTINUE_NEEDED == true
Return SASL_CONTINUE from gssapi_client_mech_step without changing text->state.
Client calls sasl_client_step.
Call into gssapi_client_mech_step.
text->state is SASL_GSSAPI_STATE_AUTHNEG.
Acquire client credentials using gss_acquire_cred.
Call into gss_init_sec_context. Set return to maj_stat.
maj_stat == GSS_S_COMPLETE == true
text->state = SASL_GSSAPI_STATE_SSFCAP
....

hyc commented 5 months ago

OK, so it's calling gss_acquire_cred redundantly, that's the actual bug. Though now I'm also wondering why gss_init_sec_context didn't complete the first time, can ignore that for now. The patch I linked above ought to fix this, can you try it? https://github.com/cyrusimap/cyrus-sasl/issues/843#issuecomment-2203630061

BlakeIsBlake commented 5 months ago

Confirmed your patch fixes the issue :)

cyrusimap / cyrus-sasl

Leak in gssapi_client_mech_step #843