immobiliare / sfs

Asynchronous Filesystem Replication
274 stars 26 forks source link

An improper locking bug(e.g., deadlock) on the lock state->access_mutex #16

Closed ryancaicse closed 2 years ago

ryancaicse commented 3 years ago

Hi, developers, thank you for your checking. It seems the lock state->access_mutex is not released correctly when return at Line 135 in the function sfs_begin_access? https://github.com/immobiliare/sfs/blob/bb0943a10893a61b608b85314d31120cec3861c6/fuse/util.c#L106-L139

Best,

streambinder commented 2 years ago

Yeah, an additional unlock is necessary right before return 1;. Would you open a PR?

ryancaicse commented 2 years ago

@streambinder Thank you! Please check the PR.

mokraemer commented 2 years ago

Looks right to me, but wondering, why this never got a problem. I assume the lock was released automatically. Unfortunately I don't have write permission to this repo. Some of my changes never got merged too.

streambinder commented 2 years ago

Actually PR closed as the current behavior is correct, as explained in https://github.com/immobiliare/sfs/pull/17. If agreed, please, @ryancaicse, close this issue.

ryancaicse commented 2 years ago

@streambinder Hi, it seems there could be a new problem. When the error is executed in the sfs_begin_access, the lock state->access_mutex is released. In such a case, the lock could be released again in the sfs_end_access(double unlocking), which is an undefined behavior.

int sfs_begin_access (void) {
    ...;
    if (!state->perm_checks) {
        ...;
        return 1;
    }

    pthread_mutex_lock (&(state->access_mutex));
    ...;
    if (!pwd && errno != 0) {
        ...;
        goto error;
    }
    if (pwd) {
        if (initgroups (pwd->pw_name, ctx->gid) < 0) {
            ...;
            goto error;
        }
    }

    if (setfsgid (ctx->gid) < 0) {
        ...;
        goto error;
    }

    if (setfsuid (ctx->uid) < 0) {
        ...;
        goto error;
    }

    ...;
    return 1;

error:
    pthread_mutex_unlock (&(state->access_mutex));
    return 0;
}

void sfs_end_access (void) {
    if (!state->perm_checks) {
        return;
    ...;

    pthread_mutex_unlock (&(state->access_mutex));
}
streambinder commented 2 years ago

@streambinder Hi, it seems there could be a new problem. When the error is executed in the sfs_begin_access, the lock state->access_mutex is released. In such a case, the lock could be released again in the sfs_end_access(double unlocking), which is an undefined behavior.

Functions sfs_begin_access and sfs_end_access are always queried right before operating on the partition. More specifically, look at https://github.com/immobiliare/sfs/blob/master/fuse/sfs.c#L52. BEGIN_PERM is a macro which returns (exits, then) if return value of sfs_begin_access is an error: that macro is used everywhere in the codebase before operating, ensuring that only if successful the operation is performed and, therefore, END_PERM (hence, sfs_end_access) will be called.

ryancaicse commented 2 years ago

@streambinder OK, thank you very much for your explanation. My problem is completely solved. Feel free to close it.