cortoproject / corto

A hierarchical object store for connecting realtime machine data with web applications, historians & more
https://www.corto.io
MIT License

Corto crashes when aligning mount that specifies sampleRate #634

Closed SanderMertens closed 6 years ago

SanderMertens commented 6 years ago

When a mount is created whose objects need to be aligned and whose policy specifies a sampleRate, the mount attempts to lock itself as the aligned objects are delivered to it, so that they can be added to the mount queue in a thread-safe manner.

However, since the mount object is already locked by the same thread when this happens, the operation deadlocks (or the mutex operation fails).
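The failure mode can be illustrated without Corto. The following is a minimal sketch using plain POSIX threads, not Corto's actual locking code: it assumes an error-checking mutex (`PTHREAD_MUTEX_ERRORCHECK`), and the function name `relock_same_thread` is hypothetical. With this mutex type, a second lock attempt by the owning thread fails with `EDEADLK`, whereas with a default mutex the behavior may be a deadlock instead.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not part of Corto): locks a mutex twice from the
 * same thread and returns the result of the second attempt. */
int relock_same_thread(void) {
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    /* An error-checking mutex reports a relock instead of hanging. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&mutex);          /* first lock succeeds */
    assert(rc == 0);
    (void)rc;

    int second = pthread_mutex_lock(&mutex);  /* same thread, already owner */

    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return second;
}
```

Calling `relock_same_thread()` should return `EDEADLK`, which corresponds to the "mutex operation fails" case described above.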

The following code reproduces the issue:

    corto_object obj = corto_createChild(root_o, "data/obj", corto_void_o);
    corto_mount mnt = corto_declareChild(root_o, "config/influx", corto_mount_o);
    mnt->super.query = (corto_query){
        .select = "/",
        .from = "data"
    };
    mnt->policy = (corto_mountPolicy){
        .mask = CORTO_MOUNT_HISTORY_BATCH_NOTIFY,
        .sampleRate = 20.0,
    };
    corto_define(mnt);

SanderMertens commented 6 years ago

The fix ended up being more generic than a mount-specific one. So that users do not have to add checks to their code that verify whether an object is being defined, the behavior of corto_lock has changed. It now behaves similarly to corto_declareChild, in that it

When the object is already defined, the behavior is unchanged.

This fix ensures that an application can safely call corto_lock before and in a constructor, without worrying about potential deadlocks.

An important limitation is that a user should never call corto_lock before defining an object, and the accompanying corto_unlock after defining an object, like this:

    corto_object o = corto_declareChild(root_o, "foo", corto_int32_o);
    // some init code
    corto_lock(o);
    // more init code
    corto_define(o);
    // ILLEGAL:
    corto_unlock(o);

Here the corto_lock call doesn't lock the object (it is already locked), so when corto_unlock is called after corto_define, it attempts to unlock an object that isn't locked.
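The limitation can be modeled with a toy object built on POSIX threads. This is only a sketch under stated assumptions, not Corto's implementation: `toy_object` and the `toy_*` functions are hypothetical, the object is assumed to hold its own lock from declaration until definition, and the unlock operation is assumed to be skipped before definition in the same way as the lock operation. An error-checking mutex is used so that the unbalanced unlock is reported as `EPERM` rather than being undefined behavior.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object; not a Corto type. */
typedef struct {
    pthread_mutex_t mutex;
    bool defined;
} toy_object;

/* Declare: initialize the object and hold its lock until it is defined
 * (an assumption made for this model). */
static void toy_declare(toy_object *o) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&o->mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
    o->defined = false;
    rc = pthread_mutex_lock(&o->mutex);
    assert(rc == 0);
    (void)rc;
}

/* Define: mark the object as defined and release the declaration lock. */
static int toy_define(toy_object *o) {
    o->defined = true;
    return pthread_mutex_unlock(&o->mutex);
}

/* Lock and unlock are skipped while the object is not yet defined. */
static int toy_lock(toy_object *o) {
    return o->defined ? pthread_mutex_lock(&o->mutex) : 0;
}

static int toy_unlock(toy_object *o) {
    return o->defined ? pthread_mutex_unlock(&o->mutex) : 0;
}

/* Lock before define, unlock after define: the unlock has no matching lock. */
int illegal_sequence(void) {
    toy_object o;
    toy_declare(&o);
    toy_lock(&o);             /* skipped: object not yet defined */
    toy_define(&o);
    int rc = toy_unlock(&o);  /* real unlock of an unlocked mutex */
    pthread_mutex_destroy(&o.mutex);
    return rc;
}

/* Keep each lock/unlock pair on the same side of define. */
int legal_sequence(void) {
    toy_object o;
    toy_declare(&o);
    int rc = toy_lock(&o);    /* both skipped before define */
    rc |= toy_unlock(&o);
    rc |= toy_define(&o);
    rc |= toy_lock(&o);       /* both real after define */
    rc |= toy_unlock(&o);
    pthread_mutex_destroy(&o.mutex);
    return rc;
}
```

In this model `illegal_sequence()` returns `EPERM` and `legal_sequence()` returns 0, which suggests a simple rule of thumb: keep each lock/unlock pair entirely before or entirely after the define call.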