Implement async mutex datastructure variants

theli-ua commented 4 years ago

Currently guard will use parking_lot's GuardNoSend

Firstyear commented 4 years ago

I'll have to think about this. I think there is no reason it can't be send today, but I'm curious what you are trying to achieve. Why not send/clone the BptreeMap itself and start the write txn at the destination? I'd also want to be sure that the ref to caller can be guaranteed to live long enough over a send too.

I'm just starting a rewrite of the tree internals right now to improve garbage collection, so that could impact this too.

theli-ua commented 4 years ago

Just to be able to use across async points.

Firstyear commented 4 years ago

Ahhhhhhh that makes much more sense now. Okay, what would be the needed changes do you think to make this capable of being Send?

Firstyear commented 4 years ago

Okay, it looks like both https://doc.rust-lang.org/std/sync/struct.MutexGuard.html and https://amanieu.github.io/parking_lot/parking_lot/struct.MutexGuard.html are not Send, so there must be a reason for this. I would say understanding that is probably key to being able to make the writetxn send.

Firstyear commented 4 years ago

Okay, I was reflecting on this overnight, and I think beyond the !Send issue, you'd have other issues with async anyway. Because of the nature of async, you have many tasks interleaving, and many of those could be write tasks. That's why tokio has special mutex types to handle this situation to prevent deadlocking between resources:

https://docs.rs/tokio/0.2.21/tokio/sync/struct.Mutex.html

The mutex guard here is Send and is able to be used with async, so I think that if you wanted to be able to use the BptreeMap, there would need to be a parallel version which uses this lock instead. Async is a bit like "taint", where it go through everything it touches. You can't use the async mutex in a non-async stack, and vice versa you can't use the non-async lock in an async stack.

So to resolve this, I think the path would be:

Create a parallel top-level bptree map.rs that changes parking lot to tokio mutex, and change any other required top level functions to be async
Write an async test suite to check it functions as expected, especially with multiple writers staying consistent.

The tree internals could remain the same, because they are all generally implemented with unsafe and assume you are behind the lock, all of the thread safety and memory ordering comes from the top level locking/map operations. So the scope to change to make this async capable would be low.

If you are interested, I'd love to review any commits on contributions, and to help answer any questions you have, or I'll probably do this later once my current tree internals rewrite is complete (I don't really have a need for async at this low level so it's not a high priority for me, but I will add it eventually because it's useful to others).

Hope that helps,

theli-ua commented 4 years ago

Right, though instead of using tokio's mutex you could still use parking lot but use reentrant mutex. usual mutex is not send because if you try to lock it from the same thread again you'll deadlock. Reentrant mutex allows that and thus is Send.

You could have that under a feature and it would be easier than using tokio/etc mutex since you wouldn't have to make anything async

Firstyear commented 4 years ago

I'd be pretty nervous about just assuming that reentrant mutex would "just work" in async, which is why I think tokio's is better to reach for here.

theli-ua commented 4 years ago

I mean, basically what I was asking for is just a Send variant of Txn handles :)

Firstyear commented 4 years ago

For sure! I just want to be sure that whatever is done is correct :) Having support to use txn handles in async would be great, but I think most of the complexity is with write txns of course. So I think for now, my plan is still tokio mutex, but I'll read up on the parking lot reentrant one to be sure of what I think in my head is the case. :)

Firstyear commented 4 years ago

Right, so I did some reading and there is a key difference:

/// This type acts similarly to an asynchronous [std::sync::Mutex], with one /// major difference: [lock] does not block. Another difference is that the /// lock guard can be held across await points.

Reentrant doesn't provide these guarantees, which means that while it may be Send, it could cause a deadlocking scenario in async, because you could have:

async task 1      async task 2

lock wr txn
start IO
yield
                  attempt to lock wr txn
                  *DEADLOCK*

Tokio's mutex in async task 2 on the attempt to lock would immediately fail and yield, allowing task 1 to continue IO to completion, which would then allow the wrtxn to be dropped, and then async task 2 could continue. The re-entrant mutex would not prevent this case, which could cause async task deadlocking.

As a result of this, I think my summary is:

Yes, async support in concread would be awesome
We need to use the tokio mutex type in place of parking_lot mutex for it
We need a way to genericise the tree based on the type of mutex guard in use

Hope that helps,

Firstyear commented 4 years ago

Actually, worse, if you used re-entrant in this case, the async task 2 would see an incomplete/partial TXN from async task 1 in progress, which could corrupt the tree/garbage states. So we really can't use reentrant here.

theli-ua commented 4 years ago

You could just just use send_guard parking_lot's feature for now. While that could technically deadlock (and it probably will in single threaded executor) it gives you Send, you could expose that as send_guard feature

Firstyear commented 4 years ago

I'd rather not because it can deadlock. I'd rather do it right for async/await, than do something that's not going to work.

Recently I've been thinking I'll need this functionality anyway, so it will happen :)

Firstyear commented 4 years ago

So I've been thinking about this a lot more today, and I think it may not be necessary to implement this in concread at all.

As all of these structures are concurrently readable, there are two situations we must consider. Read transactions and Write transactions. We can discuss this in terms of cowcell, as all the other structures use the same locking and update strategy.

Under a read situation, the lock to the active shared pointer is only required during the initialisation of the transaction. Here is the code:

    pub fn read(&self) -> CowCellReadTxn<T> {
        let rwguard = self.active.lock();
        CowCellReadTxn(rwguard.clone())
        // rwguard ends here
    }

As this is not an await point, and no guard is held outside of this function, then read transactions can be used already today with async/await. Consider a single thread executor and:

async task 1      async task 2
begin read
do work
                  begin read
                  do work

end read
                  end read

As the lock is only held and then released during "begin read" which is not awaitable, regardless of other yield locations, other reads will be able to begin, and those read txns could be sent across threads.

This means that only writes are the subject of this discussion. During a write we must have exclusive access to the data-structure to protect the elements within due to how we update. One can imagine a situation however where with async you would attempt something like:


async task 1      async task 2
begin write
start IO
                  begin write
                  start IO
end IO
commit write
                  end IO
                  commit write

Of course, this by definition can not exist. writes must be serialised else a fundamental property of these structures is violated. This means the true order of operations must be:

async task 1      async task 2
begin write
start IO
end IO
commit write
                  begin write
                  start IO
                  end IO
                  commit write

We do no care if read operations are interleaved in these async tasks, but we must not have two active writes. As you have noted though, at this time we are not able to send write txns over threads because we can't yield within them. Reading the tokio mutex documentation it states:

Contrary to popular belief, it is ok and often preferred to use the ordinary Mutex from the standard library in asynchronous code. This section will help you decide on which kind of mutex you should use.

The primary use case of the async mutex is to provide shared mutable access to IO resources such as a database connection. If the data stored behind the mutex is just data, it is often better to use a blocking mutex such as the one in the standard library or parking_lot. This is because the feature that the async mutex offers over the blocking mutex is that it is possible to keep the mutex locked across an .await point, which is rarely necessary for data.

I think that this is correct - if we allowed awaiting within the write txn, we risk write starvation. This also tends to indicate that perhaps there is some architectural needs that need to be addressed within your application. IE performing async IO when under the write lock.

Of course, today we would cause the async environment to dead lock if two spawned tasks attempted to take the same write txn on a structure as that would prevent yielding. This can be resolved in one of two ways:

As mentioned by tokio's mutex:

Additionally, when you do want shared access to an IO resource, it is often better to spawn a task to manage the IO resource, and to use message passing to communicate with that task.

This would mean you can have a queue or other method of message passing to a dedicated thread that only performs write operations.

The second option is wrapping tokio mutex for a writer task. This wrapping mutex would be async, meaning that attempting to lock could yield back to a task that already holds the mutex. Once the wrapping mutex is acquired, it is then guaranteed that no one else is holding the inner write txn, and you could use the blocking write txn begin, as this would be an uncontended path IE you are guaranteed to acquire the lock.

I think my recommendation at the moment would be to follow the second option, where you use a tokio mutex that allows awaiting to acquire the write txn, and then you work synchronously once you have acquired the write txn.

reference: https://docs.rs/tokio/0.2.22/tokio/sync/struct.Mutex.html

Firstyear commented 4 years ago

If you still disagree, I'd like a concrete code example of what you want to do under a write transaction that requires await.

theli-ua commented 4 years ago

One example would be if you're implementing some kind of key-value storage were part is persisted on a non-volatile media. One example would be classical LSM tree. Then you would like to have fully serializable write transaction so you lock out your in-memory (volatile) state at the beginning of your transaction. Then say you would like to implement classical insert/update operation where you would like to error out if key already exists/doesn't . In that case you would need to read data from your volatile part which would be your async operation across which you'd like to hold a guard.

Or say you have some kind of threadpool to which you'd like to send guard and some data for processing. Not necessarily even async.

As I was saying you could just provide a send_guard feature which would enable same feature on parking_lot.

Firstyear commented 4 years ago

Send_guard still doesn't solve the issue - Your example here is you want do do this:

let wr_txn = cowcell.write();
...
io.await;
wr_txn.commit();

Even in this example, send_guard does not help you because the moment you await on the io, another task may attempt to start the wr_txn which would potentially lead to dead lock.

saying this the following may work:

let tokio_wr = mutex.lock().await; // a tokio::sync::Mutex
let wr_txn = cowcell.write();
...
io.await()
wr_txn.commit();
// drop tokio_wr

You can test this pattern with cargo.toml containing:

concread = { version = "0.1", features=["parking_lot/send_guard"] }

But it's critical you provide the tokio_wr lock and that it lives longer than the cowcell. If that works, I could make a wrapper type that just does this to avoid the code duplication that would otherwise be necessary.

Firstyear commented 4 years ago

https://github.com/kanidm/concread/pull/41

kanidm / concread

Implement async mutex datastructure variants #38