Transaction.RunConstruct

jam40jeff commented 6 years ago

@the-real-blackh I believe that (other than what we have created issues for already) Transaction.RunConstruct is the only piece that differs between C# and the other code bases.

Do you think it should be ported to the other languages? The implementation is a bit involved, but it's only one added method to the public API. I think it can provide a lot of value in specific use cases. If you'd like, I can give more details on what it is and why it can be of use.

the-real-blackh commented 6 years ago

I had a look at the comment. Can you explain a little further? I don't quite understand why it's good.

jam40jeff commented 6 years ago

Sure, I'll write up an explanation of where and how I have used it and what it provided me over Transaction.Run() when I can get to a keyboard.

jam40jeff commented 6 years ago

You could say that transactions have two main "phases": (1) before Close() and (2) during Close(). For the sake of this discussion, I'll call them "construction" and "execution".

During the construction phase, the initial trigger is executed, which sets the transaction in motion. This can be anything from a single Send() call on a sink to a user-defined lambda passed to Transaction.Run. After the construction phase completes, the prioritized queue holds some set of actions to run. The execution phase runs them. It is itself composed of sub-phases, but its main responsibility is to execute all prioritized actions, which can themselves add more prioritized actions to the queue.

Currently, pseudocode for a transaction would look something like the following:

lock(TransactionLock)
{
    runConstructionPhase();
    runExecutionPhase();
}

Most transactions (especially those started with a single Send()) spend most of their time in the execution phase.

However, since it is recommended to create FRP logic within a transaction (to allow for loops and to prevent missed first events), a common paradigm for a transaction is as follows:

// run some IO and obtain a result
Transaction.RunVoid(() =>
{
    // use result to construct an FRP graph (or collection of objects each containing FRP graphs)
    // send these constructed objects through a sink
});

In this usage, the construction phase can be a significant portion of the total time spent holding the transaction lock. From some rudimentary performance analysis of code I have written, I have seen anywhere from ~50% to ~90% of the transaction spent in the construction phase when a large number of FRP primitives are constructed within a single transaction.

What I realized is that (almost) nothing that happens in the construction phase actually requires locking out possible transactions on other threads. The only time global state is modified or accessed is during the execution phase, with the exception of Sample(). Thus, as long as we refrain from using Sample(), we can actually have a transaction which looks like the following:

runConstructionPhase();
lock(TransactionLock)
{
    runExecutionPhase();
}

This still brings all of the benefits of the regular transaction. All prioritized actions are still delayed until the execution phase, and all state changes still happen within the lock. The big advantage is that the construction phase does not hold the lock. Thus, we have locked out other threads for less time, and perhaps more importantly, we have allowed for other threads to trigger a cancellation of the transaction.
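The two orderings can be illustrated with a toy model. This is only a sketch under stated assumptions, not Sodium's implementation: ToyTransaction, Enqueue, RunRegular, RunDeferredLock and LockHeldByCurrentThread are all hypothetical names, and the prioritized queue is reduced to a rank-ordered dictionary of actions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Toy model only: every name here is hypothetical and not part of Sodium.
public sealed class ToyTransaction
{
    private static readonly object TransactionLock = new object();

    // Prioritized queue: lower rank runs first, FIFO within a rank.
    private readonly SortedDictionary<int, Queue<Action>> queue =
        new SortedDictionary<int, Queue<Action>>();

    public static bool LockHeldByCurrentThread => Monitor.IsEntered(TransactionLock);

    // Called during construction (or by a running action) to schedule work.
    public void Enqueue(int rank, Action action)
    {
        if (!queue.TryGetValue(rank, out var bucket))
        {
            bucket = new Queue<Action>();
            queue[rank] = bucket;
        }
        bucket.Enqueue(action);
    }

    // Execution phase: drain the queue; actions may enqueue more actions.
    private void Execute()
    {
        while (queue.Count > 0)
        {
            var lowest = queue.First();
            Action next = lowest.Value.Dequeue();
            if (lowest.Value.Count == 0) queue.Remove(lowest.Key);
            next();
        }
    }

    // First form: the lock covers construction and execution.
    public static void RunRegular(Action<ToyTransaction> construct)
    {
        lock (TransactionLock)
        {
            var t = new ToyTransaction();
            construct(t);
            t.Execute();
        }
    }

    // Second form: construction runs unlocked, so it must not read shared
    // state (the Sample() restriction); only execution holds the lock.
    public static void RunDeferredLock(Action<ToyTransaction> construct)
    {
        var t = new ToyTransaction();
        construct(t);
        lock (TransactionLock)
        {
            t.Execute();
        }
    }
}
```

In both forms the queued actions run only inside the lock; the difference is whether the thread already holds the lock while the construct callback is running.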

This brings me to a real-world use case. Let's say we have something that executes based on a string value in a text box. In real time as the user types, it triggers a stream which kicks off an IO process on a separate thread. This thread does some work and then constructs many FRP objects representing the results of the operation. Each new keypress cancels the previous operation, as the new results will preempt the old ones anyway. Thus, we have something like the following:

StreamSink<string> filterString = new StreamSink<string>(); // receives values from text box
StreamSink<...> latestResults = new StreamSink<...>();
filterString.Listen(s =>
{
    // do the IO and the FRP construction off the UI thread
    new Thread(() =>
    {
        var results = ...; // get data based on s

        // regular transaction: the lock is held for the whole lambda
        Transaction.RunVoid(() =>
        {
            var o = ...; // convert results to FRP objects

            latestResults.Send(o);
        });
    }).Start();
});

Although this will function and most of the work is done on a non-UI thread, it can cause the application to not seem very responsive. Subsequent key presses will hang the UI thread while it waits to obtain the transaction lock so it can send through a new value. What we really need is the ability to send in a new request and cancel the previous one.

In order to do this, we want to prevent taking the transaction lock until we absolutely need it. Since the code:

var o = ...; // convert results to FRP objects

latestResults.Send(o);

takes up at least 50% to 75% of the transaction by time, it would be much better if it ran before the lock was taken. This is exactly what Transaction.RunConstruct does. It implements the second form above, in which the transaction lock is taken only when the execution phase is about to begin.

If we combine this with some logic to cancel previous requests when new ones start, as well as logic within the Transaction.RunConstruct lambda to throw on cancellation, we now have the ability to cancel our operation even while we're constructing FRP objects, simply by sending a new value through filterString. The window where key presses are blocked has been greatly narrowed, so the app is much more responsive.
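One way the cancellation logic could look is sketched below. Everything in it is an assumption made for illustration: LatestRequestOnly and FilterExample are hypothetical helpers, the runConstruct parameter is a stand-in for a deferred-lock transaction entry point (its real signature is not shown in this thread), and the loop body is a placeholder for building one FRP object.

```csharp
using System;
using System.Threading;

// Hypothetical helper: starting a new request cancels the previous one.
public sealed class LatestRequestOnly
{
    private CancellationTokenSource current;

    public CancellationToken Begin()
    {
        var next = new CancellationTokenSource();
        // Atomically swap in the new source, then cancel the old one.
        // The old source is left undisposed so workers can still poll its token.
        CancellationTokenSource previous = Interlocked.Exchange(ref current, next);
        previous?.Cancel();
        return next.Token;
    }
}

public static class FilterExample
{
    // runConstruct stands in for a transaction whose construction phase
    // runs without the transaction lock (an assumption, see lead-in).
    // Returns false if the request was cancelled before anything was sent.
    public static bool TryBuildAndSend(
        CancellationToken token,
        int itemCount,
        Action<Action> runConstruct,
        Action<int> send)
    {
        try
        {
            runConstruct(() =>
            {
                int built = 0;
                for (int i = 0; i < itemCount; i++)
                {
                    // Throw between construction steps if a newer request arrived.
                    token.ThrowIfCancellationRequested();
                    built++; // placeholder for constructing one FRP object
                }
                send(built);
            });
            return true;
        }
        catch (OperationCanceledException)
        {
            return false;
        }
    }
}
```

In the text box scenario, the listener on filterString would call Begin() for each new value and hand the returned token to the worker thread, so a later keypress cancels an earlier request that is still constructing.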

I will refer to this second type of transaction as "construction transactions" and the first as "regular transactions". Ideally, all transactions would be construction transactions, but there are a few limitations. The biggest one is that Sample() may not be used in a construction transaction. Therefore, we need both types of transactions to cover different use cases.

One other option would be to hide the fact that there are two types of transactions from the user: always start a transaction as a construction transaction, and lift it to a regular transaction only if the use of Sample() requires it. I actually prefer this option, as it would keep things cleaner from an API perspective, but I would like to hear your thoughts on this.
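The lifting idea can be sketched as a transaction that takes the lock lazily. This is a toy model under assumptions, not the Sodium code: LazyLockTransaction, EnsureLocked, IsLifted and LockHeldByCurrentThread are hypothetical names, and EnsureLocked marks the point where an operation such as Sample() would need the lock.

```csharp
using System;
using System.Threading;

// Toy model only: every name here is hypothetical and not part of Sodium.
public sealed class LazyLockTransaction
{
    private static readonly object TransactionLock = new object();
    private bool lockTaken;

    // True once this transaction has been lifted to a regular transaction.
    public bool IsLifted => lockTaken;

    public static bool LockHeldByCurrentThread => Monitor.IsEntered(TransactionLock);

    // Anything that must observe shared state (Sample() in this discussion)
    // would call this first; the lock is taken at most once.
    public void EnsureLocked()
    {
        if (!lockTaken) Monitor.Enter(TransactionLock, ref lockTaken);
    }

    public static void Run(Action<LazyLockTransaction> construct, Action execute)
    {
        var t = new LazyLockTransaction();
        try
        {
            construct(t);     // unlocked unless construct calls EnsureLocked()
            t.EnsureLocked(); // execution always runs under the lock
            execute();
        }
        finally
        {
            if (t.lockTaken) Monitor.Exit(TransactionLock);
        }
    }
}
```

A construct callback that never calls EnsureLocked() behaves like a construction transaction, while one that calls it is lifted at that point and holds the lock for the rest of the run.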

(To keep the discussion above simple, I didn't mention that there's a second method which may only be used in a regular transaction: AttachListener (addCleanup in Java). I think this restriction could be worked around by using a different lock just for this method.)

jam40jeff commented 6 years ago

The more I think about this, the more I think the behavior I mentioned above should be automatic within the single existing Transaction.Run method. Rather than needing a separate API method, it is really just an implementation optimization. The only externally visible change is that the moment the lock is actually taken is hidden from the user (it is no longer taken as soon as Transaction.Run is called), but I don't see this as a problem. Transaction.Run is meant to provide the guarantees described in the denotational semantics, not to provide synchronization. Languages already have other tools for that.

jam40jeff commented 6 years ago

I implemented the optimizations I discussed above as part of #145. Merging the optimization into Transaction.Run() actually simplified the transaction code quite a bit, and I am happy with it. If you have a chance to review, please let me know if you have any questions or concerns.

SodiumFRP / sodium

Transaction.RunConstruct #142