Reconsider strictness in concurrent primitives

In one Haskell course, it was taught that you should never use modifyTVar because it leaks memory, always use modifyTVar' instead. This way, we have only the latter provided by Universum.

But gaining a bit of experience with concurrent primitives I started doubting whether this distinction is justified. I'm still looking for the truth in this question, but want to write down some thoughts.

First of all, note that STM has a quite special model, there are cases that it suits well and cases when it does not. I'm not well-acknowledged with how it is implemented inside, but probably the following statements would be close to real life.

If STM computation retries, it has to perform entirely again until the next retry or success.
When some STM variable changes, all computations involving the work with that variable which are "waiting in retry" wake up simultaneously and try to perform on available cores. This is sort of similar to notifyAll in java.

Apparently, in some cases, STM will be significantly less efficient than IO-based primitives (locks or CAS+locks like in unagi-chan package). We like STM because it provides much more control, but I have witnessed an attempt to use TQueue under a high load which caused an excessive number of STM retires (much higher than the number of successful completions), leading to enormous CPU consumption.

So note the problem with modifyTVar' - it performs the evaluation right during the transaction, which increases the chances of concurrent modification and further retry. Instead, we could only update thunks during the transaction (very quick) and perform evaluation right before, right after the transaction or even in a separate spark. Regarding the spark option, the literature says that evaluation of a thunk creates a black hole, and any other thread trying to evaluate that thunk in parallel blocks, but the first thread which performs evaluation gains a high priority in its Capability so is likely to finish the computation soon. This option seems to make sense when the computation is large and updated are sporadic.

Also, it's worth thinking about what cases are justified for using STM. It looks like queues are something you never want to have in STM under a high-load, better use unagi-chan package. For control structures which are updates several times during application lifetime, performance or memory leaks do not matter. Hopefully, practice shows what is better to be used here.

serokell / universum

Reconsider strictness in concurrent primitives #235