haskell / core-libraries-committee

96 stars 15 forks source link

Deprecate `stToIO` and introduce `safeSTToIO` #119

Closed noughtmare closed 1 year ago

noughtmare commented 1 year ago

Summary

The existence of the safe stToIO function enables one to use ST computations in multiple threads. This means we technically don't have the guarantee that ST computations run in a single thread which most people expect and base their programs on. I propose address this by deprecating stToIO, introducing one safe but more restricted replacement and one function which does the same thing but is explicitly marked unsafe, and adding documentation about this issue.

Example

We might think ST computations are single-threaded an thus do not need to use atomic operations (in fact, STRef has no atomic modify operation), so we might write a programST function as follows:

programST :: STRef s Integer -> ST s ()
programST ref = do
  n <- readSTRef ref
  if n <= 0
    then pure ()
    else do
      unsafeIOToST yield
      writeSTRef ref $! n - 1
      programST ref

(The unsafeIOToST yield is only to make it more likely that weird concurrent interleavings occur)

But we can actually use this function in an unsafe way:

countdownST :: Integer -> IO Integer
countdownST n = do
  ref <- stToIO (newSTRef n)
  forkIO $ stToIO (programST ref)
  stToIO (programST ref)
  s <- stToIO (readSTRef ref)
  pure s

main :: IO ()
main = do
  putStrLn . show =<< countdownST 1000000

Compiling with ghc -O2 T.hs -threaded and running with ./T +RTS -N2 sometimes gives me unexpected results:

$ ./T +RTS -N2
129

$ ./T +RTS -N2
100
Click to expand full repro source code ```haskell import Control.Monad.ST import Control.Monad.ST.Unsafe import Control.Concurrent import Data.STRef programST :: STRef s Integer -> ST s () programST ref = do n <- readSTRef ref if n <= 0 then pure () else do unsafeIOToST yield writeSTRef ref $! n - 1 programST ref countdownST :: Integer -> IO Integer countdownST n = do ref <- stToIO (newSTRef n) forkIO $ stToIO (programST ref) stToIO (programST ref) s <- stToIO (readSTRef ref) pure s main :: IO () main = do putStrLn . show =<< countdownST 1000000 ```

Proposed changes

I propose to introduce two new functions:

safeSTToIO :: (forall s. ST s a) -> IO a
safeSTToIO (ST f) = IO f

unsafeRealSTToIO :: ST RealWorld a -> IO a
unsafeRealSTToIO (ST f) = IO f

Additionally, I propose to deprecate (but not remove) stToIO suggesting users to use the new safeSTToIO or unsafeSTToIO instead.

I do not propose removing stToIO because that would break too many existing packages. Perhaps we can consider that at a later time, when most of the ecosystem has adapted to this change.

Addtionally, documentation should be updated to explain the unsafety.

Impact

The results from this hackage search shows that there are 674 matches of the string stToIO in 95 packages. Some matches may be in comments, using the more complicated pattern stToIO\s*[$(]|=\s*stToIO in an attempt to exclude occurrences in comments yields 515 matches across 71 packages.

Adding a deprecation warning lets package maintainers update their packages at their own pace, so I expect the migration to be a smooth process.

See also

https://gitlab.haskell.org/ghc/ghc/-/issues/22780 https://gitlab.haskell.org/ghc/ghc/-/issues/22764#note_473050

Bodigrim commented 1 year ago

I am still working on a more detailed plan for smoothing that process. Suggestions are welcome.

I would introduce safeSTToIO :: (forall s. ST s a) -> IO a and mark existing stToIO as deprecated with a suggestion to use either safeSTToIO or unsafeSTToIO. Potentially after three releases one can remove stToIO completely, however I do not see much harm for it to remain indefinitely as long as all consumers were properly warned about its dangers.

nomeata commented 1 year ago

If we deprecate stToIo I'd not add safeStToIO. The function looks like it's doing something clever, when it really is just the naive pure . runST, and the deprecation could just tell people to use that. (Unless safeStToIO does something special, as proposed by @treeowl below.)

treeowl commented 1 year ago

I've deleted my earlier comments, because they're rambling and confusing. Here's a condensed version:

You've made a convincing case that the type is wrong. You have not (as far as I can see) made any case that the implementation is wrong. We should have

stToIO :: (forall s. ST s a) -> IO a
stToIO (ST f) = IO f

That maintains the main advantage of stToIO over runST: it's considerably easier for GHC to optimize around it. Aside from the performance issue, stToIO x = pure (runST x) has the wrong semantics. We want

stToIO m  ===  pure () >> runST (pure @IO <$> m)

That is, we want all the ST actions to run when stToIO m is run, rather than waiting for its result to be forced. We don't want them to run if stToIO m itself is forced, which is why the above law includes pure () >>.

noughtmare commented 1 year ago

My original rationale for changing the implementation was that using safe functions is always preferable over unsafe functions if possible, but you make a good argument that the unsafe IO constructor is the simplest way to get the correct semantics and it also requires less optimizations by the compiler.

tomjaguarpaw commented 1 year ago

I would introduce safeSTToIO :: (forall s. ST s a) -> IO a and mark existing stToIO as deprecated

Strong agreement. We should be very wary about changing existing APIs.

noughtmare commented 1 year ago

I have updated the proposal with your suggestions. Am I correct in assuming that simply adding a deprecation warning is not considered a breaking change and thus does not need an elaborate migration plan?

treeowl commented 1 year ago

I disagree with the proposal as currently written. I believe the type of the existing function should be changed, and require any code that breaks to be rewritten. There is no semantic change that could cause silent failure.

noughtmare commented 1 year ago

@treeowl do you think the violation of the invariant that ST computations always run in a single thread is enough reason to warrant breaking changes? I feel like the community wants less breaking changes, so we should have a strong argument for them.

tomjaguarpaw commented 1 year ago

I feel like the community wants less breaking changes, so we should have a strong argument for them.

Strong +1. This proposal is a no vote from me if it breaks an existing API.

treeowl commented 1 year ago

@tomjaguarpaw , I've noticed a pattern that you seem to like to vote no.

tomjaguarpaw commented 1 year ago

I'm not sure that's true but I do like to clearly announce conditions for my voting intentions in case it helps proposers spend their time more efficiently.

Bodigrim commented 1 year ago

I have updated the proposal with your suggestions. Am I correct in assuming that simply adding a deprecation warning is not considered a breaking change and thus does not need an elaborate migration plan?

Correct.

@treeowl do you think the violation of the invariant that ST computations always run in a single thread is enough reason to warrant breaking changes?

Even very well motivated breaking changes would require an impact analysis and patches for all affected packages. And still can be rejected by the committee afterwards.

arybczak commented 1 year ago

IIUC if type signature of stToIO is modified, then most of its existing applications should compile without needing changes 🤔 It's only "dirty" usages that let mutable references escape the computation that would stop working, right?

It's relatively easy to check if this is the case by patching base in e.g. 9.2.x branch and trying to build these 71 packages with it. Maybe even no patches would be required.

tomjaguarpaw commented 1 year ago

IIUC if type signature of stToIO is modified, then most of its existing applications should compile without needing changes thinking It's only "dirty" usages that let mutable references escape the computation that would stop working, right?

I believe that's correct.

It's relatively easy to check if this is the case by patching base in e.g. 9.2.x branch and trying to build these 71 packages with it. Maybe even no patches would be required.

What about software depending on this interface that is not publicly available?

treeowl commented 1 year ago

Some software is meant to be broken.

arybczak commented 1 year ago

What about software depending on this interface that is not publicly available?

My experience of bumping GHC version in large private codebases over the years is that if I had to update stToIO to unsafeSTToIO in a couple of places, it would represent a tiny fraction of work that usually needs to be done to complete the process. YMMV of course :)

For the record, I don't really have a strong opinion on how this gets solved and I generally dislike breaking changes in GHC :) I just think this case is obscure enough that it might be worth considering both paths.

Bodigrim commented 1 year ago

CLC is composed of individuals with various attitude and taste for breakage. There is a chance for a breaking proposal to be accepted, even if a non-breaking approach is feasible, but it could be discouraging for a proposer to be rejected after doing a solid amount of work on detailed impact analysis and preparing patches. That said, if someone is motivated enough to check which of 71 (or 95) packages would be actually broken, it would provide an important insight.

treeowl commented 1 year ago

While some people would like to get rid of Safe Haskell, the community has not yet decided to do so. Therefore, I think an argument can lean on that. Control.Monad.ST is declared Trustworthy. While the particular example @noughtmare has given doesn't violate fundamental safety, it would be very easy to write one that would. Most obviously, the exact situation @noughtmare describes could lead to array indices running off the end of an array, in Trustworthy code that (reasonably) relies on ST being properly protected. So we could:

  1. Change Control.Monad.ST to Unsafe, which would break code.
  2. Deprecate stToIO while changing Control.Monad.ST to Unsafe, remove stToIO (breaking code), and then restore Control.Monad.ST to Trustworthy.
  3. Just change the type of stToIO to match the one it should have had all along. However much code this breaks, it will surely be less than the other options.
noughtmare commented 1 year ago

could lead to array indices running off the end of an array

As far as I know, you'll still need to use other unsafe operations to be able to do that, so I don't see this as an immediate problem. The only way this could lead to unsafety in Safe Haskell is if someone based a trustworthy annotation on the assumption that ST is always single-threaded.

Also note that that assumption has never been officially and unambiguously written down somewhere as far as I know. The original paper only really claims that forall s. ST s a guarantees single threaded execution although I agree that the paper is very vague about it. Another hint is that Data.STRef does not export any atomic versions like Data.IORef does, but that is hardly enough to base that assumption on.

All in all, I think that part of this proposal is to officially and unambiguously acknowledge this vague assumption that ST is always single threaded. That should also be added to the documentation of Data.STRef and Control.Monad.ST.

arybczak commented 1 year ago

FYI from a brief look at the search link from OP sadly I can immediately see that some packages would no longer compile (e.g. atomic-counter, unboxed-ref, I think bytestring-builder too).

treeowl commented 1 year ago

could lead to array indices running off the end of an array

As far as I know, you'll still need to use other unsafe operations to be able to do that

I believe that is correct.

so I don't see this as an immediate problem.

I believe that is incorrect. Whatever the official documentation or papers may or may not say, I believe essentially everyone has always assumed strict ST to be single threaded from the moment it was invented. I would classify changing the type of stToIO as a bug fix.

treeowl commented 1 year ago

@arybczak Would any of them require non-trivial changes?

arybczak commented 1 year ago

@treeowl I don't think so. Changing stToIO to unsafeSTToIO should always do the trick.

treeowl commented 1 year ago

@arybczak I mean to adapt them to the new type of stToIO.

treeowl commented 1 year ago

How about a compromise?

  1. Add safeStToIO. Deprecate stToIO. Possibly add lessUnsafeSTtoIO (the same as the current stToIO) to Control.Monad.ST.Unsafe (see below).
  2. After a release cycle, change the type of stToIO to the safe one, and remove the deprecation. I know this is an abuse of the deprecation pragma, but so is leaving something deprecated without ever removing it.

Do we need lessUnsafeSTtoIO? I think it might be a good idea. It's a drop-in replacement for the current stToIO that's guaranteed to produce exactly the same code. If someone just swaps in unsafeSTtoIO, their Core will end up with unsafe coercion gunk that (despite various heroic measures) can affect optimization. We can try to paper over that using a rewrite rule

{-# RULES
"lessUnsafe" forall m. unsafeSTtoIO m = case m of ST f -> IO f
 #-}

that matches the special case of ST RealWorld, but that requires a staged NOINLINE on unsafeSTtoIO (a staged INLINE doesn't work for some mysterious reason), and there's no guarantee the rule would fire everywhere we'd like it to.

noughtmare commented 1 year ago

I think it is unfortunate that the result would be that safeSTToIO and stToIO are exactly the same function in the end. That could be confusing.

re-xyr commented 1 year ago

What is the signature of unsafeSTtoIO? I didn't see it mentioned. Nevermind, it's an existing function...

lessUnsafeSTtoIO sounds like a good idea to me.

phadej commented 1 year ago

While some people would like to get rid of Safe Haskell, the community has not yet decided to do so. Therefore, I think an argument can lean on that...

I disagree. You can write as "unsafe" code with using IORef etc. Data.IORef is Trustworthy.

Community also doesn't agree what safety in Safe Haskell is about, so using it as an argument is the same as not having an argument.

treeowl commented 1 year ago

@phadej IORef is used in IO, which is certainly not supposed to be single-threaded.

@noughtmare Duplication is unfortunate, but how confusing is it really when the documentation explains it?

noughtmare commented 1 year ago

It's not ideal but I guess it would be fine if properly documented.

Bodigrim commented 1 year ago

The problem with changing stToIO :: ST RealWorld a -> IO a to stToIO :: (forall s. ST s a) -> IO a is that it might be puzzling to migrate. E. g., one might have used stToIO several times in the same module so that some occurences comply with the new signature automatically, but others suddenly fail to compile with an error you never seen before.

Is stToIO really such a big footgun to justify a breaking change? We are not making the language terribly much safer, one can still easily jump between ST and IO without taking much care.

There are situations where breakage is inevitable, e. g., almost everything touching type classes is a nightmare. But breakage in a low-priority area, which could have been easily avoided, is not the best idea IMO.

dcoutts commented 1 year ago

Here's an anecdote from an existing library. The zlib library uses ST for its FFI binding layer, because morally zlib is pure in the ST sense (if you allocate, mutate and don't expose the mutable state, the result is all pure). Then it uses stTOIO to expose both an ST and an IO version of the API.

Given the ST versions

mkStateST :: ST s (Stream.State s)
runStreamST :: Stream a -> Stream.State s -> ST s (a, Stream.State s)

it defines the IO versions

mkStateIO :: IO (Stream.State RealWorld)
mkStateIO = stToIO Stream.mkState

runStreamIO :: Stream a -> Stream.State RealWorld -> IO (a, Stream.State RealWorld)
runStreamIO strm zstate = stToIO (Stream.runStream strm zstate)

and then uses these to define the (more directly useful to library users)

compressStreamST :: Stream.Format -> CompressParams -> CompressStream (ST s)
compressStreamIO :: Stream.Format -> CompressParams -> CompressStream IO

decompressStreamST :: Stream.Format -> DecompressParams -> DecompressStream (ST s)
decompressStreamIO :: Stream.Format -> DecompressParams -> DecompressStream IO

Note that under the current proposal these two uses of stToIO would have to become uses of unsafeStToIO rather than the safe one.

More generally this tells us that we cannot take a safe ST API and turn it into an IO API. This feels somewhat regrettable.

(Elsewhere I've also argued for the STRefs to not use memory barriers. Yes I see the problem.)

treeowl commented 1 year ago

Since time is getting quite short for the next GHC release, could we at least

  1. Add safeSTtoIO.
  2. Add documentation to stToIO explaining its concurrency risks?
Bodigrim commented 1 year ago
  1. We are already past the feature freeze for GHC 9.6.
  2. Documentation does not require CLC approval.
noughtmare commented 1 year ago

We are already past the feature freeze for GHC 9.6.

Is base frozen at the same time as GHC?

Bodigrim commented 1 year ago

Or even earlier.

Bodigrim commented 1 year ago

@noughtmare could you please come up with a specific proposal and a migration strategy (if any)?

noughtmare commented 1 year ago

I think the current proposal is good enough as it is now.

aadaa-fgtaa commented 1 year ago

It seems to me that the proposal lacks a precise description of unsafeRealSTToIO safety, which is a very important detail that isn't really obvious.

Currently, there is nothing really unsafe about stToIO — it is merely reinterpreting STRef operations into corresponding IORef operations, made possible by STRef and IORef being exactly the same thing under the hood. While it may be non-intuitive that STRefs can trigger data races, they are ultimately no more unsafe than the same races using IORefs. So, although the example program produces non-deterministic output, with today's ghc it always prints some integer, instead of segfaulting or formatting my disk.

But if at some point STRef operations would use non-atomic operations, as suggested by the proposal, it would be possible to break type safety using unsafeRealSTToIO or stToIO (if still available), causing segfaults or worse. It seems to me that this would be a huge problem for any code that needs to efficiently share the implementation of a mutable data structure between ST and IO, affecting, for example, primitive, vector and hashtables.

Consequently, I don't really understand what exactly is the motivation behind this change. Is the goal here to make STRef operations non-atomic? In that case, it seems to be a huge breaking change, one that cannot be generally solved by changing stToIO to unsafeRealSTToIO or other 'local' transformations. Or is the goal just to make the non-intuitive behaviour of stToIO explicit? In that case I find the name 'unsafe' a bit misleading, but unfortunately I cannot really propose a better alternative.

P.S. It is also worth noting that 4x slowdown cited in the proposal isn't really relevant here — it was caused by ghc redundantly using sequentially-consistent memory ordering where acquire-release would suffice. In particular, on x86-64 'ordinary' memory operations are always acquire-release atomic (when the address is properly aligned, which is hopefully always the case for ghc), thus on x86-64 using non-atomic operations shouldn't make any difference. I don't have any numbers for other architectures through.

noughtmare commented 1 year ago

unsafeRealSTToIO is safer than unsafeSTToIO because it does not run the risk of violating referential transparency, but it is still unsafe in the sense that code which was written with the assumption that it would run in a single thread can now be run concurrently in multiple thread which can be a source data races that are not possible without unsafeRealSTToIO. In that sense the current stToIO, which is the same as unsafeRealSTToIO, is unsafe too.

But if at some point STRef operations would use non-atomic operations, as suggested by the proposal

I'm not proposing to change any semantics at all. Only documentation, deprecation of one function, and the addition of two functions.

Currently, STRef only has non-atomic operations. There are no counterparts to atomic...IORef in Data.STRef. Presumably because there has always been an assumption that ST computations run in a single thread and thus data races cannot happen. But I don't see how that can cause "segfaults or worse".

noughtmare commented 1 year ago

P.S. It is also worth noting that 4x slowdown cited in the proposal isn't really relevant here

Here's a benchmark which shows that atomic operations are about 4-5x slower than non-atomic operations:

import Data.STRef
import Control.Monad.ST
import Control.Concurrent
import Test.Tasty.Bench
import Data.IORef

programST :: STRef s Integer -> ST s ()
programST ref = do
  n <- readSTRef ref
  if n <= 0
    then pure ()
    else do
      writeSTRef ref $! n - 1
      programST ref

countdownST :: Integer -> IO Integer
countdownST n = do
  ref <- stToIO (newSTRef n)
  forkIO $ stToIO (programST ref)
  stToIO (programST ref)
  stToIO (readSTRef ref)

programIO :: IORef Integer -> IO ()
programIO ref = do
  b <- atomicModifyIORef' ref (\x -> if x <= 0 then (x, False) else (x - 1, True))
  if b then programIO ref else pure ()

countdownIO :: Integer -> IO Integer
countdownIO n = do
  ref <- newIORef n
  forkIO $ programIO ref
  programIO ref
  readIORef ref

main :: IO ()
main = defaultMain
  [                         bench "non-atomic" $ whnfIO $ countdownST 1000000
  , bcompare "non-atomic" $ bench "atomic"     $ whnfIO $ countdownIO 1000000
  ]

Note that I had to use IORef for the atomic operations because STRef doesn't have them.

The results on my machine with GHC 9.4.3 (which doesn't suffer from that bug you mention) are:

  non-atomic: OK (0.44s)
    7.06 ms ± 553 μs
  atomic:     OK (4.74s)
    36.7 ms ± 3.5 ms, 5.19x
aadaa-fgtaa commented 1 year ago

Currently, STRef only has non-atomic operations.

Ah, I seem to have misunderstood the meaning of the 'atomic' here. The motivation behind my comment was the fact that in terms of C11 memory model, ghc implements 'non-atomic' STRef/IORef operations with acquire-release atomics(!) to ensure type safety. Essentially, acquire-release ordering guarantees that if some thread acquire-loaded some memory location and observed data written by some release-store, then all stores preceding that release-store are also seen by said thread. In case of ghc, this ensures that no thread can observe uninitialised memory: without acquire-release semantic, a thread could load some object from IORef/STRef without observing the initialisation of that object. Subsequent pattern matching on said object would result in some garbage value being loaded, causing essentially undefined behaviour.

That said, I thought originally that by

I propose to restrict the type of stToIO to make it safe to use non atomic operations in ST computations.

you meant avoiding acquire-release atomics in favour of non-atomic operations for STRef, which would indeed only be 'segfault-safe' if there is no safe way to share STRef between threads. But now I realise that you probably meant data race safety instead.

So, the answer to my question is that the only unsafe thing about unsafeRealSTToIO is that someone might assume that data racing on STRef is impossible, right?

aadaa-fgtaa commented 1 year ago

Here's a benchmark which shows that atomic operations are about 4-5x slower than non-atomic operations:

Yes, I've definitely misunderstood your use of 'atomic' here. My point was that there are no non-atomic memory operations under the hood here, as both readSTRef and writeSTRef use acquire-release atomics (which are just plain loads/stores on x86-64, but not on armv8, for example).

noughtmare commented 1 year ago

Ah, I agree that my summary was misleading. I've changed it to focus more on the real problem and solution.

treeowl commented 1 year ago

An annoying problem with the current state of affairs: we expect that in ST, forall m. m >> _|_ = _|_. This is something GHC could use in demand analysis, though I don't think it does. It's also just a nice law to have. But with unrestricted stToIO, that law simply doesn't hold.

Bodigrim commented 1 year ago

@noughtmare could you please raise a draft MR, which we can vote on?

hasufell commented 1 year ago

@tomjaguarpaw , I've noticed a pattern that you seem to like to vote no.

You can count me in, unless:

  1. there's a deprecation period of, say, 3 years
  2. there are multiple volunteers who will prepare patches for the entirety of stackage during GHC release candidate phase
  3. the breaking change is well worth it

Exceptions may apply, but that's the baseline I'm looking for.

hasufell commented 1 year ago

While some people would like to get rid of Safe Haskell, the community has not yet decided to do so.

SPJ himself has laid it to rest:

https://discourse.haskell.org/t/deprecating-safe-haskell-or-heavily-investing-in-it/5489/64?u=hasufell

So, without any implied criticism of David and David (and I was a co-author on the original paper too) I think it would be plausible to lay it to rest.

hasufell commented 1 year ago

Since time is getting quite short for the next GHC release, could we at least

  1. Add safeSTtoIO.
  2. Add documentation to stToIO explaining its concurrency risks?

This seems like a good initiative.

parsonsmatt commented 1 year ago

My thoughts:

Single Threaded?

ST is an initialism for "State Thread." The documentation pretty clearly says that s refers to the "thread" of execution that the operation resides in:

A computation of type ST s a returns a value of type a, and execute in "thread" s.

It serves to keep the internal states of different invocations of runST separate from each other and from invocations of stToIO.

So, to what extent are we saying that you can guarantee that a STRef is only used in a single thread, and therefore, there's no data race?

With runST :: (forall s. ST s a) -> a, you cannot pass in an STRef s a - the skolem would escape and make a mess of your pots and pans.

With stToToIO :: ST RealWorld a -> IO a, can we share variables? Yes, but only if they are STRef RealWorld _. And you can only create those STRef if you're using stToIO to pick ST RealWorld _.

Suppose you write stToIO (newSTRef 'a') :: IO (STRef RealWorld Char). You cannot pass this STRef into a function that eventually calls runST, so any function with runST is "safe" from this break. This means that the caller - the person that writes runST or stToIO - is able to make the choice on whether or not you have STRef RealWorld or not in their specific use of a computation.

But the author of any ST code receives no guarantees. For a function STRef s a -> ST s b, you don't get any guarantees local to your code that your STRef cannot be changed out from under you. However, the author can work-around this by never accepting STRef as a parameter in their entry point.

For someone to be bit by this particular problem, they would need to:

  1. Define an API with the shape STRef s a -> ST s b - accepting a foreign STRef and returning an arbitrary ST action.
  2. Have a particular requirement that the STRef not be visible to anyone else, despite accepting it as a parameter.
    • But this particular requirement is very odd - why would they accept an STRef s a when an a is the same thing?

Then, the caller would have to knowingly provide an STRef Realworld _ into that function, mutate it in another thread, and encounter the bug. This would require two different people to make mistakes about the API in fairly confusing ways.

To me, this seems like a really far fetched scenario. Has anyone actually been bitten by this?

Costs and Benefits

So, the benefit is that the above scenario becomes impossible without calling an unsafe function - and that's cool, I am generally in favor.

However, the cost of this fix appears to be incurred by folks calling stToIO. On the "good path," they'll see a deprecation/warning message, pointing them to use either safeStToIO or unsafeStToIO (with unsafeStToIO only being necessary in the case that you're letting that s type escape).

On balance, it seems like both costs and benefit are fairly modest, with costs slightly outweighing benefits.

hmm

safeStToIO appears to be equivalent to pure $! runST k, which is more general - Applicative f => (forall s. ST s a) -> f a. This generalization may make it more useful, though it would be good to see documentation and examples on why you'd want that over pure $ runST k.

I think it definitely makes sense to update the docs and point out this potential situation. Providing the {safe,unsafe} variants in preparation for a future deprecation/removal cycle would give folks a release that doesn't warn. Then we can decide to deprecate/delete/etc as time passes.