Open aspiwack opened 4 years ago
The sort of essential characteristic of Source
is that it's a pure member of the LinHask
category. In the sense that there is no (total) inhabitant of Unrestricted Source
. The rest matters less, if source is to be a tool for building uniqueness abstractions.
Now that I think of it, the objection that I wrote just 5min ago above about Dupable
is a bit silly. Even it it wasn't, whenever making a library which uses uniqueness for correctness, you will need unsafe things, since your low-level implementation doesn't have uniqueness anyway. And Dupable
is not incompatible with uniqueness: linear mutable arrays can be made Dupable
, it'll happen by copy, rather than sharing, is all. They still have the uniquess guarantee.
But, more importantly, there are other ways to create unique thingy. For instance, if I have any guaranteed unique piece of data, I can create a new linear mutable array, and it will remain unique. And so there starts to be a bit of an n*m problem here. Where each type of unique pieces of data need many creation functions.
I'm failing to see what the problem is. Is there some boilerplate that you want to reduce? I think I would need an example.
The “problem” is that there are many contexts in which I could create a unique thing.
Say that I have been diligent, and I wrote a mutable array library with both
newArray :: Int -> Linear.IO Array
withArray :: Int -> (Array #-> Unrestricted a) #-> Unrestricted a
Let's leave aside that it's a bit silly to have to write both these function. (in my proposed scheme they are trivial instances of a more general function. Or such is the hope anyway).
But then you come along, and you create a new library with a new monad, let's call it Oy
. As it happens, the Oy
monad would work perfectly well for defining linear mutable array. But, then, I have no knoweldge (yet), of the Oy
monad. And maybe you created it for a purpose distinct from arrays. And so there is no oyArray :: Int -> Oy Array
anywhere. Then Utku comes along, and needs this oyArray
function. There is no safe way to define this function in this situation.
On the other hand, if we had these Source
things, then when creating the monad, you would just have to ask yourself: does this monad support Source
. And then, you would be supporting Array
-s for free, as well as any other such type.
Does defining newArray :: Linear.Monad m => Int -> m Array
help? If the monad is only used for uniqueness, newArray
could use unsafePerformIO
and not impose any particular monad at all.
hm, and then what if we want to create an array where there is not a monad? :thinking: I think I'm starting to understand this issue :)
That wouldn't be correct anyway, since with this definition getIdentity . newArray @Identity
would have type Int -> Array
, and this is wrong (it gives me no way of imposing uniqueness). Only specific monads will work.
Is there an abstraction that we can model as a type class here? A type class with meaning (e.g. via laws) is better than convention.
Maybe? That would actually be a helpful item, since all the convenience function would work for every source bearing type like this.
Something like
data Source
instance Dupable Source
newSource :: IO Source
withSource :: (Source #-> Unrestricted a) #-> Unrestricted a
-- Not an actual suggestion for names
class SourceBearer a where
grab :: Source #-> a
share :: a #-> (Source, a)
new :: SourceBearer a => IO a
new = grab <$> newSource
with :: SourceBearer a => (a #-> Unrestricted a) #-> Unrestricted a
with scope = withSource (scope . grab)
grabShare :: (SourceBearer a, SourceBearer b) => a #-> (b, a)
grabShare = share . grab
Though it's hard to give some real meaning to the type class. Besides it being useful. There probably ought to be an interaction between SourceBearer
and the Dupable Source
instance. Such as share . (\(consume -> (), x) -> x) = id
. But I'm not sure that such laws are supremely informative.
class SourceBearer a where
grab :: Source #-> a
share :: a #-> (Source, a)
It doesn't look like grab
allows to create an array as we need to also provide a size to create it, which kind of invalidates the signature of the class.
grab :: Source #-> Int -> Array
share :: Array #-> (Source, Array)
That's a fair point.
So, in order to fill the blanks, we would need an extra type family attached to the type class. And have grab
take extra argument of said type.
It's a tad convoluted, though. I'm not sure it's worth the type class. I don't know.
I had a problem today that I think is relevant to this issue.
Most operations on linear mutable vectors rely on the vector being unique, so the usual way to create a vector is to take a linear callback (alloc :: ... -> (Vector a #-> Unrestricted b) #-> Unrestricted b
).
Today, I was implementing the Monoid
instance for linear mutable vectors. Implementing mappend
went well, since we already got two vectors, we can create a new one using the fact that we have access to some unique values.
However, I had a problem when implementing mempty :: a
, because it doesn't provide a way to create a unique value. With the initial suggestion on this thread, we could have something like mempty :: Source #-> a
instead. Otherwise, I need something like mempty :: forall b. (a #-> Unrestricted b) #-> b
which is not pretty. Or we somehow have to accept that linear mutable vectors (at least with the current representation) are not monoids.
@utdemir this is an interesting enough question that we ought to make a separate issue to discuss it. This comment will act as a link between the two issues.
While everyone may already realize this, I think it's worth mentioning the prior art in GHC. We have
newtype ST s a = ST (State# s -> (# State# s, a #))
IO ~= ST RealWorld -- I have serious qualms about this one. See below.
Most unique things can be created within ST s
for any s
. A few (e.g., weak pointers) can only be created in IO
. We have
runST :: (forall s. ST s a) -> a
The linear type system seems to allow us (but not force us) to drop the quantification, so as I understand it, we could have
data FakeWorld
newtype ST a = ST (State# FakeWorld %1-> (# State# FakeWorld, a #))
runST :: ST (Ur a) %1 -> Ur a
Now
withThingy :: (Thingy %1-> Ur a) %1-> Ur a
withThingy f = runST (mkThingy >>= (pure $!) . f)
So as long as we have
mkThingy# :: State# s %1-> (# State# s, Thingy #)
we can get newThingyST
, newThingyIO
, and withThingy
in one go. In (pseudo?)code,
withAnything :: ST thing %1-> (thing %1-> Ur a) %1-> Ur a
withAnything make f = runST (make >>= (pure $!) . f
Can we have source carriers? Well, yeah, a source carrier supports
share :: thingy %1-> (# State# s, thingy #)
But is that really useful? I'm not totally convinced. Any time we would want it, we could have just captured the token resulting from creating the thingy. The only thing you could do with it is invoke another creation function, but you'll always be in some kind of monadic context then anyway.
The qualms: I think GHC is simply wrong to make IO ~ ST RealWorld
, because ST
and IO
are fundamentally rather different things, have different strictness semantics, and could, in theory, be implemented in totally different ways. So I see mkThingy#
as solely a convenience within the world of GHC. To put it another way, I think PrimMonad
should be imagined conceptually as having one method for each primop shared between ST
and IO
.
Question: is the thread-indexed ST
(call it STI
) still useful in a linearly typed context? I would imagine so.
newtype STI s a = STI (State# s %1-> (# State# s, a #))
runST :: (forall s. ST s a) %1-> a
That is, the result doesn't need to be unrestricted.
Ah, I guess share
probably is more useful than I'd imagined in context.
class Tok a ~ s => Stately s a where
type Tok a :: Type
share :: a %1-> (# State# s, a #)
realizeWith :: Stately s a => a %1-> (a %1-> STI s b) %1-> b
realizeWith a (STI f)
| (# s, a' #) <- share a
, (# s', b #) <- unSTI (f a') s
, () <- consumeState# s'
= b
Hrmm.... What I was imagining for multiple state threads won't work quite right. GHC has
withForeignPtr :: ForeignPtr a -> (Ptr a -> IO b) -> IO b
That's obviously wrong, because b
could retain a reference to the pointer. Linear types can make that safe with something like
withForeignPtr :: ForeignPtr a %1-> (Ptr a -> IO (Ur b)) %1-> IO (Ur b)
But this is too rigid—b
can't include anything restricted. We really only meant to exclude the pointer! With a state thread-indexed version of IO
, we'd have
withForeignPtr :: ForeignPtr s a %1-> (forall t. ((IOI s %1-> IOI t) -> Ptr t a -> IOI t b) %1-> IOI s b
Suppose something in the inner action needs to create an Array s c
. It can't do it "natively" in IOI t
. But it can use the function it's given to "lift" actions into the outer context.
Another option might be to make realizeWith
the method instead.
class Tok a ~ s => Stately s a where
type Tok a :: Type
realizeWith :: a %1-> (a %1-> STI s b) %1-> b
This maintains the balance of State#
tokens. A "closed" operation must take one state token and produce it at the end. For a moment, I thought maybe this would obviate the rank-2 quantification, but it doesn't. We need to make sure not only that some state token is turned in at the end of the withForeignPtr
action, but that the right one is turned in.
https://github.com/tweag/linear-base/issues/130#issuecomment-1095360774 The variable in ST may as well be a type level witness of the particular runST call (and hence state thread), since from the users perspective it could be instantiated to any type parameter. This is what allows you to distribute STRefs and other things tying them to the same thread, and in fact you can even hold onto them through a runST call. There is a non-ST example of the quantification trick called the key monad. The essential operations as quoted from the paper of the same name:
newKey :: KeyM s (Key s a)
testEquality :: Key s a → Key s b → Maybe (a :∼: b)
runKeyM :: (∀ s. KeyM s a) → a
s is always nominal and usually (always?) an infinite kind, really :∼: could be :~~: - this doesn't use typeable at all or fix the kind of the last key parameter. Key is probably an int and KeyM effectively State Int. You can implement ST (very inefficiently) using this. The last parameter in key is usually nominal but in the ST analogue it would be representational (you can coerce STRefs). I have actually used something like this with linear types, but keeping forall s is essential to that. If the key was linear, you could even change the type of an existing key by returning it (otherwise not safe). You don't need Ur always in these unique continuations, because you don't always need to destroy the thing at the end of the scope, but you still benefit from knowing it is a unique value of the type.
https://github.com/tweag/linear-base/issues/130#issuecomment-1152974825
I only just realised that not returning Ur isn't actually sound even if you don't need any clean up, since you can then duplicate the applied scope function e.g. (x = scope id), if x is evaluated once then used multiple times we lose uniqueness.
I want to share some further development on this issue. In that, I think that I know the solution but it requires GHC support and will take a little time to happen. But it's a convincing enough solution that I do feel comfortable calling it the solution, indeed. This solution is courtesy of the Linear Constraint paper.
Without further ado.
Let's imagine that we have linear constraints. That is I can write C %1 => T
, and C
is inferred, like an ordinary type-class constraint. But it is only allowed to be used linearly. For instance the following would be ill-typed because linearity isn't respected:
useC :: C %1 => Int
bad :: C %1 => (Int, Int)
bad = (useC, useC)
This is what the paper is about (how do you formalise this, and how do you solve for linear constraints).
Now, let's imagine that we have a special Linearly
constraint (also described in the paper). It comes with two built-in functions:
scoped :: (Linearly %1 => Ur a) %1 -> Ur a
io :: IO (Dict Linearly)
Now, this mirrors the Source
token from earlier. So we can user Linearly
instead of a token. The benefit is that Linearly
is discharged automatically by GHC, I don't have to deal with this.
That is, instead of
newSource $ \token ->
let x = Array.new token 42 in
…
I get to write
Linearly.scoped $
let x = Array.new 42 in
…
It's not super spectacular with just one Array.new
, though remember that the starting point, for all this, is that we want to be able to use a single scope for several arrays (or unique-pointer-like things in general).
To complete this story we need one more bit of magic. This one is a little harder, to implement. Specifically, how do we actually support writing two Array.new
in the same scope? In my original post, I proposed having Source
be Dupable
(together with a function produceSource :: Array a %1 -> (Source, Array a)
, which is more manual management that we want to avoid). The solution is to make Linearly
dupable as well. Though we want this to be managed by the compiler. That is, we just make it possible to write
Linearly.scoped $
let x = Array.new 42 in
let y = Array.new 57 in
…
This is desugared to:
Linearly.scoped $ \token ->
let (token1, token2) = Linearly.dup2 token in
let x = Array.new token1 42 in
let y = Array.new token2 57 in
…
(implementing this in the constraint solver is a little tricky, but as a naive strategy to see that it work, we can imagine generating a Linearly.dup2
everywhere that there are two different subterms, use a corresponding Linearly.consume
to remove the token
when they turn out to be not needed. This is quite wasteful though, so we will have to implement this in a more economical manner, which isn't as obvious, but isn't particularly scary either)
What do we replace produceSource
with? Nothing! Just add Linearly %1 =>
to your function if you are going to make new arrays in it. GHC will handle producing enough tokens for you.
EDIT: No, what I wrote here is incomplete. See bottom.
The following is already possible today, by using the uncommonly-used ImplicitParameters
extension:
{-# LANGUAGE LinearTypes #-}
{-# LANGUAGE ImplicitParams #-}
{-# LANGUAGE RankNTypes #-}
module Linearly (Linearly, linearly) where
-- Linearity token type. Intentionally does not implement `Movable`. Its constructor is private.
data Linearly = Linearly
linearly :: Movable b => ((?lin :: Linearly) => () %1 -> b) %1 -> b
linearly f = let ?lin = Linearly in f ()
This allows any function which would be unsafe to use in a non-linear context, to restrict itself to contexts in which the ?lin :: Linearly
implicit parameter is bound. In other words: it can only be used inside the linearly
scoping function and since that restricts return types to only types implementing Movable
, linear-only types cannot escape their linear context.
(And the Linearly
token itself cannot escape the context for the same reason.)
Functions which really only are allowed to work in a linear context (AKA functions which are sound only as long as their inputs/output are not aliased) can be constrained to work only under linearly
like so:
newMArray :: (?lin :: Linearly) => Int %1 -> a -> Array a
fromList :: (?lin :: Linearly) => [a] %1 -> Array a
-- etc.
Which then could be used as follows:
example list = linearly (\() ->
list
& Array.fromList
& Array.arrOMap (+40)
& Array.toList
)
This is not a poor man's implementation of 'linear constraints' as the implicit parameter can be happily used many times. Incidentally that is exactly why it significantly improves on the 'manual linear token producing/consuming' that was discussed in most of this thread.
I think this is sound. In other words:
I believe the core problem is 'how do we prevent a function that is safe only when called in a linear context from being called in a non-linear context'.
And that what I've presented here is an adequate solution to this problem (which side-steps the n*m
linear constructors issue).
EDIT: On closer inspection, since the implicit parameter can be used multiple times GHC does not consider the result of e.g. fromList :: (?lin :: Linearly) => [a] %1 -> Array a
to be linear. 😞
We could just make Linearly
be a regular (Dupable
but not Movable
) data type. But then we'd have to manage duplication manually, which is a little bit of a pain. If we want to avoid the pain of duping ourselves, then we have to teach the compiler to dupe implicitly, which is what the dupable linear constraint story is trying to achieve. I'd be (happily!) surprised if there was a solution that doesn't involve teaching this old compiler new tricks.
[because alliterative titles are always the best titles]
Edit: I now have a better solution outlined in https://github.com/tweag/linear-base/issues/130#issuecomment-1193835221
This is something which has been at the back of my mind for a while. And I'm not exactly sure how to approach it.
Here is the problem: when creating some piece of data which needs to be unique (either for resource handling of or mutation or what have you) there are actually many ways to create such a piece of data.
In linear-base, we use two different ones:
withThingy :: (Thingy #-> Unrestricted a) #-> Unrestricted a
newThingy :: M Thingy
run
function of the monad, if theThingy
is to stay uniqueIO
monad, and its cousin theRIO
monad.These are not completely independent ways to create a unique thingy, since
withThingy
can be turned into the monadic style with the continuation monad. But they are not the same either, if only becauseM
need not be a continuation monad.Sometimes one of this styles is required and the other one doesn't really work. But sometimes you may want both. Maybe I can create a linear mutable array with the pure interface, but also with
IO
. Note how neither is an instance of the other. So I'd basically need two distinct creation functions. Meh…But, more importantly, there are other ways to create unique thingy. For instance, if I have any guaranteed unique piece of data, I can create a new linear mutable array, and it will remain unique. And so there starts to be a bit of an
n*m
problem here. Where each type of unique pieces of data need many creation functions. That's unpleasant, to say the least.So I've been thinking that data should be able to carry some kind of token establishing that they are unique. I've been calling them
Source
in my head (we can arrange it so thatSource
is 0-width data, so that they are mostly free).The rest is a bit hazy in my head, still. But I imagine something like the following: we have a number of ways to build uniqueness sources
Moreover sources would be
Dupable
, I assume (which means that they don't guarantee uniqueness, but I think that it is unavoidable).Then we would have
(and maybe
withThingy
andnewThingy
as convenient shorthands for the appropriate composition)To be able to create a thingy out of, say, another thingy, without creating a new scope, we could use the following
Implementing this is what requires
Source
to beDupable
(this, and whatever consume or freeze function that we would have on the type, though these only requireConsumable
).By convention, unique things would always have at least a
fromSource
and aproduceSource
.Something along these lines anyway. Thoughts, further ideas, counterproposals? I want them all!