Closed facundominguez closed 4 years ago
To clarify my comment about error messages above: errors that say "expected X, got Y", are easier to deal with than a mess of "couldn't deduce A from context C". Regarding overlapping instances specifically, the problem with them is that they're only usable given some closed world assumption. If overlap is allowed, code that type checks now could stop type checking tomorrow because someone added a new instance. Even if we assume a closed world of instances, the fact that a tuple of arguments larger any tuples that has a specific instance would silently be treated like the single argument case (depending on what Coercible
instances are in scope), is problematic.
cc @goldfirere
It is often possible to avoid overlapping instances with the use of a closed type family, which would discern among the different cases and return something the type-class machinery could case on. I don't know of a well-written example of this trick, but it's in the folklore and I'm happy to expand on this if you're not familiar with it. And perhaps some cleverness with TypeError
and such can improve type errors. However, I think you'll always have the problem of "too big tuples".
Unfortunately, I'm lost for context here, and so I can't apply my general comments here to your specific case.
@goldfirere #135 proposes an alternative, namely to use a simple printf-like GADT. Except that in this case the format specifier is particularly simple: just a constructor specifying the number of arguments we intend to pass in. So you can have e.g.
call obj "getFoo" With2Args True 42
Here, True
and 42
are arbitrary arguments to some binary method getFoo
. But perhaps a closed type family would work just as well and avoid the somewhat redundant With2Args
specifier. IOW, the question is, if you had to implement today a degenerate printf
where the format string always says "take any given arguments and concatenate the result of calling show
on each", how would you do that? Would you still use a GADT like in Xi et al's original 2003 paper, or would you now use a closed type family?
Wouldn't the simplest fix here be
+ newtype Single x = Single x
+
-instance {-# OVERLAPPABLE #-} Coercible x => JNIArguments x where
+instance Coercible x => JNIArguments (Single x) where
sings _ = [SomeSing (sing :: Sing (Ty x))]
- jvalues x = [coerce x]
+ jvalues (Single x) = [coerce x]
?
It doesn't fix the need to put the arguments in a tuple, but I don't think that
call obj "getFoo" With2Args True 42
is easier to write and read than
call obj "getFoo" (True, 42)
Right, but what is Single
and a family of tuples, if not variants of an algebraic datatype? In the latter case, the set of possible constructors is closed, which indeed is what we want here. Making this a datatype furthermore improves the error messages for the reasons given above. So we end up in pretty much the same place as #135. The only remaining choices in the design space are whether to have
call :: ... -> Arity r -> r
or
data Args where
Args1 :: Coercible x1 => x1 -> Args
Args2 :: (Coercible x1, Coercible x2) => x1 -> x2 -> Args
call :: ... -> Args -> IO b
i.e. whether to dissociate the constructor from the arguments or not (more on that later).
Making this a datatype furthermore improves the error messages for the reasons given above.
I confess that I don't follow what error messages we are considering, once overlapping instances are avoided as above.
"Could not deduce instance of JNIArguments Int
"? I admit that improving that could be difficult. I didn't feel it would be so bad to need improvement, or not enough bad to sacrifice the tuple syntax at least.
This is not going to work:
data Args where
Args1 :: Coercible x1 => x1 -> Args
Args2 :: (Coercible x1, Coercible x2) => x1 -> x2 -> Args
The value of type Args
needs to be inspected in order to build the MethodID, and one such value is constructed for every invocation. Looks like the arity specification needs to be kept separate.
Looks like the arity specification needs to be kept separate.
Nah, scratch that. I think it can be done. Args would be used only in the user facing interface. The internal calls can do without it. It would depend on GHC optimizations, though, whether it can separate the arguments from the arity spec itself.
Looks like the arity specification needs to be kept separate [for performance reasons].
Indeed. Leaving us with #135 as the only solution that satisfies all requirements: good error messages, good performance. As you point out there, however, #135 is not complete. We need to push further down in the call graph the separation between type singletons and arguments. I will work on that this evening.
It looks like we know the type will end in IO blah
. Then we don't need these techniques. How about something like this:
class Coercible a
instance Coercible ()
instance Coercible Int
instance Coercible Bool
instance Coercible Char
data Ty a
type family J a
type instance J (Ty a) = a
class IsReferenceType a
instance IsReferenceType (Ty Ref)
class CallResult r where
instance Coercible b => CallResult (IO b) where
instance (Coercible a1, Coercible b) => CallResult (a1 -> IO b) where
instance (Coercible a1, Coercible a2, Coercible b) => CallResult (a1 -> a2 -> IO b) where
call :: (IsReferenceType (Ty a), Coerce.Coercible a (J (Ty a)), Coercible a, CallResult r)
=> a -> String -> r
call = undefined
data Ref
instance Coercible Ref
newRef :: IO Ref
newRef = undefined
test :: IO ()
test = do
ref <- newRef
() <- call ref "nullary" -- need the () <- so that GHC knows the result is Coercible
_ :: Int <- call ref "nullary returning int"
() <- call ref "unary" True
() <- call ref "binary" 'x' (5 :: Int)
return ()
No modern features! I'm worried I'm missing a key aspect of the design here, though.
Thanks, Richard. That could work for the Unsafe interface, but the Safe interface is using an abstract monad m
that has an instance of MonadIO
.
There are also the concerns about error messages involving CallResult being less pleasant than using GADTs.
Yes, generalizing over the monad is troublesome. But I think the error messages will be OK. The only problem would be if the programmer uses call
in a context that doesn't expect an IO something
.
I will think about generalizing over the monad.
Here's another stab:
class Coercible a
instance Coercible ()
instance Coercible Int
instance Coercible Bool
instance Coercible Char
data Ty a
type family J a
type instance J (Ty a) = a
class IsReferenceType a
instance IsReferenceType (Ty Ref)
data Nat = Zero | Succ Nat
type family NumArgs r where
NumArgs (a -> b) = Succ (NumArgs b)
NumArgs other = Zero
class CallResult_ num_args r where
instance Coercible b => CallResult_ Zero (m b) where
instance (Coercible a1, Coercible b) => CallResult_ (Succ Zero) (a1 -> m b) where
instance (Coercible a1, Coercible a2, Coercible b) => CallResult_ (Succ (Succ Zero)) (a1 -> a2 -> m b) where
type CallResult r = CallResult_ (NumArgs r) r
call :: (IsReferenceType (Ty a), Coerce.Coercible a (J (Ty a)), Coercible a, CallResult r)
=> a -> String -> r
call = undefined
data Ref
instance Coercible Ref
newRef :: IO Ref
newRef = undefined
test :: IO ()
test = do
ref <- newRef
() <- call ref "nullary" -- need the () <- so that GHC knows the result is Coercible
_ :: Int <- call ref "nullary returning int"
() <- call ref "unary" True
() <- call ref "binary" 'x' (5 :: Int)
return ()
The result monad is generalized over, but I needed to use a closed type family. I'm not sure about the quality of error messages here. Provided the call
appears after a <-
in a do
block, I think it would be OK.
We are going to try an approach as described in #137.
We introduce here a type class that provides the singletons of arguments as computed from their types but not their values. This makes possible caching the MethodID for multiple invocations.