isovector / type-sets

type level sets
BSD 3-Clause "New" or "Revised" License
67 stars 9 forks source link

Weird `Insert` behavior, type checker doesn't simplify all the way #13

Closed timoffex closed 3 years ago

timoffex commented 3 years ago

Consider this code defining a "heterogeneous set" using TypeSet (like a heterogeneous list but that guarantees that each type appears exactly once and that offers a more efficient way to get the value of a desired type):

data HSet (s :: TypeSet *) where
  HEmpty :: HSet (Empty)
  HBranch :: a -> HSet b1 -> HSet b2 -> HSet (Branch a b1 b2)

hempty :: HSet Empty
hempty = HEmpty

hinsert :: forall a s.
           (Insertable (CmpRoot a s) a s)
        => a -> HSet s -> HSet (Insert a s)
hinsert a hset = insert (Proxy :: Proxy (CmpRoot a s)) a hset

type family CmpRoot (a :: k) (s :: TypeSet k) :: Maybe Ordering where
  CmpRoot a Empty            = Nothing
  CmpRoot a (Branch x b1 b2) = Just (CmpType a x)

class Insertable cmp a s where
  insert :: Proxy cmp -> a -> HSet s -> HSet (Insert a s)

instance Insertable Nothing a Empty where
  insert _ a HEmpty = HBranch a HEmpty HEmpty

instance Insertable (Just EQ) a (Branch a b1 b2) where
  insert _ a hset = hset

instance forall a x b1 b2.
         ( Insertable (CmpRoot a b1) a b1
         , CmpType a x ~ LT )
         => Insertable (Just LT) a (Branch x b1 b2) where
  insert _ a (HBranch x b1 b2) =
    HBranch x (insert (Proxy :: Proxy (CmpRoot a b1)) a b1) b2

instance forall a x b1 b2.
         ( Insertable (CmpRoot a b2) a b2
         , CmpType a x ~ GT )
         => Insertable (Just GT) a (Branch x b1 b2) where
  insert _ a (HBranch x b1 b2) =
    HBranch x b1 (insert (Proxy :: Proxy (CmpRoot a b2)) a b2

I wrote this while debugging another problem where I get a "CmpType is not enabled" error to prove to myself that it is in fact enabled. And then I ran into this:

mySet1 :: HSet (Insert Int (Insert Float Empty))

-- Compiles
mySet1 = hinsert (1::Int) $ hinsert (1.3::Float) hempty

-- Doesn't compile!!
{-
    • Couldn't match type ‘'Empty’ with ‘'Branch Int 'Empty 'Empty’
      Expected type: HSet (Insert Int (Insert Float 'Empty))
        Actual type: HSet (Insert Float ('Branch Int 'Empty 'Empty))
    • In the expression:
        hinsert (1.3 :: Float) $ hinsert (1 :: Int) hempty
      In an equation for ‘mySet1’:
          mySet1 = hinsert (1.3 :: Float) $ hinsert (1 :: Int) hempty
    |
166 | mySet1 = hinsert (1.3::Float) $ hinsert (1::Int) hempty
-}
mySet1 = hinsert (1.3::Float) $ hinsert (1::Int) hempty

Somehow the type checker can't prove that HSet (Insert Int (Insert Float 'Empty)) is the same as HSet (Insert Float (Branch Int Empty Empty)). It looks like it got Branch Int Empty Empty by simplifying Insert Int Empty but then stopped short of simplifying the Insert Float ... part. I spent some time digging through cmptype, magic-tyfams and GHC, but can't figure out why and will appreciate any help! (I'm completely new to GHC plugins. Hopefully the answer to this helps solve a different issue where a TypeSet expression isn't being reduced)

isovector commented 3 years ago

Weird! It's late here, but I'll see if I can track down what's going on tomorrow. I haven't thought about this library in a year or so, so hopefully I can work it out again myself :)

isovector commented 3 years ago

Oh, I think I see the issue. TypeSet is a binary search tree, so building it backwards will obviously result in a different tree. That is to say, TypeSets support membership, but not really identity. It's fair to call this a bug, though I don't know if it will get fixed by my own volition.

It's interesting to check how Data.Set does its equality check --- which is not particularly clever: https://hackage.haskell.org/package/containers-0.6.4.1/docs/src/Data.Set.Internal.html#line-1212. We could implement the equivalent check as a type family in type-sets of the kind TypeSet a -> TypeSet a -> Bool, but that wouldn't be very useful in this context.

Alternatively we could add this directly into the typechecker plugin, solving constraints of the form Insert a (Insert b s) ~ Insert b (Insert a s), though I'm pretty sure this could be used to prove String ~ Bool or something equally silly. No good.

What do you think is the right move here? What problem are you trying to solve?

timoffex commented 3 years ago

Thanks for the quick response! It completely skipped my mind that one gets a different BST when building it in a different order...

I think this doesn't solve the other problem I was seeing, which involves much weirder code:

type family CombineConstraints (cs :: TypeSet Constraint) :: Constraint where
  CombineConstraints Empty = ()
  CombineConstraints (Branch c b1 b2) = 
    (c1, CombineConstraints b1, CombineConstraints b2)

-- Compiles
constraint1 :: CombineConstraints
               (Branch (MonadTrans t)
                (Branch (Monad m) Empty Empty) Empty)
            => m a -> t m a
constraint1 = lift

-- Doesn't compile
constraint2 :: CombineConstraints
               (Insert (MonadTrans t)
                (Insert (Monad m) Empty))
            => m a -> t m a
constraint2 = lift
{-
    • Could not deduce (MonadTrans t) arising from a use of ‘lift’
      from the context: CombineConstraints
                          (Insert (MonadTrans t) (Insert (Monad m) 'Empty))
        bound by the type signature for:
                   constraint2 :: forall (t :: (* -> *) -> * -> *) (m :: * -> *) a.
                                  CombineConstraints
                                    (Insert (MonadTrans t) (Insert (Monad m) 'Empty))
 =>                                                                                 
                                  m a -> t m a
-}

Maybe this also has a simple explanation that I'm not seeing, but it looks like no matter what BST is built by (Insert (MonadTrans t) (Insert (Monad m) 'Empty)), CombineConstraints (Insert (MonadTrans t) (Insert (Monad m) 'Empty)) should give me some nested tuple that contains MonadTrans t and Monad m.


I started writing out an explanation for how I got to this weird TypeSet-in-function-context situation, but the explanation became overwhelmingly long, so I extracted it to a gist: https://gist.github.com/timoffex/24cb7a38b3aab85eea2eac1886014963. It's not super important—I still want to understand what's wrong with the code above, but I guess it's good to type out my original motivation to avoid the XY problem.

I'm going to spend some more time thinking about whether I need TypeSet equality or what would be helpful for my use case. I don't yet know what is the right approach.

isovector commented 3 years ago

This one is standard type family stuckness. Because t and m are type variables, GHC waffles on solving the Insert constraint. In this case it would probably be fine, because the kinds are apart, but imagine Insert (Monad m) (Insert (Monad n) Empty). When m ~ n, this is a type set with only one element, but when m /~ n, it has two. As long as they're variables, GHC doesn't know which one to pick, and so it holds off, waiting for the type variables to be instantiated.

I think you can work around this by implementing your desired constraint as a typeclass.

timoffex commented 3 years ago

I learn new things about the type checker every day. This behavior makes sense. Thanks for explaining!

I’ll consider the type class approach. I also realized that I could use a sorted type list to get a type “set” with equality, which might be good enough for my purposes.

In terms of implementing equality for TypeSet, I think a necessary condition is to keep its constructors private. For example, the structure of an HSet depends on the structure of its TypeSet, so if we were to declare two TypeSets equal based only on contents, then we could write

hset :: X -> Y -> HSet (Branch X Empty (Branch Y Empty Empty))
hset x y = HBranch x HEmpty (HBranch y HEmpty HEmpty)

getXY :: HSet (Branch Y (Branch X Empty Empty) Empty) -> (X, Y)
getXY (HBranch y (HBranch x _ _) _) = (x, y)

-- Typechecks but the pattern match is wrong
getXY (hset x y)

Maybe one could define TypeSet with private constructors and with some helper type families like ToAscList :: TypeSet k -> [k]. I'm not sure whether HSet would still be possible to define though.

I'll close this issue because you answered both of my questions. I'll think more about this if I come up with a use-case for TypeSet equality that works without pattern-matching on its constructors. Thanks for the help!

(also, I basically learned Haskell throughout 2019 from your Reasonably Polymorphic blog, so thanks a lot for creating it!)

isovector commented 3 years ago

I’ll consider the type class approach. I also realized that I could use a sorted type list to get a type “set” with equality, which might be good enough for my purposes.

That sounds like the more straightforward approach, given what you've said about your problem. Sorry this weird project of mine took you for a few days' ride!

(also, I basically learned Haskell throughout 2019 from your Reasonably Polymorphic blog, so thanks a lot for creating it!)

Comments like these make the whole blog worthwhile <3