compiling-to-categories / concat

Compiling to Categories
http://conal.net/papers/compiling-to-categories
BSD 3-Clause "New" or "Revised" License
431 stars 49 forks source link

Simple, dependable recipe for client modules #34

Open conal opened 6 years ago

conal commented 6 years ago

For a concat client module to compile, GHC must be able to see some rewrite rules and some newtype constructors. I usually include the following import lines:

import qualified GHC.Generics -- needed to avoid coercion holes (sorry)
import ConCat.Rebox  ()       -- rewrite rules
import ConCat.AltCat ()       -- rewrite rules

We need the GHC.Generics import purely due to how the concat plugin translates coercions (into coerceC from the CoerceCat class), together with the following restriction:

You can unwrap a newtype using coerce only if the corresponding constructor is in scope.

Without the import of GHC.Generics, the plugin will sometimes result in errors like the following:

ghc: panic! (the 'impossible' happened)
  (GHC version 8.0.2 for x86_64-apple-darwin):
        ccc - couldn't build dictionary for
  coerceC    
    @ (GD (Dual (-+>)))
    @ ((:.:) (V 10) (Bump (V 4)) Double)
    @ (V 10 (Bump (V 4) R)) :: CoerceCat
                                 (GD (Dual (-+>)))
                                 ((:.:) (V 10) (Bump (V 4)) Double)
                                 (V 10 (Bump (V 4) R)) =>
                               GD
                                 (Dual (-+>))
                                 ((:.:) (V 10) (Bump (V 4)) Double)
                                 (V 10 (Bump (V 4) R)):
  coercion holes:  [$dCoercible_aTpt
                      :: Coercible
                           (Vector Vector 10 (Bump (V 4) R))
                           ((:.:) (V 10) (Bump (V 4)) Double)
                    [LclId, Str=DmdType]
                    $dCoercible_aTpt =
                      MkCoercible
                        @ *
                        @ (Vector Vector 10 (Bump (V 4) R))
                        @ ((:.:) (V 10) (Bump (V 4)) Double)
                        @~ (U(hole:{aTpu}, Vector Vector 10 (Bump (V 4) R), (:.:)
                                                                              (V 10)
                                                                              (Bump (V 4))
                                                                              Double)_R
                            :: (Vector Vector 10 (Bump (V 4) R) :: *)
                               ~R#
                               ((:.:) (V 10) (Bump (V 4)) Double :: *)),
                    $dCoercible_aTpi
                      :: Coercible
                           ((:.:) (V 10) (Bump (V 4)) Double)
                           (Vector Vector 10 (Bump (V 4) R))
                    [LclId, Str=DmdType]
                    $dCoercible_aTpi =
                      MkCoercible
                        @ *
                        @ ((:.:) (V 10) (Bump (V 4)) Double)
                        @ (Vector Vector 10 (Bump (V 4) R))
                        @~ (U(hole:{aTpj}, (:.:) (V 10) (Bump (V 4)) Double, Vector
                                                                               Vector
                                                                               10
                                                                               (Bump (V 4) R))_R
                            :: ((:.:) (V 10) (Bump (V 4)) Double :: *)
                               ~R#
                               (Vector Vector 10 (Bump (V 4) R) :: *))]

The failure here results from the (:.:) constructor (Comp1) not being in scope. A solution is to add "import qualified GHC.Generics".

This obscure issue keeps biting users. Maybe we could have a single import line for all concat users, like "import qualified ConCat", in addition to (probably unqualified) imports for operations actually used in client code.

I just added a ConCat module in plugin/:

{-# OPTIONS_GHC -Wall #-}

-- | For importing qualfied by ConCat clients. Enables some rewrite rules and
-- Coercible instances. Unfortunately, doing so leads to warnings of unused
-- imports in the client module. How to resolve (without -Wno-unused-imports)?

module ConCat (module GHC.Generics) where

import GHC.Generics     -- needed to avoid coercion holes (sorry)

import ConCat.Rebox  () -- rewrite rules
import ConCat.AltCat () -- rewrite rules

Tried in a sample concat program (replacing those three imports). It works but generates a warning:

    The qualified import of ‘ConCat’ is redundant
      except perhaps to import instances from ‘ConCat’
    To import instances alone, use: import ConCat()

If I add the "()", then the coercion hole error returns. Makes sense, though not what we want. Hm.

A possible solution might be to change the coerceC method definition in CoerceCat (->) and other CoerceCat instances. Currently,

class CoerceCat k a b where
  coerceC :: a `k` b

instance Coercible a b => CoerceCat (->) a b where
  coerceC = coerce

When synthesizing a use of coerceC, the plugin is already looking at a Core coercion between a and b, which it discards. It tries to synthesize a dictionary for CoerceCat k a b (for target category k), which sometimes leads to a coercion hole error. Could we instead use the coercion at hand? It's unclear to me how, since the general coerceC method doesn't involve Coercible. (Could it? Some of the instances are tricky due to coercion roles.)

Better yet, I'd love to redesign the handling of casts/coercions, eliminating CoerceCat altogether. My original intent was to transform Core cast expressions into combinations of existing categorical operations, using RepCat for the newtype axioms and (.) for the coercion transitivity.

sellout commented 3 years ago

So, I think I have a (terrible) workaround for this. Remove Coercible entirely from the instance.

instance CoerceCat (->) a b where
  coerceC = unsafeCoerce

coerceC is only ever used to convert a case where we know we have a coercion, so while this instance is super-dangerous, it should be safe within the plugin. And it makes the user experience much better, until CoerceCat is replaced with RepCat.

sellout commented 3 years ago

@conal, do you have an opinion on this? I'm considering applying it to our fork, so if you see any problems with it, it'd be great to know before they bite us down the line.

And I would happily push the change upstream if you think it's acceptable.

The reason to apply it to this particular instance (although perhaps also document on the class that other instances should do it explicitly) is because often other instances are implemented in terms of the instances on (->). like

newtype MyCat a b = MyCat (a -> b)

instance CoerceCat MyCat where
  coerceC = MyCat coerceC`

so getting it "right" for (->) often ends up doing the right thing for user-defined categories.

conal commented 3 years ago

@sellout Thanks for the reminder about this issue. Offhand, what your suggesting sounds like a heuristic (i.e., undependable) and a brittle one at that.

How about the following alternative? Import ConCat as described above (without ()). Either suppress the redundant import warning (-Wno-unused-imports) or add a definition to export from ConCat that one can use benignly. For example, export ignore :: () from ConCat, and define _ignore = ignore in the importing module. Perhaps there’s an easier way.

sellout commented 3 years ago

We started using a recipe like you propose above, but we were soon adding more re-exports to that module, like Compose (..), Const (..), etc. And then we also have application-specific newtypes, so we have our own module Project.Concat that re-exports ConCat as well as our newtypes, plus a third level where some modules that call toCcc add additional "redundant" imports to bring other newtype constructors into scope.

So, maybe we have some other issue going on, but re-exporting GHC.Generics itself hasn't been enough to keep CoerceCat happy.

I agree that it's not an ideal solution, but part of our focus has been on how to make using the plugin as low-friction as possible, and AFAICT, the current newtype situation is very finicky.

I am very hesitant to use unsafeCoerce, which is why I asked about it first. I'm just struggling to find something else that helps. I managed to delete about 100 loc, almost entirely imports (plus a number of {-# OPTIONS_GHC -Wno-redundant-imports #-}), by trying this on our codebase ... but it's still a WIP, because I really don't want to do it!

conal commented 3 years ago

We started using a recipe like you propose above, but we were soon adding more re-exports to that module, like Compose (..), Const (..), etc. And then we also have application-specific newtypes, so we have our own module Project.Concat that re-exports ConCat as well as our newtypes, plus a third level where some modules that call toCcc add additional “redundant” imports to bring other newtype constructors into scope.

Yipes—what a mess! Let’s find a better way. This whole issue,

You can unwrap a newtype using coerce only if the corresponding constructor is in scope

seems dubious to me. I hope there is some way around this restriction or a way to change the compiling-to-categories plugin so as not to rely on coerce.

sellout commented 3 years ago

Yeah, I do like the path you started down of using RepCat instead and doing a deep translation of the coercion. I know you've mentioned that somewhere other than this ticket -- did you run into some serious blockers, or just a lack of time/motivation to follow through on it for the time being? That approach seems very attractive if it can work.

conal commented 3 years ago

I preferred that direction as well. I remember getting exhausted with the effort rather than encountering a definitive blocker. We could try again, working through the puzzle together and asking for expert help if we get stuck.

sellout commented 3 years ago

You are the expert help!

So, let me try to think through this a bit ... first, we'll need to use abst/repr to convert between two types that have the same in-memory representation. That doesn't mean they have the same ConCat.Rep.Rep representation, but we can make a reasonable assumption that they do, expecting users to derive HasRep via Generics or whatever?

Given that those pieces align, a Coercion from a to b needs to compose abst :: Rep a -> a and repr :: b -> Rep b, where Rep a ~ Rep b, and even though they have the same in-memory rep, we can't rewrite abst . repr to a NOP at any point, right? We actually need to do the conversion?

And finally, we need to map the Coercion into the actual operation we want to apply, so nominal coercions (like Refl) are basically id, where representational ones are abst . repr? With TransCo being (.) (which can rewrite the middle composition, like abst . repr . abst . repr -> abst . id . repr -> abst . repr)? And SymCo swapping the direction, NthCo doing a projection, etc.? There are obviously a ton of other cases, but I think that handful at least makes sense to me.

Do you have any remnants of your earlier attempt? I think a lot of these Coercion cases are going to have me scratching my head.

sellout commented 3 years ago

I guess I'm digging in the right place, because I just found this comment in GHC:

Conal Elliott also came across a need for this function while working with the
GHC API, as he was decomposing Core casts. The Core casts use representational
coercions, as they must, but his use case required nominal coercions (he was
building a GADT). So, that's why this function is exported from this module.
conal commented 3 years ago

… first, we’ll need to use abst/repr to convert between two types that have the same in-memory representation. That doesn’t mean they have the same ConCat.Rep.Rep representation, but we can make a reasonable assumption that they do, expecting users to derive HasRep via Generics or whatever?

Hm. Given the large amount of code the CtoC plugin takes from libraries not written with CtoC in mind, I doubt that this assumption is a safe one.

Given that those pieces align, a Coercion from a to b needs to compose abst :: Rep a -> a and repr :: b -> Rep b, where Rep a ~ Rep b, and even though they have the same in-memory rep, we can’t rewrite abst . repr to a NOP at any point, right?

By a “NOP”, if you mean id, then no, since the types don’t work out. The coerce function is an operational no-op that bridges the type difference.

We actually need to do the conversion?

Not at run-time, since the generated code will hopefully eventually become coercions again.

And finally, we need to map the Coercion into the actual operation we want to apply, so nominal coercions (like Refl) are basically id, where representational ones are abst . repr? With TransCo being (.) (which can rewrite the middle composition, like abst . repr . abst . repr -> abst . id . repr -> abst . repr)? And SymCo swapping the direction, NthCo doing a projection, etc.? There are obviously a ton of other cases, but I think that handful at least makes sense to me.

IIRC, a tricky aspect of translating coercions is that a lot of simplification has already been done by GHC. The whole complex construction was spelled out at one point, and then GHC helpfully simplifies down to what the plugin gets to see. Reversing that simplification was harder than I expected.

Do you have any remnants of your earlier attempt? I think a lot of these Coercion cases are going to have me scratching my head.

The git repo contains my attempts. Searching my notes, the last mention of TransCo I see is 2017-01-01, and the first mention of CoerceCat is 2016-12-25.

I wonder if it would be easier to get GHC tweaked so that our coercions worked without the explicit imports. The current policy was probably designed without considering core plugins.

nomeata commented 2 years ago

I wonder if it would be easier to get GHC tweaked so that our coercions worked without the explicit imports. The current policy was probably designed without considering core plugins.

Absolutely, this was my first thought as well. That policy is one for Haskell source, and doesn’t apply to Core; you are writing a Core plugin. I think a patch to GHC that adds a dflag to enable a “look-through-all-new-types” mode would be acceptable.