clash-lang / clash-compiler

Haskell to VHDL/Verilog/SystemVerilog compiler
https://clash-lang.org/
Other
1.42k stars 150 forks source link

Clash inliner just can't even with this "benignly recursive" code #1611

Open gergoerdi opened 3 years ago

gergoerdi commented 3 years ago

While being blocked on #1536, I started working on a new design for my memory mapper, based on the idea of tracking backpane signals separately. The new version is... somewhat complicated in its API since it uses a graded monad to track the backpane connections. But, it has no knot-tying-like shenanigans going on -- the only recursion is structurally into the Addressing tree, so it should be very nicely bounded for small memory layout descriptions.

However, when I try using it, it still seems that Clash isn't able to synthesize it:

Clash.Normalize.Transformations(432): InlineNonRep: RetroClash.Memory2.$fMonoidFanIn_z[8214565720323925287] already inlined 40000 times in:Hardware.TinyBASIC.Intel8080.logicBoard[4876]
Type of the subject is: Clash.Signal.Internal.Signal[8214565720323789676]
  "Native"
  (Data.Monoid.First[8214565720323790477]
     a[8214565720324055698])
Function Hardware.TinyBASIC.Intel8080.logicBoard[4876] will not reach a normal form, and compilation might fail.
Run with '-fclash-inline-limit=N' to increase the inlining limit to N.

Full code is at https://github.com/gergoerdi/clash-tinybasic/tree/memory-map-backpane.

christiaanb commented 3 years ago

Thanks for the report. At the moment we don't have the resources to minimize the example, and this bug is unlikely to get fixed until there is a minimal(ized) example.

gergoerdi commented 3 years ago

Sure, I'm already working on minimizing it.

gergoerdi commented 3 years ago

Oh ho ho, turns out the minimized test for https://github.com/clash-lang/clash-compiler/issues/1536#issuecomment-740544378 actually is this in disguise: running with -fclash-debug DebugSilent and -fclash-inline-limit=1000, we can see that the normalizer chokes on this input, even though, conceptually, there isn't really that deep a recursion going on (since the runAddressing block is fairly small):

Clash.Normalize.Transformations(432): InlineNonRep: RetroClash.Memory.firstIn[8214565720324031151] already inlined 1000 times in:Board.logicBoard[398]
Type of the subject is: Clash.Signal.Internal.Signal[8214565720323789676]
  "System"
  (GHC.Maybe.Maybe[3674937295934324792]
     a[6989586621679260264])
Function Board.logicBoard[398] will not reach a normal form, and compilation might fail.
Run with '-fclash-inline-limit=N' to increase the inlining limit to N.

Full code is at https://github.com/gergoerdi/clash-issue-1611/tree/clash-issue-1611

gergoerdi commented 3 years ago

Here's a fully self-contained way of reproducing the inliner choking. This version runs for 10+ minutes with an inliner limit of 1000, before finally throwing in the towel with

Clash.Normalize.Transformations(432): InlineNonRep: c$Board.board_ds3[1990] already inlined 1000 times in:Board.board[388]
Type of the subject is: Data.Functor.Identity.Identity[8214565720323807699]
  (GHC.Tuple.(,,)[3746994889972252678]
     (GHC.Tuple.(,)[3746994889972252676]
        (Board.Susps[8214565720323976633]
           "System"
           (Clash.Sized.Internal.Unsigned.Unsigned[8214565720323789711]
              8)
           (GHC.Types.[][3891110078048108577]
              (GHC.Prim.TYPE[3674937295934324912]
                 GHC.Types.LiftedRep[3891110078048108766])))
        (Board.Component[8214565720323976659]
           (Clash.Sized.Internal.Index.Index[8214565720323789709]
              2048)))
     GHC.Types.Int[3674937295934324766]
     (Board.AddrMap[8214565720323976653] "System"))
{-# LANGUAGE DataKinds, GADTs, PolyKinds, KindSignatures, TypeOperators #-}
{-# LANGUAGE RebindableSyntax, LambdaCase #-}
{-# LANGUAGE GeneralizedNewtypeDeriving, DerivingStrategies #-}
module Board where

import Clash.Prelude
import Clash.Annotations.TH

import Data.Kind
import Data.Singletons.Prelude.List (type (++))
import Control.Monad.RWS hiding (Product)

import Data.Maybe
import Data.Kind
import Data.Dependent.Map as DMap
import Data.Dependent.Sum as DSum
import Data.GADT.Compare
import Type.Reflection

data Component (addr :: Type) = Component (TypeRep addr) Int

instance GEq Component where
    geq (Component a _) (Component b _) = geq a b

instance GCompare Component where
    gcompare (Component a _) (Component b _) = gcompare a b

newtype FanIn dom a = FanIn{ getFanIn :: Signal dom `Ap` First a }
    deriving newtype (Semigroup, Monoid)

newtype AddrMap dom = AddrMap{ addrMap :: DMap Component (FanIn dom) }
    deriving newtype (Monoid)

instance Semigroup (AddrMap dom) where
    AddrMap map1 <> AddrMap map2 = AddrMap $ unionWithKey (const mappend) map1 map2

newtype Susp0 dom dat addr = Susp0{ unSusp0 :: Signal dom (Maybe addr) -> Signal dom (Maybe dat) }
newtype Susp1 w dom dat addr = Susp1{ unSusp1 :: Signal dom (Maybe addr) -> (Signal dom (Maybe dat), w) }

type Addressing0 dom addr dat = RWS
    (Signal dom (Maybe dat))
    (AddrMap dom)
    Int

data Addressing dom addr dat (ts :: [Type]) a where
    Return :: a -> Addressing dom addr dat '[] a
    Bind :: Addressing dom addr dat ts a -> (a -> Addressing dom addr dat us b) -> Addressing dom addr dat (ts ++ us) b

    Fresh :: (Typeable addr') => Addressing dom addr dat '[] (Component addr')
    WR :: Addressing dom addr dat '[] (Signal dom (Maybe dat))
    Match :: (addr -> Maybe addr') -> Addressing dom addr' dat ts a -> Addressing dom addr dat ts a
    Connect :: Component addr -> Addressing dom addr dat '[] ()
    Tell0 :: DSum Component (Susp0 dom dat) -> Addressing dom addr dat '[] ()
    Tell1 :: DSum Component (Susp1 w dom dat) -> Addressing dom addr dat '[w] ()

data Susps dom dat (ts :: [Type]) where
    NilS :: Susps dom dat '[]
    Cons0 :: DSum Component (Susp0 dom dat) -> Susps dom dat ts -> Susps dom dat ts
    Cons1 :: DSum Component (Susp1 t dom dat) -> Susps dom dat ts -> Susps dom dat (t:ts)

data Results (ts :: [Type]) where
    NilR :: Results '[]
    ConsR :: a -> Results ts -> Results (a : ts)

concatS :: Susps dom dat ts -> Susps dom dat us -> Susps dom dat (ts ++ us)
concatS = \case
    NilS -> id
    Cons0 x xs -> Cons0 x . concatS xs
    Cons1 x xs -> Cons1 x . concatS xs

runAddressing1
    :: (HiddenClockResetEnable dom)
    => Signal dom (Maybe addr)
    -> Addressing dom addr dat ts a
    -> Addressing0 dom addr dat
          ( Susps dom dat ts
          , a
          )
runAddressing1 addr = \case
    Return x -> nilS $ return x
    Bind m n -> do
        (rd1, x) <- runAddressing1 addr m
        (rd2, y) <- runAddressing1 addr $ n x
        return (concatS rd1 rd2, y)
    Fresh -> nilS $ Component typeRep <$> get <* modify succ
    WR -> nilS ask
    Match f body -> runAddressing1 (fmap (f =<<) addr) body
    Connect handle -> nilS $ do
        tell $ AddrMap $ DMap.singleton handle $ fanInMaybe addr -- TODO: block later connections
    Tell0 comp -> do
        return (Cons0 comp NilS, ())
    Tell1 comp -> do
        return (Cons1 comp NilS, ())
  where
    nilS act = do
        x <- act
        return (NilS, x)

runAddressing
    :: (HiddenClockResetEnable dom)
    => Signal dom (Maybe addr)
    -> Signal dom (Maybe dat)
    -> Addressing dom addr dat ts a
    -> ( Signal dom (Maybe dat)
       , a
       , Results ts
       )
runAddressing addr wr body = (join <$> firstIn rd, x, xs)
  where
    ((susps, x), conns) = evalRWS (runAddressing1 addr body) wr 0
    (rd, xs) = toRead susps (addrMap conns)

toRead
    :: forall dom dat ts. (HiddenClockResetEnable dom)
    => Susps dom dat ts
    -> DMap Component (FanIn dom)
    -> (FanIn dom (Maybe dat), Results ts)
toRead susps conns = go susps
  where
    go :: Susps dom dat us -> (FanIn dom (Maybe dat), Results us)
    go = \case
        NilS -> (mempty, NilR)
        Cons0 (h :=> (Susp0 mk)) ss ->
            let (rd', xs) = go ss
            in (mappend rd rd', xs)
          where
            addr = fromMaybe mempty $ DMap.lookup h conns
            rd0 = mk (firstIn addr)
            rd = gated (delay False $ isJust <$> firstIn addr) $ fanIn rd0
        Cons1 (h :=> (Susp1 mk)) ss ->
            let (rd', xs) = go ss
            in (mappend rd rd', ConsR x xs)
          where
            addr = fromMaybe mempty $ DMap.lookup h conns
            (rd0, x) = mk (firstIn addr)
            rd = gated (delay False $ isJust <$> firstIn addr) $ fanIn rd0

readWrite_
    :: (Typeable addr')
    => (Signal dom (Maybe addr') -> Signal dom (Maybe dat) -> (Signal dom (Maybe dat)))
    -> Addressing dom addr dat '[] (Component addr')
readWrite_ mkComponent = do
    handle <- Fresh
    wr <- WR
    Tell0 $ handle :=> Susp0(\addr -> mkComponent addr wr)
    return handle
  where
    return = Return
    (>>=) = Bind
    (=<<) = flip (>>=)
    m >> n = Bind m (const n)

romFromFile
    :: (HiddenClockResetEnable dom, 1 <= n, BitPack dat)
    => SNat n
    -> FilePath
    -> Addressing dom addr dat '[] (Component (Index n))
romFromFile size@SNat fileName = readWrite_ $ \addr wr ->
    fmap (Just . unpack) $ romFilePow2 fileName (maybe 0 bitCoerce <$> addr)

ram0
    :: (HiddenClockResetEnable dom, 1 <= n, NFDataX dat, Num dat)
    => SNat n
    -> Addressing dom addr dat '[] (Component (Index n))
ram0 size@SNat = readWrite_ $ \addr wr ->
    fmap Just $ blockRam1 ClearOnReset size 0 (fromMaybe 0 <$> addr) (liftA2 (,) <$> addr <*> wr)

matchAddr
    :: (addr -> Maybe addr')
    -> Addressing dom addr' dat ts a
    -> Addressing dom addr dat ts a
matchAddr = Match

from
    :: forall addr' s dom addr dat ts a. (Integral addr, Ord addr, Integral addr', Bounded addr')
    => addr
    -> Addressing dom addr' dat ts a
    -> Addressing dom addr dat ts a
from base = matchAddr $ \addr -> do
    guard $ addr >= base
    let offset = addr - base
    guard $ offset <= lim
    return $ fromIntegral offset
  where
    lim = fromIntegral (maxBound :: addr')

connect
    :: Component addr
    -> Addressing dom addr dat '[] ()
connect = Connect

fanInMaybe :: Signal dom (Maybe a) -> FanIn dom a
fanInMaybe = FanIn . Ap . fmap First

fanIn :: Signal dom a -> FanIn dom a
fanIn = fanInMaybe . fmap pure

firstIn :: FanIn dom a -> Signal dom (Maybe a)
firstIn = fmap getFirst . getAp . getFanIn

gated :: Signal dom Bool -> FanIn dom a -> FanIn dom a
gated p sig = fanInMaybe $ mux p (firstIn sig) (pure Nothing)

topEntity
    :: "CLK"   ::: Clock System
    -> "RESET" ::: Reset System
    -> "ADDR"  ::: Signal System (Maybe (Unsigned 16))
    -> "OUT"   ::: Signal System (Maybe (Unsigned 8))
topEntity clk rst = withClockResetEnable clk rst enableGen board

board
    :: (HiddenClockResetEnable dom)
    => Signal dom (Maybe (Unsigned 16))
    -> Signal dom (Maybe (Unsigned 8))
board addr = dataIn
  where
    wr = Just <$> cnt
      where
        cnt = register (0 :: Unsigned 8) $ cnt + 1

    (dataIn, (), NilR) = runAddressing addr wr $ do
        rom <- romFromFile (SNat @0x0800) "image.bin"
        ram <- ram0 (SNat @0x1800)

        from 0x0000 $ connect rom
        from 0x0800 $ connect ram
      where
        return = Return
        (>>=) = Bind
        (=<<) = flip (>>=)
        m >> n = m >>= \_ -> n

makeTopEntity 'topEntity
gergoerdi commented 3 years ago

With the whole graded monad thing removed:

{-# LANGUAGE DataKinds, GADTs, PolyKinds, KindSignatures, TypeOperators #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE GeneralizedNewtypeDeriving, DerivingStrategies #-}
module Board where

import Clash.Prelude
import Clash.Annotations.TH

import Data.Kind
import Control.Monad.RWS hiding (Product)

import Data.Maybe
import Data.Kind
import Data.Dependent.Map as DMap
import Data.Dependent.Sum as DSum
import Data.GADT.Compare
import Type.Reflection

data Component (addr :: Type) = Component (TypeRep addr) Int

instance GEq Component where
    geq (Component a _) (Component b _) = geq a b

instance GCompare Component where
    gcompare (Component a _) (Component b _) = gcompare a b

newtype FanIn dom a = FanIn{ getFanIn :: Signal dom `Ap` First a }
    deriving newtype (Semigroup, Monoid)

newtype AddrMap dom = AddrMap{ addrMap :: DMap Component (FanIn dom) }
    deriving newtype (Monoid)

instance Semigroup (AddrMap dom) where
    AddrMap map1 <> AddrMap map2 = AddrMap $ unionWithKey (const mappend) map1 map2

newtype Susp0 dom dat addr = Susp0{ unSusp0 :: Signal dom (Maybe addr) -> Signal dom (Maybe dat) }
newtype Susp1 w dom dat addr = Susp1{ unSusp1 :: Signal dom (Maybe addr) -> (Signal dom (Maybe dat), w) }

type Addressing0 dom addr dat = RWS
    (Signal dom (Maybe dat))
    (AddrMap dom)
    Int

data Addressing dom addr dat a where
    Return :: a -> Addressing dom addr dat a
    Bind :: Addressing dom addr dat a -> (a -> Addressing dom addr dat b) -> Addressing dom addr dat b

    Fresh :: (Typeable addr') => Addressing dom addr dat (Component addr')
    WR :: Addressing dom addr dat (Signal dom (Maybe dat))
    Match :: (addr -> Maybe addr') -> Addressing dom addr' dat a -> Addressing dom addr dat a
    Connect :: Component addr -> Addressing dom addr dat ()
    Tell0 :: DSum Component (Susp0 dom dat) -> Addressing dom addr dat ()

instance Functor (Addressing dom addr dat) where
    fmap f m = m >>= \x -> return (f x)

instance Applicative (Addressing dom addr dat) where
    pure = Return
    ff <*> fx = ff >>= \f -> fmap f fx

instance Monad (Addressing dom addr dat) where
    return = Return
    (>>=) = Bind

type Susps dom dat = [DSum Component (Susp0 dom dat)]

runAddressing1
    :: (HiddenClockResetEnable dom)
    => Signal dom (Maybe addr)
    -> Addressing dom addr dat a
    -> Addressing0 dom addr dat
          ( Susps dom dat
          , a
          )
runAddressing1 addr = \case
    Return x -> nilS $ return x
    Bind m n -> do
        (rd1, x) <- runAddressing1 addr m
        (rd2, y) <- runAddressing1 addr $ n x
        return (rd1 <> rd2, y)
    Fresh -> nilS $ Component typeRep <$> get <* modify succ
    WR -> nilS ask
    Match f body -> runAddressing1 (fmap (f =<<) addr) body
    Connect handle -> nilS $ do
        tell $ AddrMap $ DMap.singleton handle $ fanInMaybe addr -- TODO: block later connections
    Tell0 comp -> do
        return ([comp], ())
  where
    nilS act = do
        x <- act
        return ([], x)

runAddressing
    :: (HiddenClockResetEnable dom)
    => Signal dom (Maybe addr)
    -> Signal dom (Maybe dat)
    -> Addressing dom addr dat a
    -> ( Signal dom (Maybe dat)
       , a
       )
runAddressing addr wr body = (join <$> firstIn rd, x)
  where
    ((susps, x), conns) = evalRWS (runAddressing1 addr body) wr 0
    rd = toRead susps (addrMap conns)

toRead
    :: forall dom dat ts. (HiddenClockResetEnable dom)
    => Susps dom dat
    -> DMap Component (FanIn dom)
    -> FanIn dom (Maybe dat)
toRead susps conns = go susps
  where
    go :: Susps dom dat -> FanIn dom (Maybe dat)
    go = \case
        [] -> mempty
        (h :=> (Susp0 mk)) : ss -> mappend rd $ go ss
          where
            addr = fromMaybe mempty $ DMap.lookup h conns
            rd0 = mk (firstIn addr)
            rd = gated (delay False $ isJust <$> firstIn addr) $ fanIn rd0

readWrite_
    :: (Typeable addr')
    => (Signal dom (Maybe addr') -> Signal dom (Maybe dat) -> (Signal dom (Maybe dat)))
    -> Addressing dom addr dat (Component addr')
readWrite_ mkComponent = do
    handle <- Fresh
    wr <- WR
    Tell0 $ handle :=> Susp0(\addr -> mkComponent addr wr)
    return handle

romFromFile
    :: (HiddenClockResetEnable dom, 1 <= n, BitPack dat)
    => SNat n
    -> FilePath
    -> Addressing dom addr dat (Component (Index n))
romFromFile size@SNat fileName = readWrite_ $ \addr wr ->
    fmap (Just . unpack) $ romFilePow2 fileName (maybe 0 bitCoerce <$> addr)

ram0
    :: (HiddenClockResetEnable dom, 1 <= n, NFDataX dat, Num dat)
    => SNat n
    -> Addressing dom addr dat (Component (Index n))
ram0 size@SNat = readWrite_ $ \addr wr ->
    fmap Just $ blockRam1 ClearOnReset size 0 (fromMaybe 0 <$> addr) (liftA2 (,) <$> addr <*> wr)

matchAddr
    :: (addr -> Maybe addr')
    -> Addressing dom addr' dat a
    -> Addressing dom addr dat a
matchAddr = Match

from
    :: forall addr' s dom addr dat a. (Integral addr, Ord addr, Integral addr', Bounded addr')
    => addr
    -> Addressing dom addr' dat a
    -> Addressing dom addr dat a
from base = matchAddr $ \addr -> do
    guard $ addr >= base
    let offset = addr - base
    guard $ offset <= lim
    return $ fromIntegral offset
  where
    lim = fromIntegral (maxBound :: addr')

connect
    :: Component addr
    -> Addressing dom addr dat ()
connect = Connect

fanInMaybe :: Signal dom (Maybe a) -> FanIn dom a
fanInMaybe = FanIn . Ap . fmap First

fanIn :: Signal dom a -> FanIn dom a
fanIn = fanInMaybe . fmap pure

firstIn :: FanIn dom a -> Signal dom (Maybe a)
firstIn = fmap getFirst . getAp . getFanIn

gated :: Signal dom Bool -> FanIn dom a -> FanIn dom a
gated p sig = fanInMaybe $ mux p (firstIn sig) (pure Nothing)

topEntity
    :: "CLK"   ::: Clock System
    -> "RESET" ::: Reset System
    -> "ADDR"  ::: Signal System (Maybe (Unsigned 16))
    -> "OUT"   ::: Signal System (Maybe (Unsigned 8))
topEntity clk rst = withClockResetEnable clk rst enableGen board

board
    :: (HiddenClockResetEnable dom)
    => Signal dom (Maybe (Unsigned 16))
    -> Signal dom (Maybe (Unsigned 8))
board addr = dataIn
  where
    wr = Just <$> cnt
      where
        cnt = register (0 :: Unsigned 8) $ cnt + 1

    (dataIn, ()) = runAddressing addr wr $ do
        rom <- romFromFile (SNat @0x0800) "image.bin"
        ram <- ram0 (SNat @0x1800)

        from 0x0000 $ connect rom
        from 0x0800 $ connect ram

makeTopEntity 'topEntity
gergoerdi commented 3 years ago

So to summarize, the above code, which defines dataIn via runAddressing which is recursive over a pure, finite data structure, causes Clash to take a comparatively very long time until it actually decides that it can't inline anymore. @christiaanb is this a small enough way of reproducing the problem?

gergoerdi commented 3 years ago

If I comment out rom from the definition of dataIn in board, I get to the inliner limit break message MUCH faster!

    (dataIn, ()) = runAddressing addr wr $ do
        ram <- ram0 (SNat @0x1800)

        from 0x0800 $ connect ram

But if I add back a second connect on ram, then we're back to slowness:

    (dataIn, ()) = runAddressing addr wr $ do
        ram <- ram0 (SNat @0x1800)

        from 0x0000 $ connect ram
        from 0x0800 $ connect ram
christiaanb commented 3 years ago

Thanks, I'll check it out.

christiaanb commented 3 years ago

Got my first interesting hint while compiling with -fclash-debug DebugSilent:

Irreducible constant as case subject: GHC.Types.I# 0
Can be reduced to: GHC.Types.I# 0

Where I'm actually forgetting why the GHC.Types.I# constructor is tagged as a primitive.

alex-mckenna commented 3 years ago

I'm actually forgetting why

GHC.Types is in ghc-prim, is it because it's originally a prim in the GHC core and we just convert it to a Clash core prim directly?

christiaanb commented 3 years ago

What's also weird is that there is an evaluator rule, but it's apparently not triggered. I changed it to:

  "GHC.Types.I#"
    | isSubj
    , [Lit (IntLiteral i)] <- args
    ->  let (_,tyView -> TyConApp intTcNm []) = splitFunForallTy ty
            (Just intTc) = lookupUniqMap intTcNm tcm
            [intDc] = tyConDataCons intTc
        in  reduce (mkApps (Data intDc) [Left (Literal (IntLiteral i))])
    | otherwise
    -> error (show args)

But it never throws the error.

alex-mckenna commented 3 years ago

That is strange, is the primitive evaluator being called at all?

christiaanb commented 3 years ago

Perhaps it would be if my cabal.project.local wouldn't have +experimental-evaluator in some places...

alex-mckenna commented 3 years ago

Oh yes, that would certainly ruin your day. At least you can delete those flags for good now given they don't exist in the PE branch

christiaanb commented 3 years ago

Alright, I've added some additional instrumentation so that Clash reports whether it's compile-time evaluator is missing a rewrite-rule for certain "primitives". The ones that are reported are:

No reduction rule for: "Clash.Promoted.Symbol.SSymbol" :: forall (s[6989586621679129497] :: GHC.Types.Symbol[3674937295934325066]).
GHC.TypeLits.KnownSymbol[3602879701896396843] s[6989586621679129497]
-> Clash.Promoted.Symbol.SSymbol[8214565720323787975] s[6989586621679129497]

No reduction rule for: "Data.Typeable.Internal.$wmkTrCon" :: forall (k[6989586621679127546] :: GHC.Prim.TYPE[3674937295934324912]
                                    GHC.Types.LiftedRep[3891110078048108766])
(a[6989586621679127547] :: k[6989586621679127546]).
GHC.Prim.Word#[3674937295934324854]
-> GHC.Prim.Word#[3674937295934324854]
-> GHC.Types.Module[3891110078048108670]
-> GHC.Types.TrName[3891110078048108676]
-> GHC.Prim.Int#[3674937295934324764]
-> GHC.Types.KindRep[3891110078048108688]
-> GHC.Types.[][3674937295934324788]
     Data.Typeable.Internal.SomeTypeRep[3674937295934325112]
-> GHC.Prim.(#,,,,#)[3819052484010180618]
     GHC.Types.WordRep[3891110078048108787]
     GHC.Types.WordRep[3891110078048108787]
     GHC.Types.LiftedRep[3891110078048108766]
     GHC.Types.LiftedRep[3891110078048108766]
     GHC.Types.LiftedRep[3891110078048108766]
     GHC.Prim.Word#[3674937295934324854]
     GHC.Prim.Word#[3674937295934324854]
     GHC.Types.TyCon[3891110078048108664]
     (GHC.Types.[][3674937295934324788]
        Data.Typeable.Internal.SomeTypeRep[3674937295934325112])
     (Data.Typeable.Internal.TypeRep[3674937295934325110]
        (GHC.Prim.TYPE[3674937295934324912] GHC.Types.LiftedRep[3891110078048108766])
        k[6989586621679127546])

The first one is an oversight on our end.

The second one is a bit more worrying, as that looks like some internal Typeable function to build up a TypeRep. We'll have to figure out how easy it is to recreate that function in our Core AST. That is, it looks to be the worker for https://hackage.haskell.org/package/base-4.14.0.0/docs/src/Data.Typeable.Internal.html#mkTrCon

And I suspect what then leads to the DMap in the AddrMap not being able to compiled away, because the comparison on TypeRep addr cannot be compiled away, because we're not getting any TypeRep addr data constructors out of the $wmkTrCon "primitive" to compare. And that then ultimately leads to infinite inlining of runAddressing1.

alex-mckenna commented 3 years ago

The first one is an oversight on our end.

https://github.com/clash-lang/clash-compiler/blob/a54c6b6d4f5aaf4e616468fa5f729f3d9e28fa6d/clash-ghc/src-ghc/Clash/GHC/PartialEval/Primitive/Promoted.hs#L36

Somehow I managed to implement this anyway, I must've seen it in my "missing primitive" warning message for PE. As for the second, I've also been seeing it in PE (and ignoring it) to seemingly no negative effect. Are you sure this is the problem?

EDIT: I don't actually see it in the log from clash-testsuite currently though, but I've definitely seen it before when running clash on things in the testsuite. Maybe it is a problem (but not the problem at the times I was seeing it)

christiaanb commented 3 years ago

No, I’m not sure it is the problem. My analysis is 100% speculation.

christiaanb commented 3 years ago

Okay, I made a somewhat smaller topEntity:

topEntity
  :: Clock System
     -> Reset System
     -> Signal System (Maybe (Unsigned 8))
     -> Signal System (Maybe (Unsigned 16))
     -> (Signal System (Maybe (Unsigned 16)), ())
topEntity clk rst addr wr =
    withClockResetEnable clk rst enableGen $ runAddressing addr wr $ do
        () <- return ()
        return ()

that also hits the inlining limit because there is no Core unfolding for GHC.Base.++:

No reduction rule for: "GHC.Base.++" :: forall (a[6989586621679125296] :: GHC.Prim.TYPE[3674937295934324912]
                                    GHC.Types.LiftedRep[3891110078048108766]).
GHC.Types.[][3674937295934324788]
  a[6989586621679125296]
-> GHC.Types.[][3674937295934324788]
     a[6989586621679125296]
-> GHC.Types.[][3674937295934324788]
     a[6989586621679125296]
Irreducible constant as case subject: GHC.Base.++
  @(Data.Dependent.Sum.DSum[8214565720323789109]
      (GHC.Prim.TYPE[3674937295934324912]
         GHC.Types.LiftedRep[3891110078048108766])
      Test.Component[8214565720323886481]
      (Test.Susp0[8214565720323886472]
         "System"
         (Clash.Sized.Internal.Unsigned.Unsigned[8214565720323788081]
            16)))
  <prefixName>"runAddressing_ds"
  <prefixName>"runAddressing1"
  (GHC.Types.[][3891110078048108577]
     @(Data.Dependent.Sum.DSum[8214565720323789109]
         (GHC.Prim.TYPE[3674937295934324912]
            GHC.Types.LiftedRep[3891110078048108766])
         Test.Component[8214565720323886481]
         (Test.Susp0[8214565720323886472]
            "System"
            (Clash.Sized.Internal.Unsigned.Unsigned[8214565720323788081]
               16))))
  <prefixName>"runAddressing_ds2"
  <prefixName>"runAddressing1"
  (GHC.Types.[][3891110078048108577]
     @(Data.Dependent.Sum.DSum[8214565720323789109]
         (GHC.Prim.TYPE[3674937295934324912]
            GHC.Types.LiftedRep[3891110078048108766])
         Test.Component[8214565720323886481]
         (Test.Susp0[8214565720323886472]
            "System"
            (Clash.Sized.Internal.Unsigned.Unsigned[8214565720323788081]
               16))))
Can be reduced to: GHC.Base.++
  @(Data.Dependent.Sum.DSum[8214565720323789109]
      (GHC.Prim.TYPE[3674937295934324912]
         GHC.Types.LiftedRep[3891110078048108766])
      Test.Component[8214565720323886481]
      (Test.Susp0[8214565720323886472]
         "System"
         (Clash.Sized.Internal.Unsigned.Unsigned[8214565720323788081]
            16)))
  (GHC.Types.[][3891110078048108577]
     @(Data.Dependent.Sum.DSum[8214565720323789109]
         (GHC.Prim.TYPE[3674937295934324912]
            GHC.Types.LiftedRep[3891110078048108766])
         Test.Component[8214565720323886481]
         (Test.Susp0[8214565720323886472]
            "System"
            (Clash.Sized.Internal.Unsigned.Unsigned[8214565720323788081]
               16))))
  (GHC.Types.[][3891110078048108577]
     @(Data.Dependent.Sum.DSum[8214565720323789109]
         (GHC.Prim.TYPE[3674937295934324912]
            GHC.Types.LiftedRep[3891110078048108766])
         Test.Component[8214565720323886481]
         (Test.Susp0[8214565720323886472]
            "System"
            (Clash.Sized.Internal.Unsigned.Unsigned[8214565720323788081]
               16))))
christiaanb commented 3 years ago

Adding:

appendSusps :: Susps dom dat -> Susps dom dat -> Susps dom dat
appendSusps []     ys = ys
appendSusps (x:xs) ys = x : appendSusps xs ys

and changing:

runAddressing1 addr = \case
    Return x -> nilS $ return x
    Bind m n -> do
        (rd1, x) <- runAddressing1 addr m
        (rd2, y) <- runAddressing1 addr $ n x
        return (appendSusps rd1 rd2, y)

makes

topEntity
  :: Clock System
     -> Reset System
     -> Signal System (Maybe (Unsigned 8))
     -> Signal System (Maybe (Unsigned 16))
     -> (Signal System (Maybe (Unsigned 16)), ())
topEntity clk rst addr wr =
    withClockResetEnable clk rst enableGen $ runAddressing addr wr $ do
        () <- return ()
        return ()

succeed

christiaanb commented 3 years ago

It does seem that we would eventually need $wmkTrCon, because when I change runAddressing to:

runAddressing addr wr body = (join <$> firstIn rd, x)
  where
    ((susps, x), conns) = evalRWS (runAddressing1 addr body) wr 0
    rd = toRead [(Component (typeRep @ Int) 0) :=> Susp0 (const (pure Nothing))] 
                (DMap.singleton (Component (typeRep @ Int) 0) (fanIn (pure 0)))

It finishes normalization, but I get the following netlist level error:

Test.hs:238:1: error:

    Clash.Netlist.BlackBox(291): No blackbox found for: Data.Typeable.Internal.$wmkTrCon. Did you forget to include directories containing primitives? You can use '-i/my/prim/dir' to achieve this.

    The source location of the error is not exact, only indicative, as it is acquired 
    after optimizations. The actual location of the error can be in a function that is 
    inlined. To prevent inlining of those functions, annotate them with a NOINLINE pragma.
    |
238 | topEntity clk rst addr wr =
    | ^^^^^^^^^
gergoerdi commented 3 years ago

I am compiling dependent-map with -fno-worker-wrapper, but of course Typeable itself comes from base so that's not helping...

I wonder if the correct solution wouldn't be to have Clash always compile everything with -fno-worker-wrapper and -fexpose-all-unfoldings and also present itself as a standalone Haskell compiler, i.e. not sharing any compiled objects/.hi files/.sos with GHC-compiled libraries. So as far as Cabal, Stack, etc. would be concerned, Clash would be just as independent of GHC as, e.g., Hugs or AJHC.

gergoerdi commented 3 years ago

Alright, I've added some additional instrumentation so that Clash reports whether it's compile-time evaluator is missing a rewrite-rule for certain "primitives".

Can you push that to a branch? I'd like to try it to the code in https://github.com/clash-lang/clash-compiler/issues/1536#issuecomment-741687988

christiaanb commented 3 years ago

It's https://github.com/clash-lang/clash-compiler/tree/prim_instrumentation, commit https://github.com/clash-lang/clash-compiler/commit/e7e161170899681a01316f00e892733e20fe62a9