Open gergoerdi opened 1 month ago
Even simpler:
{-# LANGUAGE BlockArguments, ViewPatterns, MultiWayIf, RecordWildCards, ApplicativeDo #-}
{-# LANGUAGE DerivingStrategies, DerivingVia, GeneralizedNewtypeDeriving #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE InstanceSigs, StandaloneDeriving #-}
{-# LANGUAGE CPP #-}
{-# OPTIONS -Wunused-binds -Wunused-imports #-}
{-# OPTIONS -fconstraint-solver-iterations=5 #-}
module Sudoku (topEntity) where
#define MATRIX_APPLICATIVE_COERCE 1
import Clash.Prelude hiding (lift)
import Clash.Annotations.TH
import Data.Functor.Compose
import Data.Coerce (coerce)
newtype Matrix n m a = FromRows{ matrixRows :: Vec n (Vec m a) }
deriving stock (Generic)
deriving newtype (NFDataX, BitPack)
deriving (Functor) via Compose (Vec n) (Vec m)
-- deriving (Applicative) via Compose (Vec n) (Vec m)
instance (KnownNat n, KnownNat m) => Applicative (Matrix n m) where
{-# INLINE pure #-}
pure :: forall a. a -> Matrix n m a
pure = coerce (pure @(Compose (Vec n) (Vec m)) @a)
{-# INLINE (<*>) #-}
(<*>) :: forall a b. Matrix n m (a -> b) -> Matrix n m a -> Matrix n m b
#if MATRIX_APPLICATIVE_COERCE
(<*>) = coerce $ (<*>) @(Compose (Vec n) (Vec m)) @a @b
#else
mf <*> mx = FromRows $ zipWith (<*>) (matrixRows mf) (matrixRows mx)
#endif
headMatrix :: forall n m a. (KnownNat n, KnownNat m, 1 <= n, 1 <= m) => Matrix n m a -> a
headMatrix = head @(m - 1) . head @(n - 1) . matrixRows
type Solvable n m = (KnownNat n, KnownNat m, 1 <= n, 1 <= m)
propagator
:: forall n m dom. (Solvable n m, HiddenClockResetEnable dom)
=> SNat n -> SNat m -> Signal dom ()
propagator _ _ = headMatrix . headMatrix . getCompose $ units
where
units :: Compose (Matrix n m) (Matrix m n) (Signal dom ())
units = pure unit
<*> pure (pure ())
<*> pure (pure ())
<*> pure (pure ())
<*> pure (pure ())
unit
:: Signal dom ()
-> Signal dom ()
-> Signal dom ()
-> Signal dom ()
-> Signal dom ()
unit _ _ _ _ = cell
where
cell = register () cell
topEntity
:: "CLK" ::: Clock System
-> "RESET" ::: Reset System
-> "OUT" ::: Signal System ()
topEntity clk rst = withClockResetEnable clk rst enableGen $
propagator d2 d2
makeTopEntity 'topEntity
More of an FYI: On GHC 9.10 both are equally fast. With the coerce
based version actually requiring fewer transformations.
On GHC 9.4, commenting out https://github.com/clash-lang/clash-compiler/blob/3b755b900810c727833875b1a15222d69d37029e/clash-prelude/src/Clash/Sized/Vector.hs#L326-L328 also makes the coerce
based version go through. Achieving a similar number of transformations, and similar speed, as we do on GHC 9.10.
Note that on GHC 9.10 the above rule never fires.
Something to check: perhaps it's due to loss of sharing in the reduceNonRepPrim
transformation.
I would love to move on GHC 9.10, but I'm out of the loop on where Clash is on GHC compatibility. I thought 9.4 or 9.6 was the last working version?
We have a README.md in the root of the repo with a section GHC compatibility. Both Clash 1.8 and master support GHC 9.8. In fact, for master we also run CI for GHC 9.10, so it has some support. Perhaps we just forgot to update that table to list GHC 9.10.
It would appear the READMEs on the 1.8 branch and the v1.8.1 tag are outdated; that's unfortunate. But the one on the master
branch is fairly reliable.
Ah, right, it was the Cabal/Stack issue of not picking up the Nat
plugins that was my issue earlier.
In the following Clash circuit, the
MATRIX_APPLICATIVE_COERCE
preprocessor flag enables anApplicative
instance for the includedMatrix
type that is equivalent to what one would get withderiving via (Compose (Vec n) (Vec m))
. With the flag disabled, thepure
implementation is still the same as the derived one would be, but(<*>)
is implemented directly asliftA2 (<*>)
.With
liftA2 (<*>)
, synthesis finishes in less than 4 seconds on my notebook. With thecoerce
-based implementation, it finishes in 1m46s. And this, of course, is a cut-down version of a much more complex circuit where the synthesis time goes from ~1 minute to something that I didn't even bother waiting out.