Open cpennington opened 7 years ago
Cool, interesting stuff.
It's not really surprising that there's not a big gain there on a cpu, those bits of c vectorise well, so one wouldn't expect to be able to get much of a gain there.
I'll have to install it and have a play.
Oops. I just noticed some of my false starts made it into the PR, I'll take those out.
My overall plan was to make a new typeclass to capture that a type can be run under accelerate, and then to implement that class for all layer types. I'll try and get the class fleshed out so that others can add instances.
Currently, this is building on #36, and needs to be rebased/squashed. But, it now has a typeclass for Network
backed by Accelerate
, and an instance of that typeclass for FullyConnected
.
So, I'm a bit stuck, at the moment. I'm trying to implement the Network
instance of Accelerable
, the typeclass I created to represent things that can be turned into an Accelerate compatible form. However, something about the way it's set up is giving me:
/home/cpennington/personal/grenade/test/Test/Grenade/Network/Accelerate.hs:45:16: error:
• Could not deduce: Head ashapes0 ~ DIM1
from the context: (Head shapes ~ 'D1 len,
GHC.TypeLits.KnownNat len)
bound by a pattern with constructor:
S1D :: forall (len :: ghc-prim-0.5.0.0:GHC.Types.Nat).
GHC.TypeLits.KnownNat len =>
hmatrix-0.18.0.0:Internal.Static.R len -> G.S ('D1 len),
in a case alternative
at test/Test/Grenade/Network/Accelerate.hs:45:7-11
Expected type: Acc (Array (Head ashapes0) Double)
Actual type: Acc (Vector Double)
The type variable ‘ashapes0’ is ambiguous
• In the expression: use $ fromVector v
In a case alternative: S1D v -> use $ fromVector v
In the expression: case input of { S1D v -> use $ fromVector v }
• Relevant bindings include
inputA :: Acc (Array (Head ashapes0) Double)
(bound at test/Test/Grenade/Network/Accelerate.hs:44:5)
targetA :: Acc (Array (Last ashapes0) Double)
(bound at test/Test/Grenade/Network/Accelerate.hs:47:5)
testedA :: Acc (Array (Head ashapes0) Double)
(bound at test/Test/Grenade/Network/Accelerate.hs:50:5)
My guess is that the typeclass (or the Network instance of it), insufficiently describes the connection between the standard grenade shapes 'D1
, 'D2
, and 'D3
and the Accelerate ones DIM1
, DIM2
, and DIM3
. Any suggestions on how to structure things better so that GHC can understand the mapping of Network into Accelerate?
Yep, no worries.
I will attempt to find some time this weekend to have a look into helping you out.
I made some good progress in the last day or so. I'll push the results soon.
I pushed up my latest changes. I thought that it was pretty close, but I think there are actually just type errors obscuring more type errors. I think the next step may be to create a version of genNetwork
that generates Accelerable
networks (for the moment, only FullyConnected
), and then test that the accelerated and non-accelerated versions give the same output.
In this latest set of changes, I'm running into trouble because I can't figure out how to make sure that genShape
gives me a shape that the compiler knows fits the constraint Accelerable (S final)
. Maybe the unsafeCoerce
code would do the trick, but I'd love some pointers.
@cpennington ooh, this is interesting! What is the status of this? Also, you may be interested in https://github.com/tmcdonell/accelerate-blas
@tmcdonell: It's working, in general. I think it's at a point where other people could start adding Accelerable
instances for the other layer types, and also for the recurrent networks. I think those implementations could happen as PRs against this PR, or if @HuwCampbell is comfortable merging this one as-is, they could happen as PRs against the main repo.
(Looking back, it also looks like I still need to fill out the implementation of runGradient
and runNetwork
in the Network.Accelerate
package)
After more investigation, it appears I don't understand data families as well as I thought I did. In particular, I can't figure out how to write runNetworks
in Grenade.Core.Accelerate
, because ANetwork
and ANil
are different types, rather than being different constructors for the same type.
I've gotten past the GADT hurdle. Now I have a non-terminating type reduction... Types are fun/hard...
I suspect it's because I'm using Tail layers
and Tail shapes
in the instance definition of Accelerable (G.Network layers shapes)
, but I can't split up the instance definition in to recursive and base cases because then I would end up with different types for data Accelerated (G.Network layers shapes)
. @HuwCampbell, have any suggestions?
For im2col and col2im, accelerate doesn't seem to improve benchmarks (using the llvm-native backend). It's not clear to me if that's because my implementation is bad, or because there's a bunch of overhead to using llvm that doesn't exist in the existing C implementations.
My next priorities will be to implement the parts of grenade that I need in my main project using accelerate, to see if it improves the benchmarks in that project.