HuwCampbell / grenade

Deep Learning in Haskell
BSD 2-Clause "Simplified" License
1.45k stars 84 forks source link

Implement pieces of grenade using Accelerate #38

Open cpennington opened 7 years ago

cpennington commented 7 years ago

For im2col and col2im, accelerate doesn't seem to improve benchmarks (using the llvm-native backend). It's not clear to me if that's because my implementation is bad, or because there's a bunch of overhead to using llvm that doesn't exist in the existing C implementations.

My next priorities will be to implement the parts of grenade that I need in my main project using accelerate, to see if it improves the benchmarks in that project.

HuwCampbell commented 7 years ago

Cool, interesting stuff.

It's not really surprising that there's not a big gain there on a cpu, those bits of c vectorise well, so one wouldn't expect to be able to get much of a gain there.

I'll have to install it and have a play.

cpennington commented 7 years ago

Oops. I just noticed some of my false starts made it into the PR, I'll take those out.

My overall plan was to make a new typeclass to capture that a type can be run under accelerate, and then to implement that class for all layer types. I'll try and get the class fleshed out so that others can add instances.

cpennington commented 7 years ago

Currently, this is building on #36, and needs to be rebased/squashed. But, it now has a typeclass for Network backed by Accelerate, and an instance of that typeclass for FullyConnected.

cpennington commented 7 years ago

So, I'm a bit stuck, at the moment. I'm trying to implement the Network instance of Accelerable, the typeclass I created to represent things that can be turned into an Accelerate compatible form. However, something about the way it's set up is giving me:

/home/cpennington/personal/grenade/test/Test/Grenade/Network/Accelerate.hs:45:16: error:
    • Could not deduce: Head ashapes0 ~ DIM1
      from the context: (Head shapes ~ 'D1 len,
                         GHC.TypeLits.KnownNat len)
        bound by a pattern with constructor:
                   S1D :: forall (len :: ghc-prim-0.5.0.0:GHC.Types.Nat).
                          GHC.TypeLits.KnownNat len =>
                          hmatrix-0.18.0.0:Internal.Static.R len -> G.S ('D1 len),
                 in a case alternative
        at test/Test/Grenade/Network/Accelerate.hs:45:7-11
      Expected type: Acc (Array (Head ashapes0) Double)
        Actual type: Acc (Vector Double)
      The type variable ‘ashapes0’ is ambiguous
    • In the expression: use $ fromVector v
      In a case alternative: S1D v -> use $ fromVector v
      In the expression: case input of { S1D v -> use $ fromVector v }
    • Relevant bindings include
        inputA :: Acc (Array (Head ashapes0) Double)
          (bound at test/Test/Grenade/Network/Accelerate.hs:44:5)
        targetA :: Acc (Array (Last ashapes0) Double)
          (bound at test/Test/Grenade/Network/Accelerate.hs:47:5)
        testedA :: Acc (Array (Head ashapes0) Double)
          (bound at test/Test/Grenade/Network/Accelerate.hs:50:5)

My guess is that the typeclass (or the Network instance of it), insufficiently describes the connection between the standard grenade shapes 'D1, 'D2, and 'D3 and the Accelerate ones DIM1, DIM2, and DIM3. Any suggestions on how to structure things better so that GHC can understand the mapping of Network into Accelerate?

HuwCampbell commented 7 years ago

Yep, no worries.

I will attempt to find some time this weekend to have a look into helping you out.

cpennington commented 7 years ago

I made some good progress in the last day or so. I'll push the results soon.

cpennington commented 7 years ago

I pushed up my latest changes. I thought that it was pretty close, but I think there are actually just type errors obscuring more type errors. I think the next step may be to create a version of genNetwork that generates Accelerable networks (for the moment, only FullyConnected), and then test that the accelerated and non-accelerated versions give the same output.

cpennington commented 7 years ago

In this latest set of changes, I'm running into trouble because I can't figure out how to make sure that genShape gives me a shape that the compiler knows fits the constraint Accelerable (S final). Maybe the unsafeCoerce code would do the trick, but I'd love some pointers.

tmcdonell commented 7 years ago

@cpennington ooh, this is interesting! What is the status of this? Also, you may be interested in https://github.com/tmcdonell/accelerate-blas

cpennington commented 7 years ago

@tmcdonell: It's working, in general. I think it's at a point where other people could start adding Accelerable instances for the other layer types, and also for the recurrent networks. I think those implementations could happen as PRs against this PR, or if @HuwCampbell is comfortable merging this one as-is, they could happen as PRs against the main repo.

(Looking back, it also looks like I still need to fill out the implementation of runGradient and runNetwork in the Network.Accelerate package)

cpennington commented 7 years ago

After more investigation, it appears I don't understand data families as well as I thought I did. In particular, I can't figure out how to write runNetworks in Grenade.Core.Accelerate, because ANetwork and ANil are different types, rather than being different constructors for the same type.

cpennington commented 7 years ago

I've gotten past the GADT hurdle. Now I have a non-terminating type reduction... Types are fun/hard...

I suspect it's because I'm using Tail layers and Tail shapes in the instance definition of Accelerable (G.Network layers shapes), but I can't split up the instance definition in to recursive and base cases because then I would end up with different types for data Accelerated (G.Network layers shapes). @HuwCampbell, have any suggestions?