mikeizbicki / HLearn

Homomorphic machine learning

Interested in working on HLearn #66

Open kjameslubin opened 8 years ago

kjameslubin commented 8 years ago

Please email me at the address listed on my (out of date, but perhaps that's a redundant qualifier) academic page: https://math.berkeley.edu/~ksjames/. It'd be a weekend project for me, but it's worth a discussion at least.

kjameslubin commented 8 years ago

Bump, since I saw a reply on #70

mikeizbicki commented 8 years ago

Sorry, I didn't see this issue before for some reason. If you can say a bit more about what you're interested in, I can help you get started. Unfortunately, I doubt there's much that can be accomplished in just a weekend project.

kjameslubin commented 8 years ago

I suppose I meant "recurring weekends".

For one, I've had some trouble building (see #67), so that would be a start.

I am a contributor to BIDMach, which is a very fast GPU-based ML toolkit written in Scala. I think I might be of some help on the architectural side, as I have learned a lot of practical lessons from working on BIDMach, and I also have some background in Haskell design (and know a bit of category theory from my mathematical physics research).

I'm also interested in implementing specific algorithms, and potentially working on a GPU backend.

mikeizbicki commented 8 years ago

GPU work is something that I've been very interested in but haven't really looked at yet. I think the way the linear algebra in subhask is structured would be amenable to a GPU backend. Currently, three basic vector types (as in vector space, not array) are supported: `BVector` is a boxed vector that is slow but can hold any numeric type; `SVector` and `UVector` are storable and unboxed vectors that are fast and live in main memory. Each of these takes a size parameter (which can be phantom, so you don't need to know the size at compile time) and a type parameter for the underlying numeric type (usually `Float` or `Double`).
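
Concretely, the parameters look something like this (a minimal sketch; the module path and the exact kinds accepted for the size parameter are assumptions here, not verbatim subhask API):

```haskell
{-# LANGUAGE DataKinds #-}

import SubHask.Algebra.Vector (BVector, SVector, UVector)

-- size fixed at compile time: a 784-dimensional unboxed vector
type ImgVec = UVector 784 Double

-- size known only at runtime: "dim" is a phantom tag, so any two
-- SVector "dim" values are known to share a dimension without the
-- dimension itself appearing in the type
type DataPoint = SVector "dim" Float

-- boxed vectors are slower but work over any numeric element type
type ExactVec = BVector 3 Rational
```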

I think the basic approach would be to add something like a `GPUVector` type whose data is allocated on the GPU. Then there could be a function like `toGPU :: SVector s Float -> GPUVector s Float` that moves the data onto the device, and the numeric hierarchy would get implemented via GPU computations.
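
A minimal sketch of what that could look like (`GPUVector`, `toGPU`, and `fromGPU` are all hypothetical names; the bodies are stubs, and a real version would sit on top of a CUDA or OpenCL binding):

```haskell
{-# LANGUAGE DataKinds, PolyKinds, KindSignatures #-}

import Foreign.Ptr (Ptr)
import SubHask.Algebra.Vector (SVector)

-- hypothetical: a vector whose payload lives in device memory,
-- carrying the same size and element parameters as SVector; the
-- Ptr stands in for an opaque device-buffer handle
data GPUVector (s :: k) a = GPUVector (Ptr a)

-- hypothetical: allocate a device buffer and copy the payload over
toGPU :: SVector s Float -> GPUVector s Float
toGPU = undefined

-- and back again, for reading results on the CPU side
fromGPU :: GPUVector s Float -> SVector s Float
fromGPU = undefined

-- the numeric hierarchy would then be implemented instance by
-- instance with GPU kernels, e.g. (+) on GPUVector launching an
-- elementwise-add kernel instead of touching main memory
```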