mikeizbicki / HLearn

Homomorphic machine learning

A couple of questions #81

Closed mrkkrp closed 8 years ago

mrkkrp commented 8 years ago

I'm working on a tutorial about machine learning, and I intend to use HLearn in it (there aren't many other options).

A few things I would like to ask:

mrkkrp commented 8 years ago

I'm just trying to understand how to present this library to (professional) Haskell developers. Right now I'm reading your blog posts and playing with the now-deprecated packages from Hackage.

mrkkrp commented 8 years ago

I see there is a 2.0.0.0 tag, why can't we have this version on Hackage and Stackage?

mikeizbicki commented 8 years ago

I would very much welcome more documentation for HLearn, but I don't think a blog post is a good way to go about it. The interface is not very stable yet, and so I'd worry that the information on the blog would go out of date very quickly.

In fact, one of the reasons HLearn is not on Hackage is that I don't think it's ready for "production use". Things keep changing a lot, and I don't want people to start depending on a certain interface.

Probably the best way to contribute documentation would be to take one of the examples in the examples folder and add explanations of what's going on.

mrkkrp commented 8 years ago

OK, then perhaps I will go with the current master branch and tell readers that it is not entirely stable. When you have a more stable API, I'll review the tutorial and update it.

mrkkrp commented 8 years ago

I'm having trouble building the project with stack:

stack build
While constructing the BuildPlan the following exceptions were encountered:

--  Failure when adding dependencies:
      subhask: needed (==0.1.1.0), couldn't resolve its dependencies
    needed for package HLearn-2.0.1.0

--  Failure when adding dependencies:
      MonadRandom: needed (==0.4), 0.4.2.3 found (latest applicable is 0.4)
      approximate: needed (==0.2.2.1), 0.2.2.3 found (latest applicable is 0.2.2.1)
      bytes: needed (==0.15.0.1), 0.15.2 found (latest applicable is 0.15.0.1)
      cassava: needed (==0.4.3.1), 0.4.5.0 found (latest applicable is 0.4.3.1)
      hmatrix: needed (==0.16.1.5), 0.17.0.1 found (latest applicable is 0.16.1.5)
      hyperloglog: needed (==0.3.4), 0.4.0.4 found (latest applicable is 0.3.4)
      lens: needed (==4.12.3), 4.13 found (latest applicable is 4.12.3)
      parallel: needed (==3.2.0.6), 3.2.1.0 found (latest applicable is 3.2.0.6)
      primitive: needed (==0.6), 0.6.1.0 found (latest applicable is 0.6)
      semigroups: needed (==0.16.2.2), 0.18.1 found (latest applicable is 0.16.2.2)
      vector: needed (==0.10.12.3), 0.11.0.0 found (latest applicable is 0.10.12.3)
    needed for package subhask-0.1.1.0

The dependency version bounds could probably be more flexible. I can open a PR for that; what do you think?

mrkkrp commented 8 years ago

I saw your comment about reproducible builds, but with stack that's not a problem anymore.

mrkkrp commented 8 years ago

I'll perhaps suspend writing the tutorial until it's easy to install the library and play with it. I failed to make SubHask work with GHC 7.10.3, and most readers will likely not survive the “installation” section.

mikeizbicki commented 8 years ago

@mrkkrp Thanks for the feedback. Easier installation is definitely something I need to work on.

mrkkrp commented 8 years ago

In my tutorial, I want to touch on the ideas described here, but I don't see anything similar in the current master branch. There is no Categorical type and no train function. What should I use?

mikeizbicki commented 8 years ago

There are currently no probability distributions implemented in HLearn because doing this properly requires better support for numerical operations than currently exists in Haskell. When the subhask project gets to a point where the required numerical support exists, then distributions will be added back in and things similar to the blog post will be possible again.
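For readers following along, the idea behind "homomorphic" training can be sketched in plain Haskell without HLearn. The `Categorical` type, `train`, and `prob` below are hypothetical toy definitions written for illustration; they are not the API from the blog posts or from any HLearn release. The sketch assumes GHC 8.4 or later (where `Semigroup` is in the Prelude) and the `containers` package.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical toy model (NOT HLearn's Categorical): a count per observed label.
newtype Categorical a = Categorical (Map.Map a Int)
  deriving (Eq, Show)

-- Merging two models adds their counts label by label.
instance Ord a => Semigroup (Categorical a) where
  Categorical x <> Categorical y = Categorical (Map.unionWith (+) x y)

-- The empty model has seen no data.
instance Ord a => Monoid (Categorical a) where
  mempty = Categorical Map.empty

-- Hypothetical batch trainer: each data point becomes a one-count model,
-- and the models are combined with (<>).
train :: Ord a => [a] -> Categorical a
train = foldMap (\x -> Categorical (Map.singleton x 1))

-- Relative frequency of a label; 0 for an empty model.
prob :: Ord a => Categorical a -> a -> Double
prob (Categorical m) x
  | total == 0 = 0
  | otherwise  = fromIntegral (Map.findWithDefault 0 x m) / fromIntegral total
  where
    total = sum (Map.elems m)

main :: IO ()
main = do
  let whole = train "aabbc"
      parts = train "aab" <> train "bc"
  -- Training on the whole data set agrees with merging models trained on parts.
  print (whole == parts)
  print (prob whole 'a')
```

The property checked in `main`, that `train (xs ++ ys)` equals `train xs <> train ys`, is what makes this kind of model straightforward to train in parallel or update online.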

mrkkrp commented 8 years ago

OK, is there anything I can use to show how Functor, Monad, and Monoid instances work?

mrkkrp commented 8 years ago

Also, do you have an estimate of when the library will be ready for release?

mikeizbicki commented 8 years ago

This is a good starting point: https://github.com/mikeizbicki/subhask/blob/master/examples/example0002-monad-instances-for-set.lhs Actually, a tutorial on subhask would be a much easier task at this point, and I think you'll find all of the ideas you've mentioned so far there as well.
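As background on why sets make an interesting case: in standard Haskell, `Data.Set.map` carries an `Ord` constraint on the result type, so `Set` has no `Functor` or `Monad` instance in base. A minimal sketch of the constrained, monad-like operations follows. The names `mapSet`, `returnSet`, and `bindSet` are hypothetical helpers for illustration, not subhask's API, and the sketch only assumes the `containers` package.

```haskell
import qualified Data.Set as Set

-- fmap-like operation; the Ord constraint on b is what rules out a Functor instance.
mapSet :: Ord b => (a -> b) -> Set.Set a -> Set.Set b
mapSet = Set.map

-- return-like operation: a one-element set.
returnSet :: a -> Set.Set a
returnSet = Set.singleton

-- (>>=)-like operation: apply f to every element and union the results.
bindSet :: Ord b => Set.Set a -> (a -> Set.Set b) -> Set.Set b
bindSet s f = Set.unions (map f (Set.toList s))

main :: IO ()
main = do
  let s = Set.fromList [1, 2, 3 :: Int]
  -- Duplicate results collapse, because the output is again a set.
  print (mapSet (`mod` 2) s)
  print (s `bindSet` \x -> Set.fromList [x, x * 10])
  print (returnSet 'a')
```

These functions behave like `fmap`, `return`, and `>>=`, but they cannot be packaged as instances of the Prelude classes because those classes leave no room for the `Ord` constraint.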

HLearn definitely won't be the library I want it to be for at least a year, but there may be some releases along the way. Once there's a reasonable framework for numerical computing (i.e. once subhask is complete), finishing HLearn will be very easy. Until then, it's not worth spending time on workarounds in HLearn that are just going to be reverted later.