KiaraGrouwstra closed this issue 6 years ago
@tycho01 Thanks for the work (apologies for taking so long to get to this). Can you submit this to https://github.com/stites/gym-http-api ? I have enabled issue management on that fork. There is also an outstanding issue where gym-http-api fell off the stack nightly build (and is no longer in the LTSs) -- but I'm a little busy finishing up hasktorch atm.
@stites thanks! I tried making a PR, though the overlap turned out to be quite high. I made one for the error as well.
I've tried some more things in my branch. It has a bit more in it (space types, logic factored out into separate actors), but I made it to implement some agents for educational purposes, which may well exceed the intended scope of the openai repo.
On second thought, maybe I should have a closer look at reinforce, which looks a lot closer to where I wanted to go.
Oh, I'd been looking at TF hs -- would you recommend hasktorch over that?
Reinforce is probably where you are headed, but that repo definitely needs some love -- I've been a little preoccupied. Please submit work there and we can start the haskell reinforcement learning push! I was working on a bunch of simple general-purpose algorithms with hmatrix, but my work quickly outpaced the toy implementations and I have been back in hasktorch-land and python (haskell gets hard to debug in). The hope for reinforce is that it has a mix of haskell and python implementations, so you can see side-by-side approaches and leverage more of the python tooling to help with debugging.
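For context, the "toy implementations" in question are along these lines: here is a minimal sketch of a single tabular Q-learning backup in plain Haskell (using `Data.Map` rather than hmatrix; `qUpdate` and the two-action space are hypothetical names for illustration, not API from reinforce).

```haskell
module Main where

import qualified Data.Map.Strict as M

type State  = Int
type Action = Int
type QTable = M.Map (State, Action) Double

-- One Q-learning backup:
--   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
-- Actions are hardcoded to [0, 1] to keep the sketch small.
qUpdate :: Double -> Double -> QTable -> (State, Action, Double, State) -> QTable
qUpdate alpha gamma q (s, a, r, s') =
  let old    = M.findWithDefault 0 (s, a) q
      nextQs = [M.findWithDefault 0 (s', a') q | a' <- [0, 1]]
      target = r + gamma * maximum nextQs
  in  M.insert (s, a) (old + alpha * (target - old)) q

main :: IO ()
main = do
  -- one step from an empty table: reward 1.0 moving from state 0 to 1
  let q = qUpdate 0.5 0.9 M.empty (0, 1, 1.0, 1)
  print (M.lookup (0, 1) q)  -- Just 0.5
```

Folding `qUpdate` over a stream of transitions is the whole algorithm; everything else (exploration, environment interaction) layers on top.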
RE: tensorflow/haskell vs hasktorch. I'm writing the haddocks now, so it's pretty much ready for "alpha" testers. Hasktorch will give you dependently typed tensors (type-level dimensions), as well as a usable abstraction over the dynamically typed C API. It leans a lot more on haskell-native tooling. To use the statically typed tensors you wind up using singletons -- but if you don't explicitly type your functions you can sidestep most of the hairy bits. It's definitely not as stable or complete as tensorflow -- but it's ready for users and the API is, IMO, simpler: we only use stupid-simple monads, and have some basic NN functions written in backprop, which is wonderfully documented.
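To illustrate what type-level dimensions buy you, here is a self-contained sketch using only `GHC.TypeLits` (this `Tensor` is a hypothetical stand-in, not hasktorch's actual API): a matrix multiply whose inner dimensions must agree at compile time, so a shape mismatch is a type error rather than a runtime crash.

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}
module Main where

import GHC.TypeLits (Nat)

-- A matrix tagged with its row and column counts at the type level.
-- (A hypothetical stand-in for a dependently typed tensor.)
newtype Tensor (r :: Nat) (c :: Nat) = Tensor [[Double]]

-- Matrix multiply: the shared inner dimension k is enforced by the
-- type checker; Tensor 2 3 `matmul` Tensor 2 3 would not compile.
matmul :: Tensor r k -> Tensor k c -> Tensor r c
matmul (Tensor a) (Tensor b) =
  Tensor [[sum (zipWith (*) row col) | col <- cols] | row <- a]
  where cols = foldr (zipWith (:)) (repeat []) b  -- transpose of b

main :: IO ()
main = do
  let a = Tensor [[1, 2], [3, 4]] :: Tensor 2 2
      b = Tensor [[1, 0], [0, 1]] :: Tensor 2 2  -- identity
      Tensor c = matmul a b
  print c  -- [[1.0,2.0],[3.0,4.0]]
```

Hasktorch layers singletons on top of this kind of encoding so dimensions can also be inspected at runtime, but the compile-time shape checking is the core payoff.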
If you ping me on https://gitter.im with your email I can hook you up with slack access (and you should probably join the datahaskell group if you haven't already).
Oh, it seems TF's dependent types didn't really pan out. I'll definitely check out hasktorch and hmatrix then -- small and simple is fine; TF does feel overwhelming.
I've yet to run into the ML debugging issues -- definitely looking forward to those. :)
closing in favor of #57.