josiahls / fast-reinforcement-learning

Important Note: fastrl version 2 is being developed at fastrl. Note the link in the readme.

Project collaboration #19

Open OtwellResearch opened 4 years ago

OtwellResearch commented 4 years ago

Is this project active? (I don't see any other way to message Josiah.) I've been thinking of working on something similar but would rather contribute to an existing project than start from scratch. But this one seems dormant...

josiahls commented 4 years ago

Hi!

Version 2 is being developed at https://github.com/josiahls/fast-reinforcement-learning-2, similar to how fastai is developing fastai2 in a separate repo. I haven't uploaded a pip package quite yet; I could do that this weekend if the GitHub Actions workflow runs properly. A good reference for an RL architecture library is https://github.com/Shmuma/ptan

Current Repo Arch Issues

I found that the current fastrl does not have the flexibility to handle more complicated agents such as A3C. I think the current DataBunch is too large and clunky.

It turns out that an RL library needs an Agent object. An nn.Module is too dumb to plug into different environments on its own, and making it do so can make the neural net undesirably complicated, while an AgentLearner is too heavy to move around to different machines/robots.
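To illustrate the idea, here is a minimal hypothetical sketch (loosely in the spirit of ptan's agent concept, not fastrl's actual API): the Agent owns the policy nn.Module plus the environment-facing glue, so the network stays simple, and something this small can ship to a robot without dragging the whole Learner along.

```python
import torch
from torch import nn

class Agent:
    "Thin wrapper: policy network + exploration + tensor conversion."
    def __init__(self, model: nn.Module, epsilon: float = 0.1):
        self.model, self.epsilon = model, epsilon

    @torch.no_grad()
    def pick_action(self, observation) -> int:
        # Convert the raw environment observation into a batched tensor.
        obs = torch.as_tensor(observation, dtype=torch.float32).unsqueeze(0)
        q_values = self.model(obs)
        # Epsilon-greedy exploration lives here, not inside the nn.Module.
        if torch.rand(1).item() < self.epsilon:
            return int(torch.randint(q_values.shape[-1], (1,)).item())
        return int(q_values.argmax(dim=-1).item())
```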

Current Repo RAM Issues

It also doesn't handle RAM well. One of the primary goals was to let a student/researcher reduce turnaround time by running RL projects in a notebook and easily getting logs/videos. The issue the current repo has is that it doesn't delete/filter logs well enough.

Possibly adding TensorBoard as a permanent dependency could solve this; however, I have also found that OpenAI has a Monitor wrapper that saves logs to disk, avoiding the RAM issues altogether. Ultimately, I think storing log-like data in RAM is a bad idea, since it just adds another place for a memory leak to happen.
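For reference, a minimal sketch of that Monitor wrapper (gym's API at the time, roughly gym <= 0.20; later releases split it into RecordVideo and RecordEpisodeStatistics):

```python
import gym
from gym.wrappers import Monitor

# Episode stats and videos are written to disk instead of accumulating
# in RAM; force=True clears logs left over from a previous run.
env = Monitor(gym.make('CartPole-v0'), directory='/tmp/cartpole-logs', force=True)

obs = env.reset()  # old gym API: reset() returns only the observation
done = False
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())
env.close()
```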

Development Issues

Writing code and then switching to a test directory to see if that code runs is extremely tedious and slow. I also found that if I wanted a documentation website, this would be even slower. fastai2 is currently being developed using nbdev, and I have so far had a great experience with it. I am also integrating fastcore into fastrl, hopefully making the eventual transition to fastai2 less painful.
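For context, nbdev (v1 conventions) keeps the library code, its tests, and its docs in a single notebook. A hypothetical exported cell might look like the following; `nbdev_build_lib` exports the function into the installable package, and `nbdev_test_nbs` runs the asserts as tests:

```python
#export
def discounted_returns(rewards, gamma=0.99):
    "Compute the discounted return at each step of `rewards`."
    out, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return list(reversed(out))

# A plain assert in the next cell doubles as the unit test.
assert discounted_returns([0, 0, 1], gamma=0.5) == [0.25, 0.5, 1.0]
```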

Near Term Goals

I am starting by getting A3C working, along with SAC. I plan to work backwards in terms of model complexity (A3C, DDPG, D4PG, A2C, TRPO, PPO -> DQNs, REINFORCE). Hopefully that will make the architecture more robust to different models.

As the repo stands, I am at the stage of training A3C. I can execute learn.fit(10, lr=0.01, wd=1) and it will train as far as I have A3C implemented.

Communication

I could make a Discord channel if it makes communication easier.

I am happy to add documentation if you would like to contribute. Fair warning: I will likely be changing a lot of baseline code and breaking things until merging into https://github.com/josiahls/fast-reinforcement-learning

OtwellResearch commented 4 years ago

Thanks, it sounds like you're on an intelligent path. I'm going to do some background work to get up to speed with fastai2 before a deep dive into your existing frl2, so publish on your own schedule. I have been going back and forth on whether to deep-dive into TensorFlow 2 agents as my starting point, just to have an active community to work with, but I greatly prefer Howard's approach in PyTorch - so given your progress and plans, I think I will pursue this for a bit and see how it pans out.

A Discord channel would be great - but no rush, since I have a fair bit of homework to do.

I'm quite impressed with your comments above focusing on actual engineering and performance issues. As a teaser on my background: I built my first neural network from scratch in Smalltalk back in 1990 or so. The thinking then was that we should focus on modeling biological systems using "state of the art" object-oriented design, to make it easy to radically redesign our models on the fly - which was a lot of fun - but the performance was abysmal. I just thought, with Moore's law and all... let the hardware catch up. I've learned a lot about reality since then, especially about what can be done today with relatively simple models given enough speed and data.

Now my thinking on performance is more in line with the data-flow thinking driven by game-design needs. If there's any software paradigm that pushes performance boundaries to their limits, it's triple-A gaming, and I have huge respect for the talent and the lessons learned by those developers. If I haven't totally lost you, you might want to take a look at https://youtu.be/g1TsP60z2OQ or https://www.youtube.com/watch?v=TH9VCN6UkyQ. I don't think we're ready to throw out Python and start over in deep learning with a new language, but that kind of thinking can influence design decisions if we consider it early in development, the way you are doing. CUDA is doing a lot of this for us, of course, but we can't let our abstractions fool us into thinking the rest of the system doesn't need that level of attention.

Cheers, for now. Ken

josiahls commented 4 years ago

I agree about the limitations of Python. Definitely do what you think is best for your goals. At least for fastai, they seem to be looking at Swift as a possible replacement (https://www.fast.ai/2019/03/06/fastai-swift/), which, based on my limited experience with Swift, has been pretty good. It is interesting to note the desire for a design methodology other than OOP in the second video you linked. Swift seems to favor protocol-oriented programming (POP); I'm not completely sure how that differs from DOD, or honestly from smartly implemented OOP. Swift is also supported by TF2, which I find interesting.

If fastai completely shifts to Swift (or some future language), I'll be perfectly happy to make sure fastrl shifts with it. (I'd be a little sad leaving Python :( ) I hope to have a strong enough understanding of existing RL (and HRL) algorithms in the next 1-2 years to develop on the cutting-edge stuff that hasn't even fully made it to PyTorch or TensorFlow (research by https://numenta.com/ is of interest, for example, but it's in C++).

I'll make a Discord soon, maybe after A3C and SAC at least. I'm happy to continue the convo there :)

OtwellResearch commented 4 years ago

I've programmed professionally in FORTRAN, Lisp, Smalltalk, PHP, Visual Basic, C, C#, Java, JavaScript, TypeScript, and now Python... so moving to Swift is just a blip if that becomes necessary. Smalltalk will always be my favorite from a language standpoint, but its version and build control were abysmal. Anyway... I'm a little slow getting up to speed here because of some personal issues this last week, but hopefully that's passed.

My goal is to integrate the different DL technologies into a more general AI capability, but we need absolutely rock-solid DL "primitives" like Fastai provides to make that feasible, and FastRL is a major missing piece.

OtwellResearch commented 4 years ago

BTW, thanks so much for the Numenta link - I just watched their Microsoft presentation. I think they're spot on in their approach. I did some work in coarse-coding analysis back in the '80s, using different numbers of nodes to represent concepts, with the dimensions being the number of distinct concepts versus the number that can be represented simultaneously. If you set the cross-talk threshold between concepts at 1 percent or so, you can get a very large number of concepts simultaneously active in memory. Of course, the maximum number of distinct concepts comes from using every node in each representation, but then only one concept can be active at a time; conversely, the maximum number that can be unambiguously represented in parallel comes from using one node per concept, but then only n concepts can be represented at all, n being the number of nodes. Allowing some cross-talk between randomly selected subsets of various sizes gives you shared nodes, much like we find in word embeddings, and provides the best of both worlds. I only did a simple analysis and never implemented anything - I'm thrilled to see this kind of work being pursued now.
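To make that trade-off concrete, here is a rough illustrative sketch (hypothetical numbers, not from the original analysis):

```python
from math import comb

n = 1000  # total nodes available

# Dense coding: every node participates in every pattern.
# Enormous capacity (2^n patterns), but only one can be active at a time.
dense_distinct, dense_parallel = 2 ** n, 1

# Local (one-hot) coding: one node per concept.
# Only n distinct concepts, but all n can be active unambiguously.
local_distinct, local_parallel = n, n

# Sparse coding: each concept is a random k-node subset.
k = 10
sparse_distinct = comb(n, k)  # astronomically many distinct concepts
overlap_fraction = k / n      # expected pairwise cross-talk: ~1% of a concept's nodes
print(f"{sparse_distinct:.3g} concepts, ~{overlap_fraction:.0%} cross-talk")
```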

OtwellResearch commented 4 years ago

Looking deeper into Swift for TensorFlow and how well it's integrated with Python - I'm leaning toward skipping fastai2... or perhaps wrapping the naughty bits in Swift using the native Python import. Hmmm...

Edit: it seems like Swift for TensorFlow development has stalled and Jeremy isn't even working on it now. Oh well. P.S., I'm "Jumonji" on the fastai Discord.

josiahls commented 4 years ago

Sorry for the late response.

I agree! I like their brain-first, theory-based approach! It's definitely cool that you have worked for a while on concept representation and on getting computers to make connections between concepts. I think this is going to be important for RL in general, as opposed to the purely model-free approach.

I checked your links on the Discord about S4TF (mine's jokellum). Pretty disappointing; however, I wasn't expecting a serious shift for a few years anyway. It seems that for now we are still searching for a language that can provide low-level code with a high-level API :/ . I still like Swift as a candidate language, but for now Python is still king here. I anticipate Python becoming a roadblock for fastrl in about two years, at least for me. After that, I would need to look into C++ and CUDA with Python bindings...

I would also want to wait for some nbdev alternative to be available for whatever language fastai transitions to, lol. I like nbdev too much to move back to traditional dev.

OtwellResearch commented 4 years ago

It looks like the Swift version is not moving as fast as I would have thought after Jeremy's endorsement last year - but it's still in active development. Also, the Julia discussion on Discord shows that progress there toward integrating fastai2 is much further along than I expected. For now, I'm just gonna punt and focus on the Python version while those two futures battle it out.

OtwellResearch commented 4 years ago

Any progress updating to fastai2?

josiahls commented 3 years ago

Hi, I hope your research is going well! Just adding a courtesy update here: I recently added DIAYN HRL. I plan to add DADS, and then I am going to do a massive refactor of fastrl 2 in February. fastrl 2 models train fast enough in terms of batches; however, the pure computation is extremely inefficient due to me trying things / adding code that "just makes things work enough".

This refactor's 2 goals are:

Speed-wise, I plan to use PyTorch dictionaries as fastrl 2.0's primary primitive. The OpenAI Spinning Up library also uses this approach.
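As a rough illustration of that primitive, here is a hypothetical sketch in the style of Spinning Up's buffers (not the actual fastrl 2.0 code): experience lives in a dict of preallocated tensors, and a batch is just a dict keyed by field name.

```python
import torch

class ReplayBuffer:
    "Transitions stored as a dict of tensors; batches are dicts too."
    def __init__(self, obs_dim: int, act_dim: int, size: int):
        self.data = {
            'obs':      torch.zeros(size, obs_dim),
            'act':      torch.zeros(size, act_dim),
            'rew':      torch.zeros(size),
            'next_obs': torch.zeros(size, obs_dim),
            'done':     torch.zeros(size),
        }
        self.ptr, self.size, self.max_size = 0, 0, size

    def store(self, **transition):
        # transition keys must match self.data, e.g. obs=..., act=..., rew=...
        for k, v in transition.items():
            self.data[k][self.ptr] = torch.as_tensor(v, dtype=torch.float32)
        self.ptr = (self.ptr + 1) % self.max_size
        self.size = min(self.size + 1, self.max_size)

    def sample(self, batch_size: int = 32) -> dict:
        idxs = torch.randint(0, self.size, (batch_size,))
        return {k: v[idxs] for k, v in self.data.items()}
```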