A helpful case study in support of args ...
Most of you (I'm sure) have seen Grant Sanderson's beautiful 3blue1brown YouTube channel. Grant impressively homebrewed the `manim` package that creates his stunning math videos.
Similar to here, `manim` started out using a `CONFIG` dict. On the positive side, the `CONFIG` dict cut down on lines in object `__init__` and encouraged people to spell out settings in one place. On the downside, the dicts required nearly re-coding a lot of Python features already handled by kwargs and attribute setting. Ultimately the community fork decided to kill `CONFIG` dicts in favour of args; the decision conversation is here: https://github.com/ManimCommunity/manim/pull/763
Grant Sanderson's own repo is also trying to remove them: https://github.com/3b1b/manim/pull/1932
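To make the trade-off concrete, here is a minimal sketch of what the move looks like (class names hypothetical, not manim's or RatInABox's actual API):

```python
# Hypothetical "before": settings live in a class-level CONFIG dict that is
# merged onto the instance by custom machinery at init time.
class PlaceCells:
    CONFIG = {"n": 10, "sigma": 0.2}

    def __init__(self, **kwargs):
        for key, value in {**self.CONFIG, **kwargs}.items():
            setattr(self, key, value)  # re-implements what kwargs already do


# "After": plain keyword arguments with defaults. The signature documents
# itself, and editors and type checkers understand it natively.
class PlaceCellsV2:
    def __init__(self, n: int = 10, sigma: float = 0.2):
        self.n = n
        self.sigma = sigma
```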
Replotting = slow.
... if RatInABox caches plot objects, I super recommend the scheme we chatted about: https://github.com/TomGeorge1234/RatInABox/issues/30#issuecomment-1486449726
The TaskEnvironment has a weak version of this feature: it doesn't replot everything and thus renders quickly. But it's pretty hacky, in my view, that the environment caches things about its agents and goals. In the long run it will be more maintainable to have each class in charge of caching its own plot objects, rather than having to change the master supervisor class's plotting every time the child classes change.
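For what it's worth, a minimal sketch of "each class caches its own plot objects" could look like this (names and structure hypothetical):

```python
# Hypothetical sketch: each class owns and reuses its matplotlib Artists,
# updating them in place instead of replotting from scratch every frame.
class Agent:
    def __init__(self, pos):
        self.pos = pos        # (x, y)
        self._scatter = None  # cached matplotlib Artist

    def plot(self, ax):
        if self._scatter is None or self._scatter.axes is not ax:
            self._scatter = ax.scatter(*self.pos)  # first draw: create + cache
        else:
            self._scatter.set_offsets([self.pos])  # later draws: cheap in-place update
        return self._scatter
```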
Especially worth doing for variables that are easy to annotate. Tools like `jedi` and the language-server-protocol offer better code completion for type-hinted variables.
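For example (a toy sketch, not RatInABox's real signatures):

```python
import numpy as np

class Agent:
    # With hints like these, jedi / LSP-based editors can complete
    # e.g. `agent.velocity.` with ndarray methods and catch type errors.
    pos: np.ndarray
    velocity: np.ndarray

    def update(self, dt: float = 0.01) -> None:
        ...
```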
Possible suggestion: each RiaB class could have a list of children (`environment.children = [agent, ...]`; `agent.children = [neuron, ...]`) to unify the way `.update()` and `.plot()`/`.render()` calls cascade down the hierarchy. It would be more uniform than each object having a different attribute name for its children.
No strong opinions on Jax. Leaning towards partial Jax if the penalty for binary ops / shuttling numpy arrays to a CPU `jax.device` is low.
Sounds like a great idea overall for the longevity of the package! I definitely agree with args instead of dicts, type hinting and unit testing. For the global environment update, if the cascading update is implemented, I would suggest having a kwarg like `cascade=True` to allow users to opt out when needed (a minimal sketch of this is below). No strong views on the other sections.
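Combining both suggestions, something like this minimal sketch (all names hypothetical):

```python
# Every RiaB object keeps a `children` list; update() cascades down the
# hierarchy unless the caller opts out with cascade=False.
class RiaBObject:
    def __init__(self):
        self.children = []  # e.g. environment.children = [agent, ...]

    def update(self, cascade=True):
        self._update_self()
        if cascade:
            for child in self.children:
                child.update(cascade=True)

    def _update_self(self):
        pass  # each subclass implements its own per-step state change
```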
I would suggest an additional section:
modularity: Many of the classes have very long methods that chain a lot of complex, separate computations together. When I've created new classes for my own use, e.g. new `Agent` classes, I've had to copy long sections of certain methods that I only needed to partially overwrite (for example, for computing an agent's velocity). This can create a lot of code duplication (I think there may already be some in the plotting methods). So I strongly recommend adding modularization to the list, i.e. extracting meaningful subparts of class methods into their own functions, perhaps aggregated into `agent_util.py`, `env_util.py` and `neuron_util.py`, or something like that (see the sketch below).
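As a hypothetical illustration of the kind of extraction meant here (the function, module name and update rule are all made up for the example):

```python
# agent_util.py (hypothetical): one meaningful subpart of Agent.update(),
# pulled out so subclasses can override just this step instead of copying
# the whole method.
import numpy as np

def compute_velocity_step(velocity, drift, dt, coherence_time):
    """Illustrative smooth-random-motion velocity update (not RiaB's exact rule)."""
    noise = np.random.normal(0.0, 1.0, size=np.shape(velocity))
    return velocity + (drift - velocity) * dt / coherence_time + noise * np.sqrt(dt)
```

A subclass wanting a different motion model could then override only this function rather than duplicating all of `update()`.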
Great comments, thanks guys. @SynapticSage 3B1B advice heeded! @colleenjg you're right, this could be more modular; for example, `Agent.update()` is pretty enormous. Breaking these down would make sense, so I'll look to do that. Don't expect this anytime soon, btw, so keep posting any new ideas here.
These all sound like great changes for RiaB 2.0, and I agree with all of the comments from @SynapticSage and @colleenjg :)
I'm a particularly big fan of the global environment updating, as this seems much more concise. My only concern is whether this would slow down updates for really long simulations (like the ones I have been running, e.g. 30 Hz × 31 sessions × 40 min/session). Might it be ideal to perform selective updates, skipping those that are going to be static, using some sort of argument in `update()`? One possible shape for this is sketched below.
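For instance, a rough sketch of how static components could be skipped (argument and flag names hypothetical):

```python
class RiaBObject:
    def __init__(self):
        self.children = []
        self.is_static = False  # e.g. True for a fixed Environment

    def update(self, skip_static=True):
        for child in self.children:
            if skip_static and child.is_static:
                continue  # this child never changes, so skip its per-step update
            child.update(skip_static=skip_static)
```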
As for Jax compatibility, I would very much be in favour if it can actually speed up the heavier computations and long simulations, but as you point out it might not save compute time if large arrays are being converted often. I believe it would be worth some case testing in a couple of large simulations before ruling this out.
Thanks for the feedback, closing for now.
One thing that just occurred to me, which could be considered: only passing `ax` to the plotting functions, not `fig`. In typical use cases, to my knowledge, passing both is redundant, as you can access the figure with `ax.figure` (or `ax.ravel()[0].figure` in cases where `ax` is an array).
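i.e., something like this sketch (not the current signature):

```python
import matplotlib.pyplot as plt

def plot_environment(ax=None):
    if ax is None:
        _, ax = plt.subplots()
    fig = ax.figure  # the figure is always reachable from the axes
    # ... draw onto ax ...
    return fig, ax
```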
Agreed and added to the list. It's essentially redundant and only adds bloat.
If you add `jax` support, would it be possible to do it through an optional extra for opportunistic speed-ups, rather than as a mandatory dependency?
As the primary maintainer of the Fedora Linux package for this project, I'm not sure whether packaging https://github.com/google/jax would be feasible for us. While it does look like `jax` can be built without support for the proprietary CUDA SDK, it's still a pretty gnarly stack when taken together with https://github.com/openxla/xla, and I'm not sure whether an attempt to package it would end up hitting a hard requirement on something nonfree.
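One common pattern for this kind of optional extra is a guarded import with a numpy fallback; a sketch follows (the `ratinabox[jax]` extra name is made up):

```python
# Use jax.numpy when available, plain numpy otherwise; callers are agnostic.
try:
    import jax.numpy as xp  # e.g. installed via `pip install ratinabox[jax]`
    HAS_JAX = True
except ImportError:
    import numpy as xp
    HAS_JAX = False

def gaussian(x, sigma):
    # Identical code path on both backends; speed-ups are opportunistic.
    return xp.exp(-(x ** 2) / (2 * sigma ** 2))
```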
@musicinmybrain thanks for your feedback. That's OK, I doubt we'd go full `jax`; in fact, I'm now leaning towards no `jax` at all. After some preliminary testing, it seems that getting significant speed-ups would be difficult, as most of the heavy computations are already vectorised.
I've begun to think about 2.0. The reason is that there are certainly a couple of choices I made early in development which weren't optimal. Now could be a good time to fix these, as the community is growing but is still small enough that it won't be super disruptive. Fixing them will also make it easier to maintain RiaB in the long run.
I'm opening this issue to get community thoughts. @SynapticSage @colleenjg @jquinnlee @mehulrastogi you're some of the most active users I know fairly well, so I'm tagging you to get your input (if you have any), but anyone can chip in here. Here are my thoughts:
Essential and backwards incompatible changes (do first):

- Stop keeping all `Neurons` classes in one `.py` file.
- `update()`: given that `Environments` now know about their `Agents` and `Agents` know about their `Neurons`, we could have just one update function in `Env` which cascades through everything else. Cleaner?
- `dev` --> `main`.
- `Environment` stores the global clock. This just makes sense imo.
- Replace the `drift_velocity` kwarg. Maybe instead `Agent`s can have a `policy()` method which returns a drift; this would default to the random motion policy, unifying that too. Just something to consider.

Other essential changes:

- Break down `update()`, perhaps moving subparts into new agent/neuron/env-specific utils scripts.
- `Env.history` dictionary. Then, when plotting / animating the environment, we can pass in a time argument and the correct state can be retrieved and plotted. The state of the environment only appends to history whenever it changes (e.g. a setter is called). See the sketch after this list.
- `plot_environment()`: it can be passed a `fig`, an `ax` and a new object which is a list/dict of plot objects, `R`, which are all `matplotlib.Artist`s already existing on the figure. The environment can store an equivalent list of plot objects, and whenever this changes (e.g. a wall is added or an object is moved) the change is logged. Plotting can then (i) get the list of plot objects corresponding to the correct time and (ii) compare it to the passed list; if they aren't equal then replot the env, otherwise don't bother. Something like that.
- `Environment`s have an `Env.history` dictionary storing the full "state" of the environment (all object locations, walls, boundaries, etc.). Then `Env.plot_environment()` takes a time argument, finds the state of the environment at that time and plots that.
- Pass only `ax`, not `fig`, to figure plotting functions. This may throw up some things but likely minor.
- Split `utils.py` into separate ones for the `Agent` package, `Neurons` package and `Env` package, and maybe also a `misc`.
- Move the repo: `RatInABox/RatInABox`, not `TomGeorge1234/RatInABox`.
- `RatInABox/RatInABox_RL` package containing all the RL stuff (`Actor`, `Critic`, `ValueNeuron`, `TDError`, `TaskEnv` etc.).
- `IntermediateNeurons` subclass for neurons which aren't "fundamental" but take other neurons as inputs. Current examples are `FeedForwardLayer` and `NeuralNetworkNeurons`.
- `DynamicNeurons` subclass for neurons which aren't static, i.e. you can't call `Neurons.plot_rate_map()` because they actually depend on past history. Examples include `TDErrorNeurons` (to be made) or anything with recurrency.
- `SmoothRandomFeatureNeurons`: just some spatially tuned but random neurons. Users provide only a length scale. Would be useful for a lot of feature-learning studies. Probably something like a Gaussian process underlying these neurons.

Things to consider:

- `Neurons` should follow the `torch.nn.Module` API. This would make more efficient the evaluation of complex feedforward graphs, which currently happens in a backwards manner. It might require renaming the `.get_state()` method to `.forward()`. Need to think more about this.
- `np` --> `jnp` everywhere.

I'm not a software guy, so @SynapticSage @mehulrastogi feel free to give high-level comments about the best way to go forward.
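(A sketch of the `Env.history` idea above; the implementation details here are hypothetical.)

```python
import bisect

class Environment:
    def __init__(self):
        self.t = 0.0
        self.history = {"t": [], "state": []}

    def _log_change(self, state):
        # Called from setters (add_wall(), move_object(), ...): history only
        # grows when the environment actually changes.
        self.history["t"].append(self.t)
        self.history["state"].append(state)

    def state_at(self, t):
        # Most recent logged state at or before time t.
        if not self.history["t"]:
            return None
        i = bisect.bisect_right(self.history["t"], t) - 1
        return self.history["state"][max(i, 0)]
```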