BaseAgent (and Learner / helper methods)

Discussion: We have a base agent that defines the unified set of fields and methods every agent will need.

Goals:

[x] pick_action seems too simple. During optimization of DQN and other agents, we may also need to pick actions, but without exploration and with gradients enabled. Should this be split into separate methods?
[x] At present, AgentLearner and BaseAgent seem a little similar. We will soon need to decide what belongs in BaseAgent versus AgentLearner. For example, BaseAgent has a field for a DataBunch. Should it be there, or should it live only in AgentLearner?
[x] I would like this to subclass the actual Learner from fastai. Maybe start thinking about the shortfalls of doing this and what is in our way.
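One possible shape for the pick_action split raised in the first goal is sketched below. This is only an illustration, not the repository's code: `SketchAgent`, `greedy_action`, `q_fn`, and `epsilon` are hypothetical names, and the sketch uses plain Python lists instead of tensors. In a PyTorch-based agent, the greedy path would presumably be the one called during optimization with gradients left enabled, while the exploring path would be used only when stepping the environment.

```python
import random
from typing import Callable, Optional, Sequence


class SketchAgent:
    """Hypothetical agent used for illustration; not the repository's BaseAgent."""

    def __init__(self, q_fn: Callable[[object], Sequence[float]],
                 n_actions: int, epsilon: float = 0.1,
                 seed: Optional[int] = None):
        self.q_fn = q_fn            # assumed: maps a state to one value per action
        self.n_actions = n_actions
        self.epsilon = epsilon      # probability of taking a random action
        self.rng = random.Random(seed)

    def greedy_action(self, state) -> int:
        """Optimization-facing choice: argmax of the action values, no exploration."""
        q_values = self.q_fn(state)
        return max(range(self.n_actions), key=lambda a: q_values[a])

    def pick_action(self, state) -> int:
        """Environment-facing choice: epsilon-greedy layered on greedy_action."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.greedy_action(state)
```

Keeping exploration in a thin wrapper means the optimization step can call `greedy_action` directly and never has to disable exploration through a flag.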

josiahls / fast-reinforcement-learning