Refactor Agent to Include Memory

Proposal:

Agent should be a class
nn.Module instances should be tools used by the agent, not the basis for the agent
Memory (trajectory buffer) should live inside the agent as part of its "memory module"
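
A minimal sketch of what this could look like, not the project's actual code. The names `Transition`, `TrajectoryBuffer`, `act`, and `observe` are hypothetical, and the policy is any callable here so the example runs without PyTorch; in the project it would presumably be an nn.Module.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque, List


@dataclass
class Transition:
    """One step of experience (hypothetical layout)."""
    observation: Any
    action: Any
    reward: float


class TrajectoryBuffer:
    """Bounded buffer of transitions; the oldest entries are dropped first."""

    def __init__(self, max_size: int = 1000) -> None:
        self._steps: Deque[Transition] = deque(maxlen=max_size)

    def store(self, transition: Transition) -> None:
        self._steps.append(transition)

    def rewards(self) -> List[float]:
        return [t.reward for t in self._steps]

    def __len__(self) -> int:
        return len(self._steps)


class Agent:
    """The agent owns its memory and uses the policy as a tool."""

    def __init__(self, policy: Callable[[Any], Any], max_memory: int = 1000) -> None:
        self.policy = policy  # assumption: an nn.Module in the real project
        self.memory = TrajectoryBuffer(max_memory)

    def act(self, observation: Any) -> Any:
        return self.policy(observation)

    def observe(self, observation: Any, action: Any, reward: float) -> None:
        # Experience is stored inside the agent rather than in an external buffer.
        self.memory.store(Transition(observation, action, reward))
```

With this layout the training loop only calls `agent.act(...)` and `agent.observe(...)`, and anything inside the agent can read `agent.memory` directly.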

Pros:

Agent can store intermediate results in memory to avoid recomputing
Algorithm-based planning can be added more easily (e.g. tree search algorithms)
Agent has direct access to its memory

Cons:

Breaks the traditional reinforcement learning setup, in which the memory is stored outside the policy
Significant refactoring (good thing we have tests!)

I think this refactoring will benefit the project in the long run. It will also make it easier to implement ICM (#14) because we will have access to the intermediate results of the visual processing module, which are needed to extract visual features.
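
As a rough illustration of that access pattern, the sketch below keeps the visual module's latest output in the agent's memory so another module can reuse it without re-running the encoder. `FeatureCachingAgent`, `visual_encoder`, `policy_head`, and `latest_visual_features` are hypothetical names, and plain callables stand in for the real modules.

```python
from typing import Callable, Dict, List

Features = List[float]


class FeatureCachingAgent:
    """Stores the visual module's output in memory for other modules to read."""

    def __init__(
        self,
        visual_encoder: Callable[[List[int]], Features],
        policy_head: Callable[[Features], int],
    ) -> None:
        self.visual_encoder = visual_encoder
        self.policy_head = policy_head
        self.memory: Dict[str, Features] = {}
        self.encoder_calls = 0  # only here to show the encoder runs once per step

    def act(self, frame: List[int]) -> int:
        features = self.visual_encoder(frame)
        self.encoder_calls += 1
        # Cache the intermediate result so it does not need to be recomputed.
        self.memory["visual_features"] = features
        return self.policy_head(features)

    def latest_visual_features(self) -> Features:
        # A curiosity-style module could read this instead of re-encoding the frame.
        return self.memory["visual_features"]
```

After one call to `act`, a second consumer can call `latest_visual_features()` and `encoder_calls` stays at 1.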

Basically, the idea is for every module of the agent to be able to access every other module. This was almost the case already; the only exception was the "memory", which was separate.

thomashopkins32 / Minecraft-Virtual-Intelligence

Refactor Agent to Include Memory #18