lifelong-learning-systems / rlblocks

Reinforcement Learning Blocks for Researchers
MIT License

Implement experiment with DQN and DQN with EWC #9

Closed · cash closed this issue 2 years ago

ginoperrotta commented 2 years ago

@coreylowman EWC as implemented just extends the list of anchors and importance values each time new anchors are added. To be usable with DQN, this needs to be finished by:

  1. Tagging the anchors with a task ID (maybe store a dict of {name: (values, weights)}). When anchors are added for a task that already has an entry, replace the old values.
  2. Making the loss function compute the penalty over all anchors except the current task's.
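A minimal sketch of the two steps above (the class and method names are illustrative, not the actual rlblocks API): a dict keyed by task ID so re-adding anchors for a task replaces its old entry, and a penalty that skips the current task's anchors.

```python
class TaskTaggedEWC:
    """Hypothetical EWC store with per-task anchors, as proposed above."""

    def __init__(self, ewc_lambda: float = 1.0):
        self.ewc_lambda = ewc_lambda
        # {task_id: (anchor_values, importance_weights)}
        self.anchors: dict = {}

    def add_anchors(self, task_id, values, weights):
        # Step 1: tag anchors by task ID; replace old values if present.
        self.anchors[task_id] = (list(values), list(weights))

    def loss(self, params, current_task_id):
        # Step 2: penalize drift from every task's anchors *except*
        # the current task's.
        total = 0.0
        for task_id, (values, weights) in self.anchors.items():
            if task_id == current_task_id:
                continue
            total += sum(
                w * (p - v) ** 2
                for p, v, w in zip(params, values, weights)
            )
        return self.ewc_lambda * total
```

With real networks, `params`, `values`, and `weights` would be tensors (parameters, their snapshots, and Fisher-information importances); plain lists keep the sketch self-contained.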

Then add the EWC loss term to the agent's loss function and use tella's task_end callback (or task_variant_end?) to add anchors.
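The callback wiring could look roughly like this. Everything here is a hedged sketch: the agent class, `_estimate_anchors`, and `total_loss` are hypothetical, and the `task_start`/`task_end` hooks only mirror the shape of tella's callbacks, not its exact signatures.

```python
class EWCDQNAgent:
    """Sketch of a DQN agent that consolidates EWC anchors per task."""

    def __init__(self, ewc_lambda: float = 1.0):
        self.ewc_lambda = ewc_lambda
        self.anchors: dict = {}  # {task_id: (values, weights)}
        self.current_task = None

    def task_start(self, task_name):
        # Track which task is active so its own anchors are excluded.
        self.current_task = task_name

    def task_end(self, task_name):
        # Consolidate when a task finishes; re-running a task
        # replaces its old anchors instead of appending.
        self.anchors[task_name] = self._estimate_anchors()

    def _estimate_anchors(self):
        # Placeholder: real code would snapshot the Q-network's
        # parameters and estimate Fisher importances here.
        return [0.0, 0.0], [1.0, 1.0]

    def ewc_penalty(self, params):
        total = 0.0
        for task_id, (values, weights) in self.anchors.items():
            if task_id == self.current_task:
                continue
            total += sum(
                w * (p - v) ** 2
                for p, v, w in zip(params, values, weights)
            )
        return self.ewc_lambda * total

    def total_loss(self, td_loss, params):
        # DQN TD loss plus the EWC penalty over other tasks' anchors.
        return td_loss + self.ewc_penalty(params)
```

Whether consolidation belongs in task_end or task_variant_end depends on whether anchors should be updated once per task or once per task variant; the wiring is the same either way.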