Make TDAgent hyperparameters configurable

I don't know if I really like this change. It would work fine, so therefore I will approve, but we should maybe consider that it could give confusion as to what these parameters does in other agents, since they are not used by the agents

Hm yeah, I thought a bit about this as well, but decided to go with this approach for 2 reasons:

I thought it was more consistent with the way we handle inheritance for methods (that all players have the same methods). However I do see that an argument could be made to treat methods and attributes differently. I suppose it's a matter of taste, but I could see it be confusing as you stated. However, the only parameter not used by the other agents is Lambda, as alpha is simply a step size which could be passed to the optimiser for other agents.
I wasn't sure how to handle the **kwargs since they are being passed off the to Player parent class in the __init__. If you know how to do it and would prefer to keep Lambda only for the TDAgent, please push an update to this. you can push an update to this. Otherwise merge it.

... and also the gamma means different things for TDAgent and DirectPolicyAgents.

Are you sure that gamma has a different interpretation between the TDAgents and DirectPolicyAgent? I know that it is used for different computations, but it is just a discounting factor in both cases, right, just applied in different ways? Or am I wrong in this interpretation?

RasmusBrostroem / ConnectFourRL

Make TDAgent hyperparameters configurable #108