Video: Start, middle and end of VPG (for illustration), then the evaluation video for each agent
1 Introduction: @etiennedemers
2 Methods: @etiennedemers @j-c-carr
Introductory sentences: for each of these algorithms we ran a hyperparameter grid search, so put everything they have in common here
2.1 VPG: Explain the details of the algorithm and its hyperparameters (less detail for the ones we saw in class), include OpenAI links, show more of the motivation, and explain OpenAI's implementation details as well
2.4 TD3: Explain theory a bit more since we didn't cover it
2.5 SAC: Theory + variant
3 Results:
3.1 and 3.2 @sabina-elkins
Plots: train-best-agent plots for all 5 agents, plus cumulative reward; run train_best_agent.py
3.3 Adaptability @etiennedemers
Moral support: @cesare-spinoso