kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".
https://slm-lab.gitbook.io/slm-lab/
MIT License
1.24k stars, 264 forks

Addition of the cross entropy method #446

Closed ingambe closed 4 years ago

ingambe commented 4 years ago

Addition of the cross-entropy method

This is more of an exercise for me to get used to the lab than a very useful algorithm, but it may still be interesting for some, I guess. To implement this, I have defined a new on-policy memory class called OnPolicyCrossEntropy, which inherits from OnPolicyReplay.

Experiment Title

Abstract

Small experiment on the CartPole environment (CartPole-v0). As for the REINFORCE baseline spec, we allow 100000 frames, 4 sessions and 1 trial. The training frequency is set to 16 and the cross_entropy coefficient to 0.5.

Methodology

REINFORCE algorithm with Cross-Entropy On-Policy
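To illustrate the idea, here is a minimal standalone sketch of cross-entropy-style episode filtering: only the highest-return episodes are kept for the policy update. It does not use SLM-Lab's API and is not the code from this PR. The function name `select_elite_episodes`, the episode dictionary layout, and the reading of the 0.5 coefficient as "fraction of episodes kept" are all assumptions made for illustration.

```python
from typing import Dict, List


def select_elite_episodes(episodes: List[Dict], keep_fraction: float = 0.5) -> List[Dict]:
    """Keep the top `keep_fraction` of episodes ranked by total reward.

    Hypothetical helper: each episode is assumed to be a dict with a
    "rewards" list. The 0.5 default mirrors the coefficient quoted in the
    abstract, under the assumption that it is an elite fraction.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not episodes:
        return []
    # Total (undiscounted) return of each episode.
    returns = [sum(ep["rewards"]) for ep in episodes]
    # Always keep at least one episode so training has data.
    n_keep = max(1, int(len(episodes) * keep_fraction))
    # Indices of the best episodes, highest return first.
    ranked = sorted(range(len(episodes)), key=lambda i: returns[i], reverse=True)
    # Restore the original collection order among the kept episodes.
    kept = sorted(ranked[:n_keep])
    return [episodes[i] for i in kept]


batch = [
    {"id": 0, "rewards": [1, 1, 1]},
    {"id": 1, "rewards": [5, 5]},
    {"id": 2, "rewards": [1]},
    {"id": 3, "rewards": [3, 4]},
]
elite = select_elite_episodes(batch, keep_fraction=0.5)
print([ep["id"] for ep in elite])
```

A memory class built on this idea would apply such a filter to the collected episodes before handing the batch to the REINFORCE update.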

Reproduction

  1. spec file location: slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json
  2. git SHA : f133d4811486425ea1e06724a1e0ca5967396660

Run command: python run_lab.py slm_lab/spec/benchmark/reinforce/reinforce_cartpole.json reinforce_cross_entropy_cartpole train

Result and Discussion

Figures: reinforce_cross_entropy_cartpole_t0_trial_graph_mean_returns_ma_vs_frames (moving average of mean returns vs. frames) and reinforce_cross_entropy_cartpole_t0_trial_graph_mean_returns_vs_frames (mean returns vs. frames)

The result on CartPole is not that good (compared to REINFORCE alone).

Data zipfile url: reinforce_cross_entropy_cartpole_2020_02_14_171517.zip

ingambe commented 4 years ago

Hi @kengz, sorry for the delay. The cross-entropy method I refer to is the one described here: Cross-Entropy Method ("2- The Cross-Entropy Method for Optimization").
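For readers unfamiliar with the cross-entropy method for optimization, the following is a generic one-dimensional sketch: sample candidates from a Gaussian, keep the best fraction, refit the Gaussian to those elites, and repeat. It is not taken from the linked reference or from this PR. The function name, the Gaussian sampling distribution, and all default parameter values are assumptions chosen for illustration.

```python
import random
import statistics
from typing import Callable


def cross_entropy_minimize(
    objective: Callable[[float], float],
    mean: float = 0.0,
    std: float = 5.0,
    n_samples: int = 50,
    elite_fraction: float = 0.2,
    n_iterations: int = 30,
    seed: int = 0,
) -> float:
    """Minimize a 1-D objective with a Gaussian cross-entropy method.

    Hypothetical helper for illustration; every default is an assumption.
    """
    rng = random.Random(seed)  # local RNG so runs are reproducible
    n_elite = max(2, int(n_samples * elite_fraction))
    for _ in range(n_iterations):
        # 1. Sample candidates from the current distribution.
        candidates = [rng.gauss(mean, std) for _ in range(n_samples)]
        # 2. Keep the candidates with the lowest objective values.
        elites = sorted(candidates, key=objective)[:n_elite]
        # 3. Refit the distribution to the elite set.
        mean = statistics.fmean(elites)
        # A small floor keeps the standard deviation from collapsing to zero.
        std = max(statistics.pstdev(elites), 1e-6)
    return mean


best = cross_entropy_minimize(lambda x: (x - 3.0) ** 2)
print(f"estimated minimizer: {best:.3f}")
```

In a reinforcement learning setting, the same loop is typically read with sampled episodes as candidates and episode return as the score, so that refitting to the elites corresponds to training the policy on the best episodes only.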