theRealSuperMario closed this 6 years ago
Sounds good. You'll need to show me how to do that.
Although I'm usually all for unconditional love, I guess that would be utopian here.
On Wed, Sep 5, 2018, 16:59 Leander Kurscheidt notifications@github.com wrote:
@LeanderK commented on this pull request.
In ReinforcementLearning.tex https://github.com/ML-KA/PDG-Compendium/pull/2#discussion_r215306249:
+\begin{algorithm}[H]
+  \For{iteration=1,2,\dots}{
+    \For{actor=1,2,\dots}{
+      Run policy $\pi_{\theta_{\mathrm{old}}}$ in environment for $T$ timesteps\\
+      Compute advantage estimates $\hat{A}_1, \dots, \hat{A}_T$
+    }
+    Optimize surrogate $L$ wrt $\theta$, with $K$ epochs and minibatch size $M \leq NT$\\
+    $\theta_{\mathrm{old}} \leftarrow \theta$
+  }
+  \caption{PPO, Actor-Critic Style}
+\end{algorithm}
+\paragraph{Questions}
+\begin{enumerate}
Why not use a conditional? Then we can keep the questions and the definitions together and just disable the questions when needed.
As discussed: \toggletrue{questions}
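For anyone reading along, here is a minimal sketch of how such a toggle could be wired up, assuming the `etoolbox` package supplies `\toggletrue`. Only the toggle name `questions` comes from this thread. The document skeleton and the placeholder text are hypothetical and not taken from the PR.

```latex
\documentclass{article}
\usepackage{etoolbox} % provides \newtoggle, \toggletrue, \togglefalse, \iftoggle

\newtoggle{questions}   % declare the toggle; it starts out false
\toggletrue{questions}  % use \togglefalse{questions} to hide the questions

\begin{document}

\paragraph{PPO} Placeholder definition text. % hypothetical content

% The first branch is typeset only when the toggle is true;
% the empty second branch means nothing is printed otherwise.
\iftoggle{questions}{%
  \paragraph{Questions}
  \begin{enumerate}
    \item Placeholder question. % hypothetical content
  \end{enumerate}
}{}

\end{document}
```

This keeps each question block next to the definition it belongs to, and a single line in the preamble switches all of them on or off.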
Can someone merge this, maybe? I added the conditional as requested.
Sorry for taking so long, I forgot about the PR. You can mention me via "@" (@theRealSuperMario); that way I get a notification.
stuff from last pdg