MorvanZhou/Reinforcement-learning-with-tensorflow
Simple reinforcement learning tutorials (莫烦Python, Chinese AI tutorials)
https://mofanpy.com/tutorials/machine-learning/reinforcement-learning/
MIT License · 8.91k stars · 5.01k forks
Issues
#167 PPO convergence · aliamiri1380 · opened 4 years ago · 0 comments
#166 How does PPO handle episodes of different lengths? · YingxiaoKong · opened 4 years ago · 0 comments
#165 DPPO is written completely wrong: the workers should push gradients, not samples · GIS-PuppetMaster · closed 4 years ago · 3 comments
#164 DDPG: Actor target network is garbage ---> sorry!! misunderstanding · hccho2 · closed 4 years ago · 0 comments
#163 Simple PPO.py · GIS-PuppetMaster · closed 4 years ago · 1 comment
#162 DDPG explores only a very small range · YingxiaoKong · opened 4 years ago · 4 comments
#161 Prioritized_Replay_DQN not working · alirezakazemipour · opened 4 years ago · 0 comments
#160 [revised] .ix to .loc AND [add] .astype() for TypeError: reduction operation .argmax() · ghost · closed 4 years ago · 0 comments
#159 How to constrain the output actions to be non-negative? · YingxiaoKong · closed 4 years ago · 1 comment
#158 DDPG does not converge; cannot reproduce the results from the video · zhangbo2008 · closed 4 years ago · 1 comment
#157 Doesn't the DDPG action need to be normalized? · YingxiaoKong · closed 4 years ago · 1 comment
#156 Fairness question about get_env_feedback in chapter 1 · fengchaohao · opened 4 years ago · 0 comments
#155 Questions about the 5.2_Prioritized_Replay_DQN · seey0u · opened 4 years ago · 0 comments
#154 On constructing the probability density for the actor's multi-dimensional continuous action values · sasforce · opened 4 years ago · 2 comments
#153 v_s_ = 0, when the last step is terminal. · hyc6668378 · closed 5 years ago · 0 comments
#152 Sarsa agent ends up moving only near the start point · virdel · closed 5 years ago · 0 comments
#151 Inverted the meaning of epsilon in the Q-Learning algorithm · douglasrizzo · closed 5 years ago · 1 comment
#150 Mac: how to use tkinter · ZhuYun97 · closed 5 years ago · 1 comment
#149 ValueError: invalid literal for int() with base 10: 'None' when running 'env.render()' · xingyueye · opened 5 years ago · 0 comments
#148 PPO: Multiply Mu * 2? · lhorus · opened 5 years ago · 1 comment
#147 PPO and Reward · yangtianyong · opened 5 years ago · 0 comments
#146 Added RND_PPO.py, RND with PPO (solves MountainCarContinuous-v0). · ChuaCheowHuan · closed 5 years ago · 1 comment
#145 ModuleNotFoundError: No module named 'vnpy.api.ctp.vnctpmd' · xinjiyuan97 · closed 5 years ago · 0 comments
#144 A tf.keras rewrite of DDPG · QiuChenFeng · opened 5 years ago · 1 comment
#143 Training does not converge after downloading the code; what could be the problem? · niniuba123456 · opened 5 years ago · 0 comments
#142 bug_issue: in A3C, the done returned by the environment's step() is overwritten by the check on the following line. · hyc6668378 · opened 5 years ago · 0 comments
#141 States in the Environment. · Kalpan13 · opened 5 years ago · 1 comment
#140 Why does simply_PPO use pi instead of old_pi when interacting with the environment? · Qiyangcao · opened 5 years ago · 3 comments
#139 Why aren't the target net weights locked in Dueling DQN? · Qiyangcao · closed 5 years ago · 0 comments
#138 discrete_DPPO: multi-threaded env.render() display · wangyubin112 · closed 5 years ago · 0 comments
#137 How to solve the problem of action == NaN in PPO? · niu0717 · opened 5 years ago · 2 comments
#136 Fixed a 'runs slowly gradually' problem · Gaoee · closed 5 years ago · 1 comment
#135 Question about tf.GraphKeys · WillysMa · opened 5 years ago · 0 comments
#134 Does Morvan have a tutorial on writing a simulator, i.e. the environment? · DaDaDoDoLee · opened 5 years ago · 1 comment
#133 What exactly does the reward r refer to here? How should the code be modified for one's own problem to obtain r? · liudading · opened 5 years ago · 1 comment
#132 How to plot reward against training episodes? · Curry30h · opened 5 years ago · 1 comment
#131 Save and Reuse of DDPG model · lyjge · opened 5 years ago · 1 comment
#130 How to print Actor and Critic Loss in DDPG update 2? · ghost · closed 5 years ago · 0 comments
#129 Is update_oldpi_op in simply_PPO wrong? · janyChan · closed 5 years ago · 0 comments
#128 What is the rationale for centralizing the discounted reward in REINFORCE? · ZefanW · opened 5 years ago · 0 comments
#127 About Atari · Precola · opened 5 years ago · 3 comments
#126 Found a bug in DDPG.py · jiangyuzhao · closed 4 years ago · 1 comment
#125 Fix a bug in DDPG.py. · jiangyuzhao · closed 4 years ago · 9 comments
#124 Fix a bug in DDPG.py. · jiangyuzhao · closed 5 years ago · 0 comments
#123 On the practical use of PPO · janyChan · opened 5 years ago · 0 comments
#122 Hi, several errors in your simply_ppo code: · clicdl · closed 5 years ago · 1 comment
#121 Questions about A3C · icesit · opened 5 years ago · 3 comments
#120 What is ReUse for? (DDPG) · ghost · closed 5 years ago · 0 comments
#119 game · LexieeWei · closed 5 years ago · 0 comments
#118 Save the model · afcentry · closed 5 years ago · 1 comment