-
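In A3C_continuous_action.py, every thread can update the global network parameters. If two threads update the global network parameters at the same time, or one thread is updating them while another is reading them, could that cause a problem? The code does not appear to have any mechanism to prevent this.
The code uses COORD.should_stop() but never calls COORD.request_stop(). Can this COORD.should_stop() be removed…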
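A minimal sketch of one way to rule out the race described above, assuming the shared parameters are plain Python state guarded by a `threading.Lock`. The names `global_params`, `params_lock`, `push_gradients`, and `pull_params` are hypothetical and do not come from A3C_continuous_action.py. Note that A3C as originally described applies its asynchronous updates without locking on purpose, so a lock like this is a design choice that trades throughput for consistency, not necessarily a required fix.

```python
import threading

# Hypothetical stand-in for the global network parameters
# (illustration only, not taken from A3C_continuous_action.py).
global_params = [0.0, 0.0, 0.0]
params_lock = threading.Lock()


def push_gradients(grads, lr=0.1):
    """Apply one worker's gradients to the shared parameters under the lock."""
    with params_lock:
        for i, g in enumerate(grads):
            global_params[i] -= lr * g


def pull_params():
    """Return a snapshot that cannot interleave with a concurrent update."""
    with params_lock:
        return list(global_params)


def worker(n_updates):
    """Simulate a worker that alternates between pushing and pulling."""
    for _ in range(n_updates):
        push_gradients([1.0, 1.0, 1.0])
        pull_params()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(pull_params())
```

With the lock in place, every reader sees either the parameters before an update or after it, never a partially updated list.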
-
Congratulations on this work!
I'm just a beginner in the field of reinforcement learning and I have some basic questions.
I'm trying to implement the "a2c_gym.py" function in the Carla environment on…
-
Hi,
I have a question about the plots presented in `Ch8`, in the section **"Training and testing the deep n-step advantage actor-critic agent"** of the book.
The TensorBoard plots in this se…
-
### Describe the problem
We should publish results for at least a few of the standard Atari games on all applicable algorithms, and fix any discrepancies, e.g. https://github.com/ray-project/ray/issu…
ericl updated
6 years ago
-
### Describe the bug
A clear and concise description of what the bug is.
### Reproduction Script
Provide a script to reproduce the error using the following template,
replacing `` with your…
-
Hi ppwwyyxx,
I really appreciate your great work on tensorpack! I like how well your code is structured and its outstanding performance. By any chance, do you have a PyTorch version of tensorpack? I a…
-
System:
Ubuntu 16.04 64-bit
Python 2.7
pytorch 0.2.0_1
When I run code that works correctly with `pytorch 0.1.12`, it freezes at `x = F.relu(self.conv1(x))` (To be more specific, it f…
-
Hi, I'm running the latest version of the master branch, c61edd4, locally.
I kept running into a `double free or corruption` error and after searching around found that by
installing libtcmalloc…
-
In loss_function, the line
```python
exp_v = m.log_prob(a) * td.detach()
```
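log_prob is [prob1, prob2, prob3]
td is [[value1, value2, value3]]
Multiplying these directly produces a two-dimensional matrix, but in A2C shouldn't the advantage A at each step be multiplied by the actor loss of that same step?
Should it be changed to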
```p…
-
On the latest AWS DL AMI, I get strange errors trying to install ray from source. Not sure how to debug them:
```
Replacing /home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages…