-
Hi! First and foremost, fantastic work! I'm trying to replicate the performance shown in the paper for the Q-Learning baselines locally; however, using the exact versions provided in your requirements…
-
First, I ran a command like this:
`python3 src/main.py --config=qmix_smac --env-config=sc2 with env_args.map_name=3m save_replay=True save_model=True`
and then I loaded the model by running:…
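The loading command is cut off above. For reference, a hedged sketch of what checkpoint loading typically looks like in pymarl, using the `checkpoint_path`, `evaluate`, and `save_replay` options from its default config (the `<run_dir>` placeholder is illustrative, not the poster's actual path):

```
python3 src/main.py --config=qmix_smac --env-config=sc2 with env_args.map_name=3m checkpoint_path=results/models/<run_dir> evaluate=True save_replay=True
```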
-
Hey everyone,
I'm trying to run Ape-X with tune.run() on Ray 1.3.0, and the trial status remains "PENDING". I get the same message indefinitely:
```
== Status ==
Memory usage on this node: 7.5/19.4 GiB
Using…
```
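For what it's worth, a trial that stays PENDING usually means its aggregate resource request (driver plus all Ape-X rollout workers) exceeds what the node offers. A minimal sketch of shrinking the request, assuming RLlib's APEX trainer and an illustrative environment:

```python
import ray
from ray import tune

ray.init()

# Ape-X defaults to many rollout workers (32 in RLlib 1.3.0); if the node
# has few CPUs, the combined CPU request can never be satisfied and the
# trial stays PENDING forever. Shrinking the request lets it schedule.
tune.run(
    "APEX",
    config={
        "env": "CartPole-v0",  # illustrative environment
        "num_workers": 2,      # keep <= available CPUs minus the driver
        "num_gpus": 0,         # request no GPU if none is available
    },
    stop={"training_iteration": 1},
)
```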
-
Hello,
Thanks for the tf2.x implementation of MADDPG. Appreciate your effort.
I was wondering if you know why PER MADDPG performs worse than vanilla MADDPG in terms of reward. I test…
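The excerpt is truncated, so this is only a guess at a common culprit: prioritized replay skews the sample distribution and needs importance-sampling correction, or it can underperform uniform replay. A minimal sketch of the PER probabilities and weights from Schaul et al. (2016), with illustrative names and defaults:

```python
import numpy as np

def per_sampling(priorities, alpha=0.6, beta=0.4):
    """Sampling probabilities and importance-sampling weights for PER.

    Transitions are drawn with P(i) = p_i^alpha / sum_j p_j^alpha; each
    sampled TD error must then be scaled by w_i = (N * P(i))^-beta,
    normalized by max(w), to keep the gradient unbiased in expectation.
    Skipping these weights in the critic loss overrepresents
    high-priority transitions and can hurt the Q estimates.
    """
    p = np.asarray(priorities, dtype=np.float64) ** alpha
    probs = p / p.sum()
    w = (len(p) * probs) ** (-beta)
    return probs, w / w.max()

probs, weights = per_sampling([0.5, 2.0, 0.1, 1.0])
```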
-
- [ ] I have marked all applicable categories:
  + [ ] exception-raising bug
  + [ ] RL algorithm bug
  + [ ] system worker bug
  + [ ] system utils bug
  + [ ] code design/refactor
  …
-
Hi, when I run `python idqn.py`, `vdn.py`, or `qmix.py`, there is a runtime error:
"RuntimeError: cannot perform reduction function argmax on a tensor with no elements because the operation does not have …
-
I have been training a StyleGAN2 model for the past month and hadn't had any issues until yesterday, when this started occurring:
`cannot import name 'notf' from 'tensorboard.compat'`
Any solutions on the way?
…
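Not a fix, but a commonly reported cause of this import error is a version mismatch between TensorFlow and TensorBoard (an assumption here, since the excerpt is truncated); checking which versions are co-installed is a quick first step:

```python
# Print the installed versions; a tensorflow/tensorboard major-version
# mismatch is a commonly reported trigger for this import error.
import tensorboard
import tensorflow as tf

print("tensorflow:", tf.__version__)
print("tensorboard:", tensorboard.__version__)
```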
-
It would be nice to be able to pull multiple quantiles from a single distribution at the same time.
```r
dist
```
-
Hi author, I'd like to ask: in the code, does `reuse_network = True` mean that every agent shares a single agent network? If so, could that end up making every agent produce the same action?
I ask because when I use the QMIX algorithm in an environment I built myself, the overall reward gets worse and worse over training. Also, after setting epsilon to 0, the agents seem to converge on choosing the same action, and that identical action is exactly what incurs a large penalty in the environment. I really can't figure out…
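For context, a sketch of the usual reason a shared network does not force identical actions, assuming a pymarl-style setup: each agent's input is its own observation, typically concatenated with a one-hot agent ID (and last action), so the shared network can still produce different Q-values per agent. If the agent ID is omitted from the inputs, agents with similar observations will indeed collapse onto the same action.

```python
import torch

def build_agent_inputs(obs, last_actions, n_agents):
    """Per-agent inputs for a shared (reused) Q-network.

    obs:          [n_agents, obs_dim] local observations
    last_actions: [n_agents, n_actions] one-hot previous actions

    A one-hot agent ID is appended so the shared network can still
    specialize per agent; without it, agents with similar observations
    tend to pick the same greedy action.
    """
    agent_ids = torch.eye(n_agents)  # one-hot ID per agent
    return torch.cat([obs, last_actions, agent_ids], dim=-1)
```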