apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

A3C with multiple workers #8397

Closed: ghost closed this issue 6 years ago

ghost commented 6 years ago

Hi,

When I run the a3c example from this repo using launcher.py, I can specify the number of workers. When I specify more than 1, I get the error: no module named dmlc_tracker

I checked this closed issue and tried to add

sys.path.insert(0, "your/mxnet/python/")

before importing mxnet but the problem is still the same.

Do you have any suggestions for this problem?

Thank you!

Med

anirudh2290 commented 6 years ago

@moakra As this script prepends os.environ['HOME'] to sys.path (https://github.com/apache/incubator-mxnet/blob/master/example/reinforcement-learning/a3c/launcher.py#L30), I would suggest either installing mxnet in the home directory (/home/) or temporarily changing the HOME environment variable by doing export HOME=
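To see why HOME matters here, a minimal sketch follows. It assumes the launcher builds a search-path entry by joining HOME with a relative directory; the helper name, the subdirectory, and the HOME value below are all hypothetical, so check launcher.py for the actual path it uses.

```python
import os
import sys


def prepend_home_relative(subdir, environ=None, path=None):
    """Prepend <HOME>/<subdir> to a module search path and return the new entry.

    `subdir` is a hypothetical placeholder, not the directory launcher.py uses.
    """
    environ = os.environ if environ is None else environ
    path = sys.path if path is None else path
    entry = os.path.join(environ["HOME"], subdir)
    # Index 0 is searched first, so this entry wins over later ones.
    path.insert(0, entry)
    return entry


# Demonstration with a fake environment and a throwaway list,
# so the real sys.path and os.environ are left untouched.
fake_path = ["/usr/lib/python"]
entry = prepend_home_relative(
    "mxnet/tracker",               # hypothetical subdirectory
    environ={"HOME": "/opt/src"},  # hypothetical HOME value
    path=fake_path,
)
print(entry)
print(fake_path)
```

If the directory built from HOME does not contain the tracker module, the import fails regardless of other sys.path entries added later in the list, which would be consistent with the error reported above.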

ghost commented 6 years ago

@anirudh2290 thank you. I changed the environment variable.

Now, when I run:

python2 launcher.py -n 2 --gpus=0 python2 a3c.py

I see 5 processes with nvidia-smi. Shouldn't there be only 2? Is this normal?

anirudh2290 commented 6 years ago

@moakra 5 processes are expected: 1 parent process and 4 child processes (2 workers and 2 servers).
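The count above can be checked with a small sketch. The function name is hypothetical, and the default of one server per worker is an assumption taken from the 2 workers / 2 servers case in this thread, not a documented rule.

```python
def expected_processes(num_workers, num_servers=None):
    """Return 1 parent + workers + servers for a single launch."""
    if num_servers is None:
        # Assumption: one server per worker, as in the case discussed above.
        num_servers = num_workers
    return 1 + num_workers + num_servers


# The command in this thread used -n 2.
print(expected_processes(2))
```

With 2 workers this gives 1 + 2 + 2 = 5, matching what nvidia-smi showed.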

ghost commented 6 years ago

Thank you @anirudh2290 !