apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Support for single-machine multi-process KVStore based on Shared-Memory #3518

Closed peterzcc closed 6 years ago

peterzcc commented 7 years ago

I have been trying to implement the A3C algorithm. However, I have found that it is impossible to create an ndarray that is shared among different processes.

However, sharing an ndarray object between processes would be very useful, since multi-threading in Python is severely limited by the GIL.

As far as I know, the Chainer framework supports this feature and here is an example

May I ask whether MXNet supports a similar feature, or whether anyone has been working on it?

Thanks!

piiswrong commented 7 years ago

There is generally no need to use multiprocessing with MXNet if you are smart about where to sync. Everything is automatically parallelized by the engine. If you have to use multiprocessing, use the distributed KVStore. I think you can have multiple workers on the same machine. @mli

peterzcc commented 7 years ago

@piiswrong

Thanks a lot for your reply! I understand your point; MXNet is truly good at asynchronous computation.

However, my use case is implementing the "Asynchronous Methods for Deep Reinforcement Learning" (A3C) algorithm.

It runs multiple processes on a single machine against centralized neural network parameters, and usually involves more than 16 parallel actor-learners.

Multi-threading is not feasible, partly because the RL environments are not thread-safe.

Distributed KVStore is also not optimal because it will introduce too much overhead caused by network communication.

NumPy has an API that allows us to create an array from an existing memory buffer, which enables multiprocessing.
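
A minimal sketch of that NumPy idea, assuming multiprocessing.RawArray as the shared buffer and np.frombuffer as the wrapping call. The function names, the placeholder gradient, and the learning rate are hypothetical and only illustrate the pattern; none of this is MXNet API. Updates are deliberately lock-free, so concurrent writes can interleave.

```python
import multiprocessing as mp

import numpy as np


def make_shared_params(size):
    """Allocate a float32 buffer in shared memory and wrap it as a NumPy array."""
    raw = mp.RawArray("f", size)  # zero-initialised, no lock attached
    params = np.frombuffer(raw, dtype=np.float32)  # a view on the buffer, not a copy
    return raw, params


def apply_gradient(raw, grad, lr=0.1):
    """Hypothetical actor-learner update written directly into shared memory."""
    params = np.frombuffer(raw, dtype=np.float32)
    params -= lr * grad  # in place, so other processes see the new values


def worker(raw, steps):
    """Stand-in for one actor-learner; a real one would compute gradients."""
    grad = np.ones(len(raw), dtype=np.float32)  # placeholder gradient
    for _ in range(steps):
        apply_gradient(raw, grad)


def run_actor_learners(num_workers=4, steps=10, size=4):
    """Start the workers, wait for them, and return the shared parameters."""
    raw, params = make_shared_params(size)
    procs = [mp.Process(target=worker, args=(raw, steps)) for _ in range(num_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return params
```

Calling run_actor_learners() under an if __name__ == "__main__": guard starts the processes. Because the writes are unsynchronised, the final values can differ between runs; mp.Array could be used instead of mp.RawArray if a lock is wanted.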

I would be willing to develop such functionality for KVStore that supports efficient single-machine, multi-process communication based on shared memory, given that it's not supported yet.

Thanks!

Peter

piiswrong commented 7 years ago

Try dist kv and see. I think connecting to 127.0.0.1 should be fast enough as it doesn't hit the network at all

mli commented 7 years ago

You can run export DMLC_LOCAL=1 to use the IPC protocol for communication.
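
A rough sketch of how the two suggestions above might be combined on one machine. The helper local_dist_env and the port number are hypothetical; the DMLC_* variable names are the ones ps-lite reads as far as I know, and the push/pull calls are the regular KVStore API. Treat it as an outline under those assumptions, not a verified setup.

```python
def local_dist_env(role, num_workers, num_servers=1, port=9091, use_ipc=True):
    """Hypothetical helper: DMLC_* variables for one process on a single machine.

    role is "scheduler", "server" or "worker". The port is an arbitrary choice.
    """
    env = {
        "DMLC_ROLE": role,
        "DMLC_PS_ROOT_URI": "127.0.0.1",  # scheduler address: localhost
        "DMLC_PS_ROOT_PORT": str(port),
        "DMLC_NUM_WORKER": str(num_workers),
        "DMLC_NUM_SERVER": str(num_servers),
    }
    if use_ipc:
        env["DMLC_LOCAL"] = "1"  # ask for IPC instead of TCP, as suggested above
    return env


def worker_main(shape=(4,)):
    """Outline of one worker; assumes the variables above are already set."""
    import mxnet as mx  # imported lazily so the helper above works without MXNet

    kv = mx.kv.create("dist_async")
    weights = mx.nd.zeros(shape)
    kv.init(0, weights)            # register key 0 with its initial value
    kv.push(0, mx.nd.ones(shape))  # placeholder gradient
    kv.pull(0, out=weights)        # fetch the current server-side value
    return weights
```

Each process (one scheduler, the servers, and the workers) would be started with the dictionary for its role merged into its environment, for example through the env argument of subprocess.Popen.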

Vogen commented 7 years ago

How is your A3C implementation going, @peterzcc?

yajiedesign commented 6 years ago

This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!