yahoo / CaffeOnSpark

Distributed deep learning on Hadoop and Spark clusters.
Apache License 2.0

How to set SPARK_WORKER_INSTANCES and Device? #282

Closed GoodJoey closed 6 years ago

GoodJoey commented 7 years ago

Hello, I have 4 nodes, each with two GPUs and enough memory and disk space. What is a good way to set the parameters (SPARK_WORKER_INSTANCES and Device) so that I can use the resources more efficiently?

Set up 4 masters and slaves? SPARK_WORKER_INSTANCES = 4? Device = 2? PS: I've installed CaffeOnSpark (GPU version) on all of the machines.

One more question: does the CUDA version have to be 7.5, or can we use 8.0? Thanks.

junshi15 commented 7 years ago

Your config looks fine. If you compile with CUDA 8.0, then you should be able to run with 8.0. The cuDNN version is tricky: 5.1 works, but higher cuDNN versions do not compile. See https://github.com/BVLC/caffe/issues/5793
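
For anyone checking a similar setup, here is a rough sanity check of the arithmetic behind that config. It is only a sketch under assumptions not confirmed in this thread: it treats SPARK_WORKER_INSTANCES as the total number of workers across the cluster, assumes workers are spread evenly over the nodes, and assumes each worker drives the number of GPUs given by Device. The function names `total_gpus` and `layout_uses_all_gpus` are hypothetical helpers, not part of CaffeOnSpark or Spark.

```python
def total_gpus(worker_instances: int, devices_per_worker: int) -> int:
    """GPUs used in total, assuming every worker drives the same number of devices."""
    return worker_instances * devices_per_worker


def layout_uses_all_gpus(nodes: int, gpus_per_node: int,
                         worker_instances: int, devices_per_worker: int) -> bool:
    """True if the layout uses each GPU on each node exactly once.

    Assumption: workers are distributed evenly across nodes.
    """
    if nodes <= 0 or worker_instances <= 0 or worker_instances % nodes != 0:
        return False  # workers cannot be spread evenly over the nodes
    workers_per_node = worker_instances // nodes
    return workers_per_node * devices_per_worker == gpus_per_node


# The setup from the question: 4 nodes with 2 GPUs each.
print(total_gpus(4, 2))                  # 4 workers, 2 devices each
print(layout_uses_all_gpus(4, 2, 4, 2))  # one worker per node, 2 GPUs per worker
print(layout_uses_all_gpus(4, 2, 8, 1))  # two workers per node, 1 GPU per worker
```

Under these assumptions, 4 workers with 2 devices each and 8 workers with 1 device each both cover all 8 GPUs; which of the two trains faster is something you would have to measure on your own cluster.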

GoodJoey commented 6 years ago

Thanks.