Open aturner-epcc opened 6 years ago
Investigate how distributed-memory ML benchmarks perform across different systems. This could be based on the work that Dell have described in their blog posts:
http://en.community.dell.com/techcenter/high-performance-computing/b/general_hpc/archive/2018/03/05/deep-learning-performance-with-intel-caffe-training-cpu-model-choice-and-scalability
http://en.community.dell.com/techcenter/high-performance-computing/b/general_hpc/archive/2017/11/22/scaling-deep-learning-on-multiple-v100-nodes
http://en.community.dell.com/techcenter/high-performance-computing/b/general_hpc/archive/2017/09/27/deep-learning-on-v100
This is already being worked on at EPCC, funded by PRACE WP7.