dyweb / papers-notebook

:page_facing_up: :cn: :page_with_curl: 论文阅读笔记(分布式系统、虚拟化、机器学习)Papers Notebook (Distributed System, Virtualization, Machine Learning)
https://github.com/dyweb/papers-notebook/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+-label%3ATODO-%E6%9C%AA%E8%AF%BB
Apache License 2.0
2.14k stars 250 forks

Performance Implications of Big Data in Scalable Deep Learning: On the Importance of Bandwidth and Caching #126

Open gaocegege opened 5 years ago

gaocegege commented 5 years ago

https://ieeexplore.ieee.org/abstract/document/8621896

Deep learning techniques have revolutionized many areas including computer vision and speech recognition. While such networks require tremendous amounts of data, the requirement for and connection to Big Data storage systems is often undervalued and not well understood. In this paper, we explore the relationship between Big Data storage, networking, and Deep Learning workloads to understand key factors for designing Big Data/Deep Learning integrated solutions. We find that storage and networking bandwidths are the main parameters determining Deep Learning training performance. Local data caching can provide a performance boost and eliminate repeated network transfers, but it is mainly limited to smaller datasets that fit into memory. On the other hand, local disk caching is an intriguing option that is overlooked in current state-of-the-art systems. Finally, we distill our work into guidelines for designing Big Data/Deep Learning solutions. (Abstract)

The abstract seems to have a spelling mistake: s/distill/distil

Probably not high quality, but worth reading anyway. Work from Lenovo (that is Lenovo, right?)

gaocegege commented 5 years ago

Sorry for the noise.

Just laying out the conclusions: (i) the actual setup of the storage solution does not matter as long as it can provide data at the rate needed by the compute elements, and (ii) local data caching can increase performance when data bandwidth is insufficient.
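The second conclusion, that local caching helps when bandwidth is insufficient, can be sketched as a read-through disk cache in front of remote storage. This is a minimal illustration only; the `fetch` function, shard naming, and cache layout are hypothetical and not taken from the paper:

```python
import os
import tempfile

def make_cached_reader(fetch, cache_dir):
    """Return a reader that serves dataset shards from local disk after the first fetch."""
    def read(shard_id):
        path = os.path.join(cache_dir, f"shard-{shard_id}.bin")
        if os.path.exists(path):           # cache hit: no network transfer
            with open(path, "rb") as f:
                return f.read()
        data = fetch(shard_id)             # cache miss: pay the bandwidth cost once
        with open(path, "wb") as f:
            f.write(data)
        return data
    return read

# Simulated remote storage: count how many times the "network" is hit.
calls = {"n": 0}
def fake_fetch(shard_id):
    calls["n"] += 1
    return f"shard {shard_id} contents".encode()

cache_dir = tempfile.mkdtemp()
read = make_cached_reader(fake_fetch, cache_dir)
for epoch in range(3):                     # three epochs over two shards
    for shard in (0, 1):
        read(shard)
print(calls["n"])                          # each shard crossed the network only once -> 2
```

As the paper notes, this kind of disk cache is only a win when repeated epochs would otherwise re-transfer the same data, and unlike an in-memory cache it is not limited to datasets that fit in RAM.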