dyweb / papers-notebook

:page_facing_up: :cn: :page_with_curl: 论文阅读笔记(分布式系统、虚拟化、机器学习)Papers Notebook (Distributed System, Virtualization, Machine Learning)
https://github.com/dyweb/papers-notebook/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+-label%3ATODO-%E6%9C%AA%E8%AF%BB
Apache License 2.0
2.14k stars 250 forks

Performance Implications of Big Data in Scalable Deep Learning: On the Importance of Bandwidth and Caching #126

Open gaocegege opened 5 years ago

gaocegege commented 5 years ago

https://ieeexplore.ieee.org/abstract/document/8621896

Deep learning techniques have revolutionized many areas including computer vision and speech recognition. While such networks require tremendous amounts of data, the requirement for and connection to Big Data storage systems is often undervalued and not well understood. In this paper, we explore the relationship between Big Data storage, networking, and Deep Learning workloads to understand key factors for designing Big Data/Deep Learning integrated solutions. We find that storage and networking bandwidths are the main parameters determining Deep Learning training performance. Local data caching can provide a performance boost and eliminate repeated network transfers, but it is mainly limited to smaller datasets that fit into memory. On the other hand, local disk caching is an intriguing option that is overlooked in current state-of-the-art systems. Finally, we distill our work into guidelines for designing Big Data/Deep Learning solutions. (Abstract)

The abstract seems to have a spelling mistake: s/distill/distil

Probably not high quality, but worth reading anyway. Work from Lenovo (that is Lenovo, right?)

gaocegege commented 5 years ago

Sorry for the noise.

Just laying out the conclusions: (i) the actual setup of the storage solution does not matter as long as it can provide data at the rate needed by the compute elements, and (ii) local data caching can increase performance when data bandwidth is insufficient.
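The second conclusion, that local caching helps when bandwidth is insufficient, can be sketched as a read-through disk cache in front of remote storage. This is a minimal illustration only; the `fetch` function, shard naming, and cache layout are hypothetical and not taken from the paper:

```python
import os
import tempfile

def make_cached_reader(fetch, cache_dir):
    """Return a reader that serves dataset shards from local disk after the first fetch."""
    def read(shard_id):
        path = os.path.join(cache_dir, f"shard-{shard_id}.bin")
        if os.path.exists(path):           # cache hit: no network transfer
            with open(path, "rb") as f:
                return f.read()
        data = fetch(shard_id)             # cache miss: pay the bandwidth cost once
        with open(path, "wb") as f:
            f.write(data)
        return data
    return read

# Simulated remote storage: count how many times the "network" is hit.
calls = {"n": 0}
def fake_fetch(shard_id):
    calls["n"] += 1
    return f"shard {shard_id} contents".encode()

cache_dir = tempfile.mkdtemp()
read = make_cached_reader(fake_fetch, cache_dir)
for epoch in range(3):                     # three epochs over two shards
    for shard in (0, 1):
        read(shard)
print(calls["n"])                          # each shard crossed the network only once -> 2
```

As the paper notes, this kind of disk cache is only a win when repeated epochs would otherwise re-transfer the same data, and unlike an in-memory cache it is not limited to datasets that fit in RAM.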