What is this paper about? 👋
A study of swapping between GPU and CPU memory when GPU memory is insufficient for DNN training.
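As background, swapping relies on the fact that GPU↔CPU copies can run on a separate CUDA stream and overlap with computation. The sketch below is a minimal illustration assuming PyTorch with CUDA; the tensor sizes are arbitrary, and this shows only the basic offload/reload mechanism, not the paper's actual system.

```python
# Minimal sketch of the swap-out / swap-in primitive, assuming PyTorch + CUDA.
# Tensor sizes are arbitrary; this illustrates the mechanism, not the paper's system.
import torch

assert torch.cuda.is_available()
copy_stream = torch.cuda.Stream()                 # side stream for transfers

x = torch.randn(4096, 4096, device="cuda")        # tensor we want to offload
cpu_buf = torch.empty_like(x, device="cpu").pin_memory()  # pinned = async-capable

copy_stream.wait_stream(torch.cuda.default_stream())  # x must be produced first
with torch.cuda.stream(copy_stream):
    cpu_buf.copy_(x, non_blocking=True)           # swap out: GPU -> pinned CPU

# ... other GPU compute proceeds here, overlapping with the copy ...

copy_stream.synchronize()                         # transfer finished
x = None                                          # GPU memory can now be reused

with torch.cuda.stream(copy_stream):
    x = cpu_buf.to("cuda", non_blocking=True)     # swap in before it is needed again
copy_stream.synchronize()                         # safe to use x on any stream
```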
Abstract 🕵🏻♂️
It is known that deeper and wider neural networks can achieve better accuracy, but it is difficult to continue this trend of increasing model size due to limited GPU memory. One promising solution is to support swapping between GPU and CPU memory. However, existing work on swapping only handles certain models and does not achieve satisfactory performance. Deep learning computation is commonly expressed as a dataflow graph, which can be analyzed to improve swapping. We propose SwapAdvisor, which performs joint optimization along three dimensions based on a given dataflow graph: operator scheduling, memory allocation, and swap decisions. SwapAdvisor explores the vast search space using a custom-designed genetic algorithm. Evaluations using a variety of large models show that SwapAdvisor can train models up to 12 times the GPU memory limit while achieving 53-99% of the throughput of a hypothetical baseline with infinite GPU memory.
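The abstract's key idea, searching the joint space of schedules, allocations, and swap decisions with a genetic algorithm, can be illustrated with a toy version. The sketch below evolves only one of the three dimensions (per-tensor swap decisions) against a made-up cost model; the tensor sizes, bandwidth, and memory limit are invented, and the real SwapAdvisor evaluates candidates with dataflow-aware analysis rather than this simple formula.

```python
# Toy genetic algorithm over swap decisions (NOT SwapAdvisor's implementation).
# All sizes, the bandwidth, and the memory limit below are invented numbers.
import random

random.seed(0)

TENSOR_MB    = [512, 256, 1024, 768, 128, 640, 896, 384]  # hypothetical tensor sizes
GPU_LIMIT_MB = 2048                                        # hypothetical memory budget
PCIE_GBPS    = 12.0                                        # effective CPU<->GPU bandwidth

def fitness(plan):
    """Lower is better: approximate transfer time (ms) for swapped tensors,
    plus a large penalty if the resident tensors still exceed GPU memory."""
    resident = sum(s for s, swap in zip(TENSOR_MB, plan) if not swap)
    swapped  = sum(s for s, swap in zip(TENSOR_MB, plan) if swap)
    transfer_ms = 2 * swapped / PCIE_GBPS   # swap out + swap back in
    overflow = max(0, resident - GPU_LIMIT_MB)
    return transfer_ms + 1000 * overflow    # infeasible plans are heavily penalized

def crossover(a, b):
    cut = random.randrange(1, len(a))       # single-point crossover
    return a[:cut] + b[cut:]

def mutate(plan, rate=0.1):
    return [bit ^ (random.random() < rate) for bit in plan]  # flip bits at `rate`

def evolve(pop_size=50, generations=200):
    pop = [[random.randint(0, 1) for _ in TENSOR_MB] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 4]         # keep the best quarter
        children = [mutate(crossover(*random.sample(elite, 2)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=fitness)

best = evolve()
print("swap plan:", best, "estimated cost:", round(fitness(best), 2))
```

The penalty term steers the search toward plans that fit in GPU memory, while the transfer term rewards keeping large tensors resident, which is the same trade-off SwapAdvisor navigates at much larger scale.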
What can we learn from reading this paper? 🤔
DNN memory usage
model parameters: take up most of the memory in large models and are updated every iteration; the amount grows with the model's depth and width (see the sketch after this list)
intermediate results: activations, gradients, and loss values
scratch space: temporary storage for in-flight computation; only a very small fraction of total memory usage
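To make the first item concrete, here is a back-of-the-envelope sketch (assuming a plain fp32 MLP; the depths and widths are arbitrary examples) of how parameter memory grows with depth and width:

```python
# Back-of-the-envelope parameter-memory estimate for a plain MLP.
# fp32 weights: 4 bytes per parameter; the layer sizes are arbitrary examples.
def mlp_param_mb(depth, width, bytes_per_param=4):
    # Each hidden layer holds a width x width weight matrix plus a bias vector,
    # so parameter memory scales linearly with depth and quadratically with width.
    params_per_layer = width * width + width
    return depth * params_per_layer * bytes_per_param / 2**20

for depth, width in [(8, 1024), (16, 1024), (8, 2048)]:
    print(f"depth={depth:2d} width={width}: {mlp_param_mb(depth, width):8.1f} MB")
```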
Are there any related articles or issues worth reading alongside this paper?
Estimating GPU memory consumption of deep learning models
Improving GPU Memory Oversubscription Performance
Please share the reference URL! 🔗
https://dl.acm.org/doi/10.1145/3373376.3378530