What is this paper about? 👋
A study of swapping between GPU and CPU memory when GPU memory is insufficient for DNN training.
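As background, swapping relies on the fact that GPU↔CPU copies can run on a separate CUDA stream and overlap with computation. The sketch below is a minimal illustration assuming PyTorch with CUDA; the tensor sizes are arbitrary, and this shows only the basic offload/reload mechanism, not the paper's actual system.

```python
# Minimal sketch of the swap-out / swap-in primitive, assuming PyTorch + CUDA.
# Tensor sizes are arbitrary; this illustrates the mechanism, not the paper's system.
import torch

assert torch.cuda.is_available()
copy_stream = torch.cuda.Stream()                 # side stream for transfers

x = torch.randn(4096, 4096, device="cuda")        # tensor we want to offload
cpu_buf = torch.empty_like(x, device="cpu").pin_memory()  # pinned = async-capable

copy_stream.wait_stream(torch.cuda.default_stream())  # x must be produced first
with torch.cuda.stream(copy_stream):
    cpu_buf.copy_(x, non_blocking=True)           # swap out: GPU -> pinned CPU

# ... other GPU compute proceeds here, overlapping with the copy ...

copy_stream.synchronize()                         # transfer finished
x = None                                          # GPU memory can now be reused

with torch.cuda.stream(copy_stream):
    x = cpu_buf.to("cuda", non_blocking=True)     # swap in before it is needed again
copy_stream.synchronize()                         # safe to use x on any stream
```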
Abstract 🕵🏻♂️
It is known that deeper and wider neural networks can achieve better accuracy, but it is difficult to continue this trend of increasing model size due to limited GPU memory. One promising solution is to support swapping between GPU and CPU memory. However, existing work on swapping only handles certain models and does not achieve satisfactory performance. Deep learning computation is commonly expressed as a dataflow graph, which can be analyzed to improve swapping. We propose SwapAdvisor, which performs joint optimization along three dimensions based on a given dataflow graph: operator scheduling, memory allocation, and swap decisions. SwapAdvisor explores the vast search space using a custom-designed genetic algorithm. Evaluations using a variety of large models show that SwapAdvisor can train models up to 12 times the GPU memory limit while achieving 53-99% of the throughput of a hypothetical baseline with infinite GPU memory.
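The abstract's key idea, searching the joint space of schedules, allocations, and swap decisions with a genetic algorithm, can be illustrated with a toy version. The sketch below evolves only one of the three dimensions (per-tensor swap decisions) against a made-up cost model; the tensor sizes, bandwidth, and memory limit are invented, and the real SwapAdvisor evaluates candidates with dataflow-aware analysis rather than this simple formula.

```python
# Toy genetic algorithm over swap decisions (NOT SwapAdvisor's implementation).
# All sizes, the bandwidth, and the memory limit below are invented numbers.
import random

random.seed(0)

TENSOR_MB    = [512, 256, 1024, 768, 128, 640, 896, 384]  # hypothetical tensor sizes
GPU_LIMIT_MB = 2048                                        # hypothetical memory budget
PCIE_GBPS    = 12.0                                        # effective CPU<->GPU bandwidth

def fitness(plan):
    """Lower is better: approximate transfer time (ms) for swapped tensors,
    plus a large penalty if the resident tensors still exceed GPU memory."""
    resident = sum(s for s, swap in zip(TENSOR_MB, plan) if not swap)
    swapped  = sum(s for s, swap in zip(TENSOR_MB, plan) if swap)
    transfer_ms = 2 * swapped / PCIE_GBPS   # swap out + swap back in
    overflow = max(0, resident - GPU_LIMIT_MB)
    return transfer_ms + 1000 * overflow    # infeasible plans are heavily penalized

def crossover(a, b):
    cut = random.randrange(1, len(a))       # single-point crossover
    return a[:cut] + b[cut:]

def mutate(plan, rate=0.1):
    return [bit ^ (random.random() < rate) for bit in plan]  # flip bits at `rate`

def evolve(pop_size=50, generations=200):
    pop = [[random.randint(0, 1) for _ in TENSOR_MB] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 4]         # keep the best quarter
        children = [mutate(crossover(*random.sample(elite, 2)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=fitness)

best = evolve()
print("swap plan:", best, "estimated cost:", round(fitness(best), 2))
```

The penalty term steers the search toward plans that fit in GPU memory, while the transfer term rewards keeping large tensors resident, which is the same trade-off SwapAdvisor navigates at much larger scale.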
What can we learn from reading this paper? 🤔
DNN memory usage
model parameters: take up most of the memory in large models and are updated every iteration; the amount grows with the model's depth and width (see the sketch after this list)
intermediate results: activations, gradients, and loss values
scratch space: temporary storage for in-flight computation; only a very small fraction of total memory usage
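To make the first item concrete, here is a back-of-the-envelope sketch (assuming a plain fp32 MLP; the depths and widths are arbitrary examples) of how parameter memory grows with depth and width:

```python
# Back-of-the-envelope parameter-memory estimate for a plain MLP.
# fp32 weights: 4 bytes per parameter; the layer sizes are arbitrary examples.
def mlp_param_mb(depth, width, bytes_per_param=4):
    # Each hidden layer holds a width x width weight matrix plus a bias vector,
    # so parameter memory scales linearly with depth and quadratically with width.
    params_per_layer = width * width + width
    return depth * params_per_layer * bytes_per_param / 2**20

for depth, width in [(8, 1024), (16, 1024), (8, 2048)]:
    print(f"depth={depth:2d} width={width}: {mlp_param_mb(depth, width):8.1f} MB")
```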
Are there any related articles or issues worth reading alongside this paper?
Estimating GPU memory consumption of deep learning models
Improving GPU Memory Oversubscription Performance
Please share the reference URL! 🔗
https://dl.acm.org/doi/10.1145/3373376.3378530