KuangjuX / Paper-reading

My Paper Reading Lists and Notes.

Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences #17

Open KuangjuX opened 1 year ago

KuangjuX commented 1 year ago

paper: https://www.usenix.org/system/files/osdi22-han.pdf

KuangjuX commented 1 year ago

Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences

Abstract

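Many intelligent applications, such as autonomous driving and virtual reality, need to run both real-time tasks and best-effort tasks. However, current accelerators lack a preemption mechanism. Today, either a task gets exclusive use of the GPU, or a real-time task has to wait for the best-effort task to finish before it can run. This results in low utilization, high latency, or both. This paper proposes a GPU-accelerated DNN inference system that supports microsecond-scale, kernel-level preemption.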
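To make the trade-off in the abstract concrete, here is a toy latency model in Python. It is only a sketch: all function names and all numbers (task durations, preemption cost, arrival period) are hypothetical values chosen for illustration, not measurements from the paper.

```python
def rt_latency_wait(be_remaining_us: float, rt_exec_us: float) -> float:
    """Real-time task waits until the running best-effort task finishes."""
    return be_remaining_us + rt_exec_us


def rt_latency_preempt(preempt_cost_us: float, rt_exec_us: float) -> float:
    """Real-time task preempts the best-effort task after a fixed cost."""
    return preempt_cost_us + rt_exec_us


def dedicated_gpu_utilization(rt_exec_us: float, period_us: float) -> float:
    """Fraction of time a GPU reserved for the real-time task is busy."""
    return rt_exec_us / period_us


# Hypothetical numbers, for illustration only.
be_remaining_us = 20_000   # best-effort work still queued when the RT task arrives
rt_exec_us = 5_000         # execution time of the real-time inference
preempt_cost_us = 50       # assumed microsecond-scale preemption cost
period_us = 33_000         # assumed arrival period of real-time requests

wait_latency = rt_latency_wait(be_remaining_us, rt_exec_us)
preempt_latency = rt_latency_preempt(preempt_cost_us, rt_exec_us)
utilization = dedicated_gpu_utilization(rt_exec_us, period_us)

print(f"wait for best-effort task: {wait_latency} us")
print(f"preempt best-effort task:  {preempt_latency} us")
print(f"dedicated GPU utilization: {utilization:.1%}")
```

Under these assumed numbers, waiting makes the real-time latency depend on the best-effort task's remaining work, a dedicated GPU keeps latency low but leaves the device mostly idle, and cheap preemption keeps latency close to the real-time task's own execution time while the GPU stays shared.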