ganler / ResearchReading

General system research material (not limited to papers) reading notes.

[UCI CS Seminar] DNN Training Acceleration through Better Communication-Computation Overlap #36

Closed (ganler closed this issue 3 years ago)

ganler commented 3 years ago

By Sangeetha Abdu Jyothi

https://www.youtube.com/watch?v=K9DIfGmbPu8

ganler commented 3 years ago

Communication-Computation Overlap

There is a very interesting term, the *no-sync window*: the period during which an activation must stay cached, from the moment it is produced to the moment it is consumed (updated).
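
To make the definition concrete, here is a minimal sketch (my own illustration with hypothetical timings, not from the talk) that computes each activation's window from per-layer forward/backward costs:

```python
# My own illustration (hypothetical timings, not from the talk): compute each
# activation's no-sync window from per-layer forward/backward costs.
fwd = [2.0, 3.0, 1.5]   # assumed forward time per layer (ms)
bwd = [4.0, 5.0, 2.5]   # assumed backward time per layer (ms)
n = len(fwd)

t = 0.0
produced = []                    # when each layer's activation is produced
for f in fwd:
    t += f
    produced.append(t)

consumed = [0.0] * n             # when each activation is consumed in backward
for i in reversed(range(n)):     # backward runs last layer -> first layer
    consumed[i] = t
    t += bwd[i]

for i in range(n):
    print(f"layer {i}: cached during [{produced[i]:.1f}, {consumed[i]:.1f}] ms "
          f"(window = {consumed[i] - produced[i]:.1f} ms)")
```

Note how the earliest layers have the longest windows: their activations are produced first but consumed last.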

Distributed patterns: data | model | hybrid parallel
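
As a quick illustration of the first two patterns (my own NumPy sketch with a hypothetical 2-layer MLP, not from the talk): data parallelism replicates the model and splits the batch, while model parallelism splits the layers and passes activations between workers; hybrid parallelism combines both.

```python
import numpy as np

# My own sketch (hypothetical 2-layer MLP; sizes are made up).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))             # a mini-batch of 8 examples
w1 = rng.normal(size=(4, 16))
w2 = rng.normal(size=(16, 2))

def forward(x):
    return np.maximum(x @ w1, 0) @ w2   # ReLU MLP

# Data parallelism: every "worker" holds the full model; the batch is split.
out_dp = np.concatenate([forward(shard) for shard in np.split(x, 2)])

# Model parallelism: the model is split; activations flow between "workers".
h = np.maximum(x @ w1, 0)               # worker 0 runs layer 1
out_mp = h @ w2                         # worker 1 runs layer 2

assert np.allclose(out_dp, out_mp)      # both compute the same function
```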

Problem: Compute underutilization


So, to increase the pipeline overlap, we can analyze the data-flow dependencies and assign priorities/orderings to the transmission of parameters and activations.
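
A minimal sketch of this idea (assumed transfer sizes and link speed; not TicTac's actual algorithm): order pending parameter transfers by the layer that needs them first, so early layers can start the forward pass while later layers' parameters are still in flight.

```python
import heapq

# Hypothetical transfer queue: (priority, size_mb). The layer index doubles as
# the priority because the forward pass consumes parameters in layer order.
pending = [(2, 40.0), (0, 10.0), (1, 25.0)]

heap = list(pending)
heapq.heapify(heap)          # smallest layer index (earliest use) pops first

t = 0.0
bandwidth = 10.0             # assumed link speed (MB/ms)
while heap:
    layer, size_mb = heapq.heappop(heap)
    t += size_mb / bandwidth
    print(f"layer {layer} parameters arrive at t = {t:.1f} ms")
# layer 0 arrives at 1.0 ms, so its forward computation can start while
# layers 1 and 2 are still being transferred.
```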

Sangeetha classified current research on DNN training acceleration into three categories.

Her TicTac [MLSys'19] targets the Parameter Server (PS) architecture; Caramel targets Ring-AllReduce.
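
For reference, here is a minimal single-process simulation of ring all-reduce, the collective Caramel targets (my own sketch, not Caramel itself): each of N workers ends up with the sum of all workers' gradients after 2(N-1) steps, sending one chunk per step.

```python
N = 4
# Each worker's gradient is split into N chunks; here worker w's gradient is
# the constant w + 1, so the reduced value of every chunk should be 10.
grads = [[float(w + 1)] * N for w in range(N)]
expected = float(sum(w + 1 for w in range(N)))

# Phase 1: reduce-scatter. In step s, worker i sends chunk (i - s) % N to its
# right neighbour, which adds it in. Collect all sends first, then apply, to
# mimic the simultaneous exchange.
for s in range(N - 1):
    sends = [(i, (i - s) % N, grads[i][(i - s) % N]) for i in range(N)]
    for i, c, val in sends:
        grads[(i + 1) % N][c] += val

# Worker i now owns the fully reduced chunk (i + 1) % N.
assert all(grads[i][(i + 1) % N] == expected for i in range(N))

# Phase 2: all-gather. In step s, worker i forwards its chunk (i + 1 - s) % N;
# the receiver overwrites its stale copy.
for s in range(N - 1):
    sends = [(i, (i + 1 - s) % N, grads[i][(i + 1 - s) % N]) for i in range(N)]
    for i, c, val in sends:
        grads[(i + 1) % N][c] = val

assert all(chunk == expected for g in grads for chunk in g)
print("every worker holds the reduced gradient:", grads[0])
```

Each worker sends only 1/N of the gradient per step, which is what makes the ring bandwidth-efficient compared to a naive PS-style all-to-one exchange.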