jasperzhong / read-papers-and-code

My paper/code reading notes in Chinese

MLSys '23 | Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching #327

Closed jasperzhong closed 1 year ago

jasperzhong commented 1 year ago

https://arxiv.org/pdf/2305.03152.pdf

jasperzhong commented 1 year ago

Proposes a static cache method that beats #312 and gets close to the theoretical limit. Impressive.

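First, they derive a formula for the probability that a vertex gets sampled. Then, for each partition, they cache the remote vertices with the highest access probability. That's all there is to it, but it works very well.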
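To make the two steps concrete, here is a minimal sketch of the idea. It is not the paper's exact formula: it assumes uniform neighbor sampling with a fixed fanout per hop and treats sampling events as independent. The names `access_probabilities`, `select_static_cache`, `adj`, `seed_prob` and `owner` are made up for illustration.

```python
import heapq


def access_probabilities(adj, seed_prob, fanouts):
    """Approximate probability that each vertex is touched by one minibatch.

    adj:       dict vertex -> list of neighbors it samples from
               (assumption: every vertex appears as a key).
    seed_prob: dict vertex -> probability of being a minibatch seed.
    fanouts:   neighbors sampled per hop, e.g. [1, 1] for two hops.
    Assumes uniform sampling and independence between sampling events.
    """
    vertices = list(adj)
    p_prev = {v: seed_prob.get(v, 0.0) for v in vertices}
    not_touched = {v: 1.0 - p_prev[v] for v in vertices}
    for fanout in fanouts:
        # miss[u]: probability that no frontier vertex samples u in this hop
        miss = {v: 1.0 for v in vertices}
        for v, nbrs in adj.items():
            if p_prev[v] == 0.0 or not nbrs:
                continue
            pick = min(1.0, fanout / len(nbrs))  # chance v picks one given neighbor
            for u in nbrs:
                miss[u] *= 1.0 - p_prev[v] * pick
        p_prev = {v: 1.0 - miss[v] for v in vertices}
        for v in vertices:
            not_touched[v] *= 1.0 - p_prev[v]
    return {v: 1.0 - not_touched[v] for v in vertices}


def select_static_cache(prob, owner, partition, capacity):
    """Pick the `capacity` remote vertices with the highest access probability."""
    remote = [v for v in prob if owner[v] != partition]
    return heapq.nlargest(capacity, remote, key=lambda v: prob[v])


# Toy graph: partition 0 owns vertices 0 and 1, partition 1 owns 2 and 3.
adj = {0: [1, 2], 1: [2], 2: [3], 3: []}
owner = {0: 0, 1: 0, 2: 1, 3: 1}
seed_prob = {0: 0.5, 1: 0.5}  # seeds are drawn from partition 0's vertices

probs = access_probabilities(adj, seed_prob, fanouts=[1, 1])
cache = select_static_cache(probs, owner, partition=0, capacity=1)
print(probs, cache)
```

On this toy graph, vertex 2 is reachable through more sampling paths than vertex 3, so it gets the single cache slot.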

Their cache is implemented directly as replication in the feature store, though it could of course also be applied to a GPU cache.

image

This is GraphSAGE with 3-layer random uniform sampling. VIP does better than sim (GNNLab) and sits almost on top of oracle (caching based on the actual access frequency, which is the communication lower bound).
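For reference, the oracle baseline can be sketched as follows: given a recorded access trace, cache the most frequently accessed remote vertices and count the remote fetches that remain. This is only an illustration of the definition above, not the paper's evaluation code, and `oracle_cache`, `remote_fetches`, `trace` and `owner` are hypothetical names.

```python
from collections import Counter


def remote_fetches(trace, owner, partition, cache):
    """Count accesses that must go over the network: remote and not cached."""
    cached = set(cache)
    return sum(1 for v in trace if owner[v] != partition and v not in cached)


def oracle_cache(trace, owner, partition, capacity):
    """Cache the remote vertices with the highest actual access frequency."""
    freq = Counter(v for v in trace if owner[v] != partition)
    return [v for v, _ in freq.most_common(capacity)]


# Toy trace seen by partition 0, which owns vertices 0 and 1.
owner = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1}
trace = [0, 2, 3, 2, 4, 2, 3, 1]

best = oracle_cache(trace, owner, partition=0, capacity=1)
no_cache_cost = remote_fetches(trace, owner, 0, cache=[])
oracle_cost = remote_fetches(trace, owner, 0, cache=best)
other_cost = remote_fetches(trace, owner, 0, cache=[4])
print(best, no_cache_cost, oracle_cost, other_cost)
```

Any other static cache of the same size, such as `[4]` here, leaves at least as many remote fetches on this trace, which is why oracle serves as the lower bound in the comparison.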

jasperzhong commented 1 year ago

image image image

The theory doesn't seem very complicated. But it looks like temporal sampling is not considered.

jasperzhong commented 1 year ago

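Besides that, there are some other system optimizations, including vertex partitioning (doesn't DistDGL already have this?), data replication (isn't that just the feature cache?) and pipelining (doesn't DistDGL have that too?).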