jasperzhong commented 11 months ago

Follow-up systems work to the dataset paper in #363; this one builds the dataloader.

jasperzhong commented 11 months ago

My biggest takeaway from skimming this paper:

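For very large graphs, there is no need to put the graph topology on SSD. The topology of the largest het graph in #363 is at most about 100GB, so it fits entirely in memory. What does not fit in memory is the features, which can reach TBs.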

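Keeping graph topology on SSD also has a big problem: page faults are triggered at 4KB granularity, but the memory accesses made during sampling are tiny. A node's entry is only a few bytes, unlike dense feature vectors... So even when only a very small segment of data is requested, the whole page has to be replaced. That makes graph topology in pinned memory + UVA sampling basically the optimal solution.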
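A rough back-of-the-envelope sketch of the read amplification described above. The 4KB page size comes from the comment; the fanout of 10, the 8-byte node IDs, and the 1024-dim float32 feature vector are hypothetical numbers picked only for illustration, not figures from the paper.

```python
PAGE_BYTES = 4096  # page-fault granularity mentioned above


def read_amplification(useful_bytes: int, pages_touched: int = 1,
                       page_bytes: int = PAGE_BYTES) -> float:
    """Bytes moved from storage divided by bytes the requester actually uses."""
    if useful_bytes <= 0:
        raise ValueError("useful_bytes must be positive")
    return pages_touched * page_bytes / useful_bytes


# Hypothetical sampling access: 10 sampled neighbors stored as 8-byte IDs,
# all sitting in a single page.
sampling_amp = read_amplification(10 * 8)

# Hypothetical feature access: one page-aligned 1024-dim float32 vector,
# which fills exactly one page.
feature_amp = read_amplification(1024 * 4)

print(f"sampling: {sampling_amp:.1f}x, feature fetch: {feature_amp:.1f}x")
```

Under these assumptions the sampling access moves about 51x more data than it uses, while the feature fetch moves none extra. This is why page-granularity faults hurt topology accesses far more than feature accesses.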

It just occurred to me that UM is also a page-fault-based approach, so it would have the same problem. UVA does not seem to have it.

jasperzhong commented 11 months ago

Also, this paper has very useful profiling. It looks like the feature fetching problem is more severe for het graphs, while sampling seems to take up a smaller share?

why?


arXiv '23 | Accelerating Sampling and Aggregation Operations in GNN Frameworks with GPU Initiated Direct Storage Accesses #364
