Some question about ogbn-mag NeighborSampling (R-GCN aggr)

https://github.com/snap-stanford/ogb/blob/master/examples/nodeproppred/mag/sampler.py

When I profiled this program with Nsight, I found many Memcpy DtoH operations scattered throughout the forward and backward stages of training, each followed by a small gap in the timeline. I would like to know why these Memcpy DtoH operations appear, because copying data from device to host in the middle of the compute phase seems very strange.

In addition, these Memcpy DtoH operations appear right after DeviceReduceKernel and DeviceReduceSingleTileKernel. The same pattern also shows up inside a Linear layer. What is the connection between Memcpy DtoH and DeviceReduceSingleTileKernel, and why do these two operations always appear together?

Some question about ogbn-mag NeighborSampling (R-GCN aggr) #413