duncster94 opened 5 years ago
Yes, a `num_workers` argument is in the pipeline, although I personally find the `NeighborSampler` already quite fast :) Are you sure that this is the bottleneck? Would be interesting to see some benchmarks on this.
Interesting. It could be something I'm doing, I'll profile my code and get back to you!
You're right, it's not the `NeighborSampler` after all.
I'm using a graph autoencoder to get unsupervised node embeddings. I try to reconstruct the input graph structure in the output. Since I'm batching, the loss function sees a subgraph each time it's called. Obtaining this subgraph is the slow part (my own function, nothing to do with pytorch-geometric).
This brings up a good point though: say for example I have a `DataFlow` object that looks like `(128<-543<-831)`. My loss function requires the subgraph corresponding to the batch (i.e. a `128x128` graph). How can I efficiently obtain this subgraph from the `DataFlow` object?
Thanks again!
So you are trying to reconstruct the subgraph from the `DataFlow` object? Then the graph you are trying to reconstruct does not have 128 nodes, but rather 128+543+831 nodes, given that there are no duplicated nodes in the bipartite graphs (which is highly unrealistic). This is a tricky one, and I assume it does need a more flexible API to allow this kind of functionality (which leads to the discussion here).
The `output` of my model is always a `batch_size x batch_size` tensor, and the objective is to minimize the loss between `output` and a `batch_size x batch_size` tensor `target` that corresponds to the subgraph given by the nodes in the batch.
The way I'm thinking about this is that if `NeighborSampler` yields a `DataFlow` like `(4<-10<-12)`, then my model should only compute features for the 4 nodes at the end of the bipartite graph (and give a `4x4` tensor post dot-product), not the 12 at the beginning. So my problem is: how do I subset `data_flow.edge_index` to only include edges between the 4 nodes? If I can get these edges, then I can create a sparse tensor and make it dense to get the corresponding `4x4` subgraph.
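The masking step described above can be sketched in plain NumPy (a toy example; `edge_index` and `batch_nodes` are illustrative stand-ins, not part of the PyG API, and the toy graph's batch nodes are conveniently already labeled `0..3`):

```python
import numpy as np

# Toy COO edge list: edge_index[0] = source nodes, edge_index[1] = target nodes.
# Suppose the DataFlow ends in batch nodes {0, 1, 2, 3}, as in a (4<-10<-12) flow.
edge_index = np.array([[0, 1, 2, 5, 3, 7],
                       [1, 0, 3, 2, 2, 0]])
batch_nodes = np.array([0, 1, 2, 3])

# Keep only edges whose BOTH endpoints lie in the batch.
mask = np.isin(edge_index[0], batch_nodes) & np.isin(edge_index[1], batch_nodes)
sub_edge_index = edge_index[:, mask]

# Densify into the batch_size x batch_size target adjacency.
# (In general the surviving node ids must also be relabeled to 0..B-1;
# here they already are.)
target = np.zeros((len(batch_nodes), len(batch_nodes)))
target[sub_edge_index[0], sub_edge_index[1]] = 1.0
```

Note that the mask check scans every edge, which is the O(E) cost discussed below.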
Ok, I understand. You can use this function to filter your original `edge_index`, which has runtime O(E). A faster way to achieve this would be to first filter out rows in your adjacency matrix and only after that call the function (which should result in run-time O(B*D), with B being the batch size and D being the maximum node degree). I think you can filter sparse adjacency matrices row-wise by using scipy.
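The row-filtering idea could look roughly like this with scipy (a sketch under assumed names: `adj` stands in for the full sparse adjacency and `batch_nodes` for the sampled batch):

```python
import numpy as np
import scipy.sparse as sp

# Full-graph adjacency in CSR form (6 toy nodes, a handful of edges).
adj = sp.csr_matrix(
    (np.ones(6), ([0, 1, 2, 2, 4, 5], [1, 0, 3, 4, 2, 0])), shape=(6, 6)
)
batch_nodes = np.array([0, 1, 2, 3])

# Row slicing a CSR matrix only touches the selected rows, so restricting
# to the batch costs roughly O(B * D) instead of scanning all E edges.
sub_adj = adj[batch_nodes][:, batch_nodes]

# Back to a COO edge_index, with nodes implicitly relabeled to 0..B-1
# by the slicing itself.
sub_coo = sub_adj.tocoo()
sub_edge_index = np.vstack([sub_coo.row, sub_coo.col])
```

The second (column) slice drops edges pointing outside the batch, so `sub_adj` is directly the dense-able `batch_size x batch_size` target.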
Nice, thanks for the suggestion. I ended up using the `subgraph` function from `torch_geometric.utils` since I need to keep track of edge attributes as well. This runs significantly faster now - thanks for the help!
Ah, makes sense as well :)
🚀 Feature
Multi-processor support for `NeighborSampler`.

Motivation

In my application, I use the `NeighborSampler` in order to train on large graphs. I'm finding, however, that it is very slow. On smaller graphs, when I can load the full graph into GPU memory and don't use `NeighborSampler`, I have a significant reduction (order of magnitude or more) in run-time. If it's possible to parallelize `NeighborSampler`, I think the speedup would be substantial.

Thanks for the consideration.