https://github.com/snap-stanford/ogb/blob/a47b716f7e972f666eae9909ee0f922cd0f9d966/examples/nodeproppred/papers100M/node2vec.py#L57

I ran into some problems when I tried to run GraphSAGE on the papers100M dataset. Could anybody give me some advice?
Most likely this is because PyTorch did not support tensors of such a large size. We needed to drop some elements so that PyTorch would run fine. I am not sure whether the drop-edge hack is still needed in the latest PyTorch, so it may be worth a try without it.

Also, you are pointing to the node2vec code. Can you point us to the GraphSAGE code you used?
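For context, the element-dropping workaround amounts to something like the sketch below; the function name and drop rate here are illustrative assumptions, not the actual ogb code:

```python
import torch

def drop_edges(edge_index: torch.Tensor, drop_rate: float = 0.1) -> torch.Tensor:
    """Randomly keep a subset of the [2, num_edges] edge_index columns,
    shrinking the tensor below whatever size limit is being hit."""
    keep = torch.rand(edge_index.size(1)) >= drop_rate
    return edge_index[:, keep]
```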
It seems that the neighbor sampler avoids the large-tensor problem by sampling, but I ran into another problem:
```python
import torch
import torch.nn.functional as F
from torch import Tensor
from torch_geometric.nn import SAGEConv


class SAGE(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels,
                 num_layers=2):
        super(SAGE, self).__init__()
        self.num_layers = num_layers
        self.convs = torch.nn.ModuleList()
        self.convs.append(SAGEConv(in_channels, hidden_channels))
        for _ in range(self.num_layers - 2):
            self.convs.append(SAGEConv(hidden_channels, hidden_channels))
        self.convs.append(SAGEConv(hidden_channels, out_channels))

    def forward(self, x: Tensor, adjs: list) -> Tensor:
        for i, (edge_index, _, size) in enumerate(adjs):
            x_target = x[:size[1]]  # Target nodes are always placed first.
            x = self.convs[i]((x, x_target), edge_index)
            if i != self.num_layers - 1:
                x = F.relu(x)
                # x = F.dropout(x, p=0.5, training=self.training)
        return x.log_softmax(dim=-1)

    @torch.no_grad()
    def inference(self, x_all, device, subgraph_loader):
        # Layer-wise inference: compute representations for all nodes one
        # layer at a time, so only one layer's activations sit on the GPU.
        for i in range(self.num_layers):
            xs = []
            for batch_size, n_id, adj in subgraph_loader:
                edge_index, _, size = adj.to(device)
                x = x_all[n_id].to(device)
                x_target = x[:size[1]]
                x = self.convs[i]((x, x_target), edge_index)
                if i != self.num_layers - 1:
                    x = F.relu(x)
                xs.append(x)
            x_all = torch.cat(xs, dim=0)
        return x_all


...
y = data.y.to(rank)
x = data.x
target_node = n_id[:batch_size]
adjs = [adj.to(rank) for adj in adjs]
out = model(x[n_id].to(rank), adjs)
loss = F.nll_loss(out, y[target_node].squeeze(1))
```
```
Traceback (most recent call last):
  File "paper100m.py", line 75, in train
    loss = criterion(
  File "/root/share/gnnproject/microGNN/models/criterion.py", line 12, in criterion
    loss = F.nll_loss(logits, labels.squeeze(1))
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 2671, in nll_loss
    return torch._C._nn.nll_loss_nd(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Float'
```
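This error is PyTorch's complaint that `nll_loss` received floating-point targets; it can be reproduced in isolation with made-up tensor sizes (the exact message differs between CPU and CUDA):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 3).log_softmax(dim=-1)  # [N, C] log-probabilities
labels = torch.zeros(4)                         # float labels, as in data.y here

try:
    F.nll_loss(logits, labels)          # float targets -> RuntimeError
except RuntimeError as err:
    print(err)

print(F.nll_loss(logits, labels.long()))  # integer class indices work
```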
What's the shape of `out` and `y` going into `nll_loss`?
Thank you for your reply!
`out`: `torch.Size([1024, 172])`, `target`: `torch.Size([1024, 1])`
https://github.com/snap-stanford/ogb/pull/427#issuecomment-1501121794
Can you try to make `y` a `LongTensor`?
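In the training snippet above, that would amount to something like the following sketch, assuming `data.y` is a float tensor of shape `[N, 1]`:

```python
# Cast the labels to integer class indices once, up front:
y = data.y.squeeze(1).long().to(rank)
loss = F.nll_loss(out, y[target_node])
```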
Thank you. It works.