Open jermainewang opened 5 years ago
DGL looks really nice. I attended the tutorial at NYU last week. Sorry for breaking your project, but as you mentioned, you were using an internal API :)
We can definitely introduce a validate_index
in the constructor of sparse_coo_matrix
. cc: @weiyangfb
However, this will not go live in binaries until 1.0.1 or 1.1. Do you have a workaround currently for this to unblock DGL working in pytorch 1.0?
Thanks @soumith . I'm glad you like DGL:) . Unfortunately, I cannot think about an alternative but use sparse_coo_matrix
for torch 1.0 (https://github.com/dmlc/dgl/pull/282). It will influence the speed of models that need to create adjacency matrices on the fly (such as those with many small graph samples, or models with dynamic graphs). We will really appreciate it if you could include this feature in a 1.0.1 patch.
Also, here are some concrete numbers about the performance influence using my old K20 GPU.
Under 0.4.1.post2
, one epoch of our GCN example takes 0.0109s
Under 0.1.0
, if switched to sparse_coo_matrix
, one epoch of our GCN example takes 0.0125s.
That's about 15% slower sadly.
There is a workaround using _indices
, _values
, _coalesced_
, and set_
. But it is fairly hacky and unsafe, so I don’t know if I should recommend it...
@SsnL it's going to be in an internal routine for DGL, so it seems like it's fine to recommend it. See: https://github.com/dmlc/dgl/pull/282/files
Okay, here you go @jermainewang
>>> def sp(index, data, size, coalesced=False):
... t = torch.sparse_coo_tensor(size=size, device=index.device, dtype=data.dtype)
... t._indices().set_(index)
... t._values().set_(data)
... t._coalesced_(coalesced)
... return t
...
>>>
>>>
>>>
>>> sp(torch.arange(4).reshape(2, 2), torch.randn(2, 3, 3), [10, 10, 3, 3], True)
tensor(indices=tensor([[0, 1],
[2, 3]]),
values=tensor([[[ 0.2279, -2.4789, 0.8396],
[-0.7172, 1.1474, -0.3711],
[ 0.7415, -0.5742, 1.1273]],
[[ 0.9689, -0.2353, -1.4202],
[-1.1111, 0.1783, -0.2180],
[-0.9308, 0.5249, -0.1092]]]),
size=(10, 10, 3, 3), nnz=2, layout=torch.sparse_coo)
>>> sp(torch.arange(4).reshape(2, 2).cuda(1), torch.randn(2, 3, 3).cuda(1).double(), [10, 10, 3, 3], True)
tensor(indices=tensor([[0, 1],
[2, 3]]),
values=tensor([[[-0.0173, -1.0477, 0.3912],
[-0.7270, 1.6093, -0.7889],
[ 1.2058, 0.8473, -1.3683]],
[[-1.1343, -0.3197, 0.5396],
[ 0.2474, -1.3241, -1.7152],
[ 1.0082, -2.2226, 0.9491]]]),
device='cuda:1', size=(10, 10, 3, 3), nnz=2, dtype=torch.float64,
layout=torch.sparse_coo)
:) I should really finish the sparse docs some time.
@SsnL Is this compatible with autograd? The set_
seems to be an inplace operation.
yes it should be fine with autograd.
@jermainewang Depends on what you mean by "compatible with autograd".. If you mean "whether the results of this can be used in autograd as a leaf collecting gradient or as a gradient of some variable", then yes, it can. If you mean "whether you can backprop the output sparse tensor to the input data
tensor", then no. However, the later case is only enabled by a patch after 0.4.1 so I doubt you mean it.
Thanks for the explanation. Support for backprop through data
tensor is very useful for models like graph attention network. We actually planned to support this by custom ops in the 0.4.1 time. Good to know that this is finally supported in v1.0!
@SsnL I tested your code a little bit and I'm confused with the coalesced
flag. Here is an example:
import torch
import numpy as np
def sp(index, data, size, coalesced=False):
t = torch.sparse_coo_tensor(size=size, device=index.device, dtype=data.dtype)
t._indices().set_(index)
t._values().set_(data)
t._coalesced_(coalesced)
return t
N = 10
# (i, i+1) pos is one, (i, i) pos is one.
row = torch.cat([torch.arange(N-1), torch.arange(N)])
col = torch.cat([torch.arange(N-1) + 1, torch.arange(N)])
index = torch.cat([row.unsqueeze(0), col.unsqueeze(0)])
data = torch.ones((len(row),))
size = (N, N)
spmat = sp(index, data, size, True)
print(spmat.to_dense())
X = torch.arange(N).unsqueeze(1).float()
Y = torch.spmm(spmat, X)
print(Y)
np_spmat = spmat.to_dense().numpy()
np_X = X.numpy()
np_Y = np.dot(np_spmat, np_X)
print(np_Y)
The sparse matrix I created here is as follows:
tensor([[1., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 1., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1., 1., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 1., 1.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
Based on the explanation in the doc, the indices contain no duplicate entries, so they should be coalesced. However, the result is wrong after a spmm
. What I got is things like:
# torch's result
tensor([[45.],
[ 1.],
[ 2.],
[ 3.],
[ 4.],
[ 5.],
[ 6.],
[ 7.],
[ 8.],
[ 9.]])
# numpy's result (also the correct result)
[[ 1.]
[ 3.]
[ 5.]
[ 7.]
[ 9.]
[11.]
[13.]
[15.]
[17.]
[ 9.]]
Hi @jermainewang , I think you code only works with the coalesce()
call:
import torch
import numpy as np
def sp(index, data, size, coalesced=False):
t = torch.sparse_coo_tensor(size=size, device=index.device, dtype=data.dtype)
t._indices().set_(index)
t._values().set_(data)
t._coalesced_(coalesced)
return t
N = 10
# (i, i+1) pos is one, (i, i) pos is one.
row = torch.cat([torch.arange(N-1), torch.arange(N)])
col = torch.cat([torch.arange(N-1) + 1, torch.arange(N)])
index = torch.cat([row.unsqueeze(0), col.unsqueeze(0)])
data = torch.ones((len(row),))
size = (N, N)
spmat = sp(index, data, size).coalesce()
print(spmat.to_dense())
X = torch.arange(N).unsqueeze(1).float()
Y = torch.spmm(spmat, X)
print(Y)
np_spmat = spmat.to_dense().numpy()
np_X = X.numpy()
np_Y = np.dot(np_spmat, np_X)
print(np_Y)
The reason is the sparse tensor you created wasn't really coalesced, the definition of coalesce is:
I see. Perhaps you should put the definition in the document. Another general question is your suggested best practice about coalescing. Shall we always coalesce indices when creating the sparse matrix? Is there any performance difference between uncoalesced spmm
and coalesced one?
@jermainewang That's a good question! When using sparse ops, you don't have to worry about coalescing as the ops will take care of both of coalesced and uncoalesced cases. For performance, in general coalesce()
is not cheap! So calling a sparse op on a coalesced sparse tensor is almost always faster than a uncoalesced one if the sparse op requires the coalesce invariance.
And yes, let me add the coalesce definition to the sparse page.
Thanks for the explanation. Now I understand. I will leave this issue open if you guys want to refer to when implementing the features.
I got the same feature request when creating dynamic sparse matrices on-the-fly.
In my case, I need to instantiate a sparse matrix of shape 25,000 25,000, with nnz = 25,000,000. The matrix is only used once in a SPMM operator. Here is my benchmark on two operators. 1) Create the COO tensor: 1.09s 2) SPMM (25,000 25,000 & 25,000 * 64): 0.03s
I think the reason is that checking the index is very slow and it is done on CPU. Given that my indices and values are already valid and placed on GPU, it is redundant to do this check.
The easiest workaround is to have a direct API to at::_sparse_coo_tensor_unsafe
in Python. Can you guys add it in Python interface later?
@SsnL It looks like your workaround doesn't work in 1.4.0. The sparse tensor is not modified.
@SsnL It looks like your workaround doesn't work in 1.4.0. The sparse tensor is not modified.
There is already a direct API to at::_sparse_coo_tensor_unsafe
in Python through torch._sparse_coo_tensor_unsafe
.
I have rewritten the sample code before using this function, as you can see we are getting the right results. So if this is good enough I think this issue should be closed. Otherwise, I can add the validate_index
parameter to the sparse_coo_tensor
API as it is suggested initially.
Looking forward your comments @KiddoZhu, @ngimel, @rgommers
# 14945
import torch
import numpy as np
def sp(index, data, size, coalesced=False):
t = torch._sparse_coo_tensor_unsafe(index, data, size=size, device=index.device, dtype=data.dtype)
t._coalesced_(coalesced)
return t
N = 10
# (i, i+1) pos is one, (i, i) pos is one.
row = torch.cat([torch.arange(N-1), torch.arange(N)])
col = torch.cat([torch.arange(N-1) + 1, torch.arange(N)])
index = torch.cat([row.unsqueeze(0), col.unsqueeze(0)])
data = torch.ones((len(row),))
size = (N, N)
spmat = sp(index, data, size, True)
print(spmat.to_dense())
X = torch.arange(N).unsqueeze(1).float()
Y = torch.spmm(spmat, X)
print(Y)
np_spmat = spmat.to_dense().numpy()
np_X = X.numpy()
np_Y = np.dot(np_spmat, np_X)
print(np_Y)
tensor([[1., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 1., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1., 1., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 1., 1.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
tensor([[ 1.],
[ 3.],
[ 5.],
[ 7.],
[ 9.],
[11.],
[13.],
[15.],
[17.],
[ 9.]])
[[ 1.]
[ 3.]
[ 5.]
[ 7.]
[ 9.]
[11.]
[13.]
[15.]
[17.]
[ 9.]]
We should either rename _sparse_coo_tensor_unsafe
to not have leading underscore, and document it, or add validate
argument. Either way, it should be a documented user-facing function.
@aocsa Exactly. The problem is that the API only exists in old PyTorch versions. I am using 1.4.0 and there is no way to access this function.
Now I solve this problem by a custom JIT module exposing at::_sparse_coo_tensor_unsafe
to Python.
I think we would accept a PR implementing a "validate" argument.
Great. I can do that.
🚀 Feature
Currently, creating a sparse coo matrix requires checking all the provided indices are within bound. However, this is costly and sometimes not needed (e.g. when the sparse matrix is from a graph adj). Request to provide a way to create sparse coo matrix w/o index check.
Motivation
The recent v1.0 update breaks an API in the DGL project. We are a library for deep learning on graphs. We built it atop of framework like Pytorch. Sparse matrix is critical in implementing efficient message passing algorithm. We found that creating a sparse matrix in pytorch is slow due to the unnecessary index check (our graph construction guarantees that all indices will be within bound). As a result, we use an internal API
_sparse_coo_tensor_unsafe
in the v0.4 time, which is removed in v1.0. It will be great if pytorch could have such support.Pitch
Add a flag in
sparse_coo_matrix
interface calledvalidate_index
. It default to beTrue
, but when setFalse
, the index check will be skipped.Alternatives
No alternatives. The internal API has been removed.
cc @aocsa @nikitaved @pearu @mruberry @VitalyFedyunin @ngimel