latkins opened this issue 6 years ago (status: Open)
Looking into this
I'm taking the following approach to make sparse tensors work with the DataLoader. (I'm using this comment as a task tracker; tasks that come earlier in the list may depend on later tasks.)
- Batching sparse tensors can be done by implementing a sparse `torch.stack`, which can be built from:
  - `stack` for sparse tensors
  - `unsqueeze` for sparse tensors
  - `cat` for sparse tensors
- Because `unsqueeze` is an aten native function for dense tensors, I'd like to write an aten native function version of it for sparse tensors. That can be done by adding better support for sparse tensors to aten.
- Implement `_indices` and `_values` for sparse tensors in aten by exposing THS's functions.
- Haven't decided where to implement `cat` / `stack`; will figure that out at some point. They're currently not aten native functions for dense tensors, so I might put them in THS.
@latkins Out of curiosity, is there something specific you're using sparse matrices + dataloader for?
I've been playing around with inductive graph convolutional networks, see for example https://arxiv.org/abs/1706.02216.
Is there any workaround that allows working with torch.sparse and batching, with or without the DataLoader? Thanks, Ortal
I've been returning the indices and data as normal tensors, then batching them in a way that makes sense for my data and constructing the sparse tensor in the collate function.
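A rough sketch of that approach (names hypothetical), assuming each dataset item is an `(indices, values, shape)` triple of ordinary dense tensors:

```python
import torch

def sparse_collate(batch):
    # batch: list of (indices, values, shape) triples of dense tensors.
    # The Dataset returns only plain tensors; the sparse tensors are built
    # here, in the collate function. Note that with num_workers > 0 the
    # collate function runs in the worker, so the resulting sparse tensors
    # still cross a process boundary.
    return [torch.sparse_coo_tensor(idx, val, tuple(shape))
            for idx, val, shape in batch]
```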
Thank you for your reply. Do you have a sample code that I can look at? Thanks, Ortal
@latkins I did exactly that, wrote a custom `collate_fn` for the DataLoader, but I get the same error.
```python
def sparse_graph_batch_collate(batch):
    sparse_batch = [(torch.sparse.LongTensor(*adj),
                     torch.sparse.LongTensor(*graph)) for adj, graph in batch]
    return sparse_batch
```
Or maybe you are constructing the SparseTensors outside the `collate_fn` too (when iterating through the batches)?
Yeah, I have been returning the (dense) indices / values corresponding to the matrix from the collate function, and then using these to construct the sparse matrix prior to feeding it into forward.
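A sketch of that pattern (hypothetical names): the collate function only passes the dense pieces through, and the sparse tensors are assembled in the main process just before `forward`:

```python
import torch

def dense_parts_collate(batch):
    # Keep everything dense inside the DataLoader workers: just pass the
    # (indices, values, shape) pieces through as ordinary tensors.
    indices = [idx for idx, _, _ in batch]
    values = [val for _, val, _ in batch]
    shapes = [shape for _, _, shape in batch]
    return indices, values, shapes

def to_sparse_batch(indices, values, shapes):
    # Runs in the main process, after the worker hand-off, so no sparse
    # tensor ever has to be serialized by a worker.
    return [torch.sparse_coo_tensor(i, v, tuple(s))
            for i, v, s in zip(indices, values, shapes)]
```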
Confirmed this is still an issue (although the error message is now `RuntimeError: sparse tensors do not have storage`).
Could I ask when this will be solved? Or is there any alternative method? I also want to load adjacency matrices.
2024 and it still happens
I have noticed that using sparse matrices with a DataLoader works only if `num_workers=0`.
Example:
Gives the result:
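The original example and its output did not survive here; a minimal sketch of such a setup (hypothetical dataset class; whether `num_workers > 0` actually fails depends on the PyTorch version):

```python
import torch
from torch.utils.data import DataLoader, Dataset

class SparseDataset(Dataset):
    # Each item is a small sparse COO tensor.
    def __len__(self):
        return 4

    def __getitem__(self, i):
        return torch.sparse_coo_tensor([[0], [i % 2]], [1.0], (2, 2))

# num_workers=0 iterates in the main process and works; with num_workers > 0
# the workers must serialize sparse tensors, which has historically raised
# "RuntimeError: sparse tensors do not have storage".
loader = DataLoader(SparseDataset(), batch_size=2, num_workers=0,
                    collate_fn=list)
```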
cc @vincentqb