pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Sparse matrices in dataloader error #3898

Open latkins opened 6 years ago

latkins commented 6 years ago

I have noticed that using sparse matrices with a DataLoader works only if num_workers = 0.

Example:

import torch
from torch.utils.data import DataLoader

print('torch version: ', torch.__version__)

i = torch.LongTensor([[0, 1, 1], [2, 0, 2]])
v = torch.FloatTensor([3, 4, 5])
sparse_tensor = torch.sparse.FloatTensor(i, v, torch.Size([2, 3]))

dataset_sp = 2 * [sparse_tensor]

def collate_fn(batch):
    return batch[0]

loader = DataLoader(dataset_sp, batch_size=1, collate_fn=collate_fn)

print('num_workers=0: ')
for i, b in enumerate(loader):
    print(i, b.size())

loader = DataLoader(dataset_sp, batch_size=1, collate_fn=collate_fn, num_workers=1)

print('num_workers=1: ')
for i, b in enumerate(loader):
    print(i, b.size())

Gives the result:

torch version:  0.2.0+5989b05
num_workers=0:
0 torch.Size([2, 3])
1 torch.Size([2, 3])
num_workers=1:
Process Process-1:
Traceback (most recent call last):
  File "/Users/liam/anaconda/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/Users/liam/anaconda/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/liam/anaconda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 45, in _worker_loop
    data_queue.put((idx, samples))
  File "/Users/liam/anaconda/lib/python3.6/multiprocessing/queues.py", line 348, in put
    obj = _ForkingPickler.dumps(obj)
  File "/Users/liam/anaconda/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/Users/liam/anaconda/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 46, in reduce_tensor
    metadata = (tensor.storage_offset(), tensor.size(), tensor.stride())
AttributeError: 'torch.sparse.FloatTensor' object has no attribute 'storage_offset'

cc @vincentqb

zou3519 commented 6 years ago

Looking into this

zou3519 commented 6 years ago

I'm taking the following approach to make sparse tensors work with the DataLoader. (I'm using this comment as a task tracker; tasks that come earlier in the list may depend on later tasks.)

- Batching sparse tensors can be done by implementing a sparse torch.stack, which can be built from the pieces below (a rough sketch of the idea follows this list).

- Because unsqueeze is an ATen native function for dense tensors, I'd like to write an ATen native version of it for sparse tensors. That can be done by adding better support for sparse tensors to ATen.

- Haven't decided where to implement cat / stack; I'll figure that out at some point. They're currently not ATen native functions for dense tensors, so I might put them in THS.
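
In the meantime, here is a rough Python-level sketch of the batching idea, assuming a recent PyTorch with torch.sparse_coo_tensor; sparse_stack is a hypothetical helper written for illustration, not the eventual ATen implementation:

import torch

def sparse_stack(tensors):
    # Illustration only: stack same-shaped sparse COO tensors along a new
    # leading dimension by prepending a batch index row to each tensor's
    # indices and concatenating the values.
    indices, values = [], []
    for b, t in enumerate(tensors):
        t = t.coalesce()
        idx = t.indices()                       # shape: (ndim, nnz)
        batch_row = torch.full_like(idx[:1], b)
        indices.append(torch.cat([batch_row, idx], dim=0))
        values.append(t.values())
    size = (len(tensors),) + tuple(tensors[0].size())
    return torch.sparse_coo_tensor(torch.cat(indices, dim=1),
                                   torch.cat(values), size)

# e.g. sparse_stack([sparse_tensor, sparse_tensor]).size() -> torch.Size([2, 2, 3])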

zou3519 commented 6 years ago

@latkins Out of curiosity, is there something specific you're using sparse matrices + dataloader for?

latkins commented 6 years ago

I've been playing around with inductive graph convolutional networks; see, for example, https://arxiv.org/abs/1706.02216.

ortasa commented 6 years ago

Is there any workaround that allows working with torch.sparse and batching, with or without the DataLoader? Thanks, Ortal

latkins commented 6 years ago

I've been returning the indices and data as normal tensors, then batching them in a way that makes sense for my data and constructing the sparse tensor in the collate function.

ortasa commented 6 years ago

Thank you for your reply. Do you have sample code that I can look at? Thanks, Ortal

floringogianu commented 6 years ago

@latkins I did exactly that and wrote a custom collate_fn for the DataLoader, but I get the same error.

def sparse_graph_batch_collate(batch):
    sparse_batch = [(torch.sparse.LongTensor(*adj),
                     torch.sparse.LongTensor(*graph)) for adj, graph in batch]
    return sparse_batch

Or maybe you are also constructing the SparseTensors outside the collate_fn (when iterating through the batches)?

latkins commented 6 years ago

Yeah, I have been returning the (dense) indices / values corresponding to the matrix from the collate function, and then using these to construct the sparse matrix prior to feeding it into forward.
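
For example, something along these lines (a minimal sketch reusing the toy tensor from the original report and assuming a recent PyTorch with torch.sparse_coo_tensor; the commented-out model call is a placeholder):

import torch
from torch.utils.data import DataLoader

# Each sample stays dense (indices, values), so workers only ever pickle dense tensors.
i = torch.LongTensor([[0, 1, 1], [2, 0, 2]])
v = torch.FloatTensor([3, 4, 5])
dataset = 4 * [(i, v)]

def collate_fn(batch):
    # Return dense tensors only; sparse construction is deferred.
    indices, values = batch[0]
    return indices, values

loader = DataLoader(dataset, batch_size=1, collate_fn=collate_fn, num_workers=1)

for indices, values in loader:
    # Build the sparse tensor in the main process, right before the forward pass.
    adj = torch.sparse_coo_tensor(indices, values, torch.Size([2, 3]))
    print(adj.size())
    # out = model(adj)  # placeholder for the actual forward pass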

gchanan commented 4 years ago

confirmed this is still an issue (although the error message is now RuntimeError: sparse tensors do not have storage)

Boltzmachine commented 3 years ago

Could I ask when this will be solved? Or is there any alternative method? I also want to load adjacency matrices.

AndrewLaganaro commented 1 month ago

2024 and it still happens