yehyunsuh / TEM-Image-Segmentation


[Bug] Dataloader Error #5

Closed yehyunsuh closed 1 year ago

yehyunsuh commented 1 year ago

What

'''
Traceback (most recent call last):
  File "/home/yehyun/TEM-Image-Segmentation/main.py", line 54, in <module>
    main(args)
  File "/home/yehyun/TEM-Image-Segmentation/main.py", line 26, in main
    train(args, DEVICE, model, loss_fn, optimizer, train_loader, val_loader)
  File "/home/yehyun/TEM-Image-Segmentation/train.py", line 134, in train
    loss = train_function(
  File "/home/yehyun/TEM-Image-Segmentation/train.py", line 13, in train_function
    for image, label in tqdm(loader):
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/yehyun/TEM-Image-Segmentation/dataset.py", line 58, in __getitem__
    x_data = self.transform(x_data)
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torchvision/transforms/transforms.py", line 95, in __call__
    img = t(img)
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torchvision/transforms/transforms.py", line 135, in __call__
    return F.to_tensor(pic)
  File "/home/yehyun/anaconda3/envs/TEM/lib/python3.9/site-packages/torchvision/transforms/functional.py", line 149, in to_tensor
    img = torch.from_numpy(pic.transpose((2, 0, 1))).contiguous()
ValueError: At least one stride in the given numpy array is negative, and tensors with negative strides are not currently supported. (You can probably work around this by making a copy of your array with array.copy().)
'''

This error occurs when getting the data from the train_loader.

Why

The problem is in transforms.ToTensor(): it calls torch.from_numpy(), which does not support NumPy arrays with a negative stride.
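The failure can be reproduced outside the DataLoader. This is a minimal sketch, assuming the negative stride comes from a reversed view; the issue does not say how x_data is produced, so `np.flip` and the array shape here are only illustrative:

```python
import numpy as np
import torch

# Hypothetical stand-in for x_data: a small float32 image flipped vertically.
# np.flip returns a view of the same memory, so the first stride is negative.
image = np.flip(np.zeros((4, 8), dtype=np.float32), axis=0)

try:
    torch.from_numpy(image)
    error_message = None
except ValueError as exc:
    # torch.from_numpy rejects arrays with negative strides.
    error_message = str(exc)

# A copy has ordinary positive strides and converts without error.
tensor = torch.from_numpy(image.copy())
```
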

TODO

yehyunsuh commented 1 year ago

Solution

The problem was that x_data had a negative stride for every sample in the training dataset:

x_data stride:  (-8192, 4)
y_data stride:  (16384, 8)
x_data stride:  (-8192, 4)
y_data stride:  (16384, 8)
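These values are consistent with x_data being a row-reversed view of a float32 array with 2048 columns, and y_data being an ordinary array with an 8-byte dtype. That is an inference from the numbers, not something the issue confirms. A sketch with hypothetical shapes and dtypes:

```python
import numpy as np

# Hypothetical shapes: 4 rows, 2048 columns.
base = np.zeros((4, 2048), dtype=np.float32)
base_strides = base.strides          # 2048 columns * 4 bytes per row

# Reversing the rows creates a view that walks memory backwards,
# so the row stride changes sign.
flipped = base[::-1]
flipped_strides = flipped.strides

# An 8-byte dtype (int64 here, as an assumption) doubles both strides.
label = np.zeros((4, 2048), dtype=np.int64)
label_strides = label.strides

print(base_strides, flipped_strides, label_strides)
```
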

Fixed the error by changing the code from

        if self.transform:
            x_data = self.transform(x_data)
            y_data = self.transform(y_data)

to

        if self.transform:
            x_data = self.transform(x_data.copy())
            y_data = self.transform(y_data.copy())

and we get

x_data.copy() stride:  (8192, 4)
x_data.copy() stride:  (8192, 4)
y_data.copy() stride:  (16384, 8)
y_data.copy() stride:  (16384, 8)

How it fixed the error

ndarray.copy() allocates new memory and writes the array into it in the default C-contiguous layout, so the strides of the copy are no longer negative.
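The difference between the view and the copy can be checked directly. A small sketch with a made-up array:

```python
import numpy as np

base = np.arange(6, dtype=np.float32).reshape(2, 3)
view = base[::-1]        # reversed view: same memory, negative row stride
copied = view.copy()     # new buffer, C-contiguous layout

view_shares = np.shares_memory(view, base)      # the view reuses base's buffer
copy_shares = np.shares_memory(copied, base)    # the copy owns its own buffer
same_values = np.array_equal(view, copied)      # element values are identical

print(view.strides, copied.strides)
```
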

Reference: https://discuss.pytorch.org/t/torch-from-numpy-not-support-negative-strides/3663/3