liutinglt / CE2P

Edge generation #5

Open LayneH opened 5 years ago

LayneH commented 5 years ago

Hi, thanks for open-sourcing this work.

I found that the edge ground truth you provide is always 473x473, which raises an exception when the batch size is greater than 1. Resizing the edge GT before it is further processed solves this problem.

However, doing this results in inaccurate edge GT (since the edge GT is interpolated). I was wondering if you could give some details about edge generation, or kindly share your code for generating the edge GT. A minimal sketch of the resize workaround is below.
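For reference, this is roughly what the workaround looks like (a sketch, not the repo's code; the array shapes are hypothetical). Nearest-neighbor interpolation at least keeps the resized edge map strictly binary, though the edges can still be misaligned with the original labels:

import cv2
import numpy as np

# Hypothetical example: a binary 473x473 edge GT resized to the sample's
# image size. Note cv2.resize takes (width, height) order.
edge_gt = np.zeros((473, 473), dtype=np.uint8)
edge_resized = cv2.resize(edge_gt, (500, 400), interpolation=cv2.INTER_NEAREST)
print(edge_resized.shape)  # (400, 500)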

qinhaifangpku commented 5 years ago

I guess the images should be resized to (473, 473) too?

I used the code below to generate the edges.

Python code:

import os
import sys

import cv2
import numpy as np

img_path = sys.argv[2]      # parsing annotation maps
out_path = sys.argv[3]      # result used for training and prediction
out_vis_path = sys.argv[4]  # just for visualization

with open(sys.argv[1], 'r') as fin:  # val_id.txt
    lines = fin.readlines()

for line in lines:
    ids = line.strip('\n') + '.png'
    img = cv2.imread(os.path.join(img_path, ids))

    # Find x and y gradients
    sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0)
    sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1)

    # Find gradient magnitude; any non-zero magnitude marks a label boundary
    magnitude = np.sqrt(sobelx ** 2.0 + sobely ** 2.0)
    magnitude[np.where(magnitude > 0)] = 1
    temp = magnitude.astype(np.uint8)

    cv2.imwrite(os.path.join(out_path, line.strip('\n') + '.png'), temp)
    cv2.imwrite(os.path.join(out_vis_path, line.strip('\n') + '.png'), temp * 255.0)
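Assuming the script is saved as generate_edge.py (the filename and directory names here are hypothetical), it would be run along these lines:

python generate_edge.py val_id.txt ./parsing_annotations ./edge_labels ./edge_vis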
LayneH commented 5 years ago

@qinhaifangpku Thanks for sharing your code! And I think you are right that images and labels should be resized to (473, 473) before random scaling.

qinhaifangpku commented 5 years ago

Yeah~ I found a weird situation: when you train on GPU devices 0,1,2,3,4, GPU 7 shows volatile GPU-util but no memory usage. Also, if the batch size is set to 40, does that mean each GPU gets a batch of 8 when training on 5 GPUs?

LayneH commented 5 years ago

@qinhaifangpku It does happen... And you are right, PyTorch places the same number of samples on each device.
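As a minimal sketch of that splitting behavior (the toy model is illustrative, not CE2P):

import torch
import torch.nn as nn

# DataParallel replicates the module on the given devices and splits the
# batch dimension evenly across them.
model = nn.DataParallel(nn.Linear(10, 2).cuda(), device_ids=[0, 1, 2, 3, 4])

x = torch.randn(40, 10).cuda()  # batch of 40 -> 8 samples per GPU replica
y = model(x)                    # outputs are gathered back onto device 0
print(y.shape)                  # torch.Size([40, 2])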

qinhaifangpku commented 5 years ago

So that means if you have 8 devices, you can use at most 7 for replicas, because one is left occupied for unknown reasons...

Okay, how much time does it take to train this model with the default config?

LayneH commented 5 years ago

@qinhaifangpku No, you can definitely use all 8 devices. I don't think that weird situation makes a huge difference. For me, it takes about 2 days with 5 1080Ti GPUs.

qinhaifangpku commented 5 years ago

Okay, thanks! I ran into this error when using tensorboardX to add an image. Have you encountered it?

warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.") /usr/local/lib/python3.5/dist-packages/torch/nn/functional.py:1961: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details. "See the documentation of nn.Upsample for details.".format(mode)) /usr/local/lib/python3.5/dist-packages/torch/nn/parallel/_functions.py:58: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector. warnings.warn('Was asked to gather along dimension 0, but all ' Traceback (most recent call last): File "/usr/local/lib/python3.5/dist-packages/PIL/Image.py", line 2460, in fromarray mode, rawmode = _fromarray_typemap[typekey] KeyError: ((1, 1, 473), '|u1')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 222, in <module>
    main()
  File "train.py", line 202, in main
    writer.add_image('Labels/'+str(index), lab, i_iter)
  File "/usr/local/lib/python3.5/dist-packages/tensorboardX/writer.py", line 412, in add_image
    self.file_writer.add_summary(image(tag, img_tensor), global_step, walltime)
  File "/usr/local/lib/python3.5/dist-packages/tensorboardX/summary.py", line 205, in image
    image = make_image(tensor, rescale=rescale)
  File "/usr/local/lib/python3.5/dist-packages/tensorboardX/summary.py", line 243, in make_image
    image = Image.fromarray(tensor)
  File "/usr/local/lib/python3.5/dist-packages/PIL/Image.py", line 2463, in fromarray
    raise TypeError("Cannot handle this data type")
TypeError: Cannot handle this data type

LayneH commented 5 years ago

@qinhaifangpku Yes, you need to transpose the images before writer.add_image is called. For example:

writer.add_image('Images/'+str(index), img.transpose(2, 0, 1), i_iter)
# the same for the others 
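For the single-channel label map in the traceback above, a hedged guess at the same idea (not code from the repo; the squeeze/stack handling is an assumption, and tensorboardX version differences may apply): replicate the (H, W) map to three channels in CHW order before logging.

import numpy as np

# lab: label map from train.py; squeeze any singleton dims first so it is (H, W)
lab_chw = np.stack([np.squeeze(lab)] * 3, axis=0)  # (3, H, W) for display
writer.add_image('Labels/' + str(index), lab_chw, i_iter)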
qinhaifangpku commented 5 years ago

Okay, I tried it as you suggested. It works. Thanks!

rxqy commented 5 years ago

Hi @LayneH, @qinhaifangpku, I'm encountering a similar problem here. It seems the data augmentation might have some problems? PyTorch can't collate the data into a batch because the edges don't all have the same shape.

The edge map is always 473*473, but the images and gts are not. So after scaling or padding (https://github.com/liutinglt/CE2P/blob/0a6628176bc0bc631e9740fe068b9368112ae4fe/dataset/datasets.py#L99-L101), img_pad, label_pad and edge_pad will not have the same shape. And I don't think they will be aligned.

Resizing the images and gts to (473, 473) can solve the problem, as sketched below. My question is: does it affect performance?
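A minimal sketch of that workaround (not the repo's actual code; the sample arrays are hypothetical stand-ins for what datasets.py reads with cv2.imread):

import cv2
import numpy as np

# Hypothetical sample image and label of mismatched size
image = np.zeros((400, 500, 3), dtype=np.uint8)
label = np.zeros((400, 500), dtype=np.uint8)

# Resize both to the fixed edge-map size before the random scale/crop, so
# img_pad, label_pad and edge_pad end up with matching spatial shapes.
image = cv2.resize(image, (473, 473), interpolation=cv2.INTER_LINEAR)
label = cv2.resize(label, (473, 473), interpolation=cv2.INTER_NEAREST)  # keep label ids exact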

BTW, I'm using pytorch 0.3.1

LayneH commented 5 years ago

@rxqy I re-generated the edges using the code above and haven't tried resizing the imgs/gts. But I guess it's not a big deal. :)