The parsing network does not give desired output

liaohaofu commented 7 years ago

Hi @Yijunmaverick ,

I tried to run your parsing demo but could not get it to work correctly. I reimplemented the demo in Python by following the MATLAB code step by step. However, the output (as shown below) is not a good segmentation result. Could you please help me find what is wrong with the following code? Or is it possible that the .caffemodel file itself is not the correct one?

Thanks, Haofu

import caffe
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

model_def = "Model_parsing.prototxt"
model_weights = "Model_parsing.caffemodel"
net = caffe.Net(model_def, model_weights, caffe.TEST)
caffe.set_mode_cpu()

image_file = "../../repos/gfc/matlab/FaceCompletion_testing/TestImages/182701.png"
image = np.array(Image.open(image_file))

# preprocess the image: scale to [-1, 1], move channels first, add a batch axis
input_ = image / 255.0
input_ = input_ * 2 - 1
input_ = input_.transpose(2, 0, 1)
input_ = input_[np.newaxis, ...]

net.blobs['data'].reshape(*input_.shape)
net.blobs['data'].data[...] = input_
output = net.forward()
scores = output['conv_decode0'][0]
segmentation = scores.argmax(0)
segmentation_rgb = np.zeros(image.shape, dtype=np.uint8)

colors = [
    [0, 0, 255],
    [255, 255, 0],
    [160, 32, 240],
    [218, 112, 214],
    [210, 105, 30],
    [94, 38, 18],
    [0, 255, 0],
    [156, 102, 31],
    [0, 0, 0],
    [255, 127, 80],
    [255, 0, 0]
]

for i in range(11):
    segmentation_rgb[np.where(segmentation == i)] = colors[i]

plt.figure(1)
plt.subplot(121)
plt.imshow(image)
plt.subplot(122)
plt.imshow(segmentation_rgb)

Yijunmaverick commented 7 years ago

Hi @liaohaofu, thanks for your interest in our work.

I just ran my model and the result (below) looks good. I am not sure whether the difference comes from the processing steps on the input and output.

My guess is that the problem is on this line: "segmentation = scores.argmax(0)". Do you take the max along the channel dimension (channel = 11)?

[Image: f1_parsing]

liaohaofu commented 7 years ago

Yes, the max is taken along the channel dimension. In pycaffe, the channel dimension comes before the spatial dimensions, so "input_" has a shape of 1x3x128x128 (1 is the batch size) and "scores" has a shape of 11x128x128. Could you run my code on your machine to see if it works correctly? If your Caffe is compiled with Anaconda, you only need to change the file paths. Thanks a lot!
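
For readers following along, the shape bookkeeping described here can be checked without Caffe. This is only a sketch: the random "scores" array is a hypothetical stand-in for the network output, and the palette lookup is an alternative to the per-class loop in the original post, not the author's method.

```python
import numpy as np

# Hypothetical stand-in for net output: 11 class scores per pixel,
# laid out channel-first (11 x 128 x 128) as described in the comment.
rng = np.random.default_rng(0)
scores = rng.standard_normal((11, 128, 128))

# argmax over axis 0 picks the best-scoring class for every pixel,
# giving a 128 x 128 map of integer labels in the range 0..10.
segmentation = scores.argmax(0)

# Same palette as in the original post, stored as an 11 x 3 lookup table.
colors = np.array([
    [0, 0, 255], [255, 255, 0], [160, 32, 240], [218, 112, 214],
    [210, 105, 30], [94, 38, 18], [0, 255, 0], [156, 102, 31],
    [0, 0, 0], [255, 127, 80], [255, 0, 0],
], dtype=np.uint8)

# Indexing the palette with the label map colors all pixels at once.
segmentation_rgb = colors[segmentation]

print(segmentation.shape, segmentation_rgb.shape)
```

If the label map has the expected 128x128 shape and only contains values 0 to 10, the argmax axis is correct and any remaining problem is more likely in the input preprocessing.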

liaohaofu commented 7 years ago

The issue turned out to be in the preprocessing. It looks like, for pycaffe, the dimensions of the input data need to be completely reversed. I had to change

input_ = input_.transpose(2, 0, 1)

to

input_ = input_.transpose(2, 1, 0)

to make the code work. Thanks again for your help! @Yijunmaverick
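
A small NumPy-only sketch of what this change does. The thread does not confirm the cause; a plausible assumption is that the MATLAB demo feeds images in MATLAB's width-first layout, so the model expects the spatial axes swapped. The tiny non-square array below is hypothetical and is used only to make the axis order visible.

```python
import numpy as np

# Hypothetical non-square image: height 4, width 6, 3 channels.
image = np.arange(4 * 6 * 3).reshape(4, 6, 3)

chw = image.transpose(2, 0, 1)  # channels, height, width
cwh = image.transpose(2, 1, 0)  # channels, width, height (full reversal)

print(chw.shape)  # (3, 4, 6)
print(cwh.shape)  # (3, 6, 4)

# The two layouts differ only by a swap of the two spatial axes.
assert np.array_equal(cwh, chw.transpose(0, 2, 1))

# Assumption: a label map computed from the width-first input is also
# width-first, so it is transposed back before being shown next to the image.
labels_wh = cwh.argmax(0)  # stand-in for scores.argmax(0), shape (6, 4)
labels_hw = labels_wh.T    # shape (4, 6), aligned with image.shape[:2]
```

With a square 128x128 input both transposes produce the same shape, which is why the mismatch does not raise an error and only shows up as a poor segmentation.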

Yijunmaverick / GenerativeFaceCompletion
