Open jundanl opened 6 years ago
same question here
I'm really looking forward to visualizing the prediction R and S……
I have the same question, have you solved it?
:heavy_plus_sign:
@lixx2938 In your previous work, Learning Intrinsic Image Decomposition from Watching the World, the model predicts two images - 3-channel log-reflectance and 1-channel log-lighting - so I could do torch.exp(log_r + log_s).
But in this repository the model predicts two 1-channel images. How can I use them to reconstruct the original image?
I think I've solved this now. Please read the code that computes the reconstruction loss. Here is my explanation.
Thanks to @tlsshh, I've also solved the visualization problem. Here is my implementation, run on a different image set (neither IIW nor SAW) and based on test_iiw.py:
import cv2
import numpy as np
import torch

# TrainOptions and create_model are imported as in test_iiw.py.
# srgb_to_rgb, rgb_to_chromaticity and save are helper functions not shown in this post.
opt = TrainOptions().parse()  # set CUDA_VISIBLE_DEVICES before import torch
#root = "/home/zl548/phoenix24/"
#full_root = root +'/phoenix/S6/zl548/'
model = create_model(opt)
minc_loader = get_minc_loader(batch_size=16)

for i, (x, y) in enumerate(minc_loader):
    x = x.float().cuda()
    x = x.view(-1, 3, 256, 256)  # -1 so a smaller final batch also works
    output_R, output_S = model.test_minc(x)

    for j in range(output_R.size(0)):
        # exponentiate the 1-channel predictions
        prediction_R = output_R.data[j, :, :, :]
        prediction_R = torch.exp(prediction_R)  #.repeat(1,3,1,1)
        prediction_S = output_S.data[j, :, :, :]
        prediction_S = torch.exp(prediction_S)  #.repeat(1,3,1,1)

        # calc chromaticity
        srgb_img = x.data[j, :, :, :]
        rgb_img = srgb_to_rgb(np.transpose(srgb_img.cpu().numpy(), (1, 2, 0)))
        rgb_img[rgb_img < 1e-4] = 1e-4
        chromaticity = rgb_to_chromaticity(rgb_img)  # opt 1
        # opt 2 chromaticity = rgb_to_chromaticity(srgb_img.cpu().numpy())
        chromaticity = torch.from_numpy(np.transpose(chromaticity, (2, 0, 1))).contiguous().float()

        # colour albedo = chromaticity * 1-channel reflectance
        p_R = torch.mul(chromaticity, prediction_R.cpu())
        p_R_np = p_R.cpu().numpy()
        p_R_np = np.transpose(p_R_np, (1, 2, 0))  # CHW -> HWC
        p_R_np = cv2.cvtColor(p_R_np, cv2.COLOR_BGR2RGB)  # input was loaded by cv2 as BGR

        # shading stays 1-channel: CHW -> HW
        p_S_np = np.transpose(prediction_S.cpu().numpy(), (1, 2, 0))
        p_S_np = np.squeeze(p_S_np, axis=2)

        save('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_albedo.png', p_R_np)
        save('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_shading.png', p_S_np)
        cv2.imwrite('D:/minc_decomposition/yuv/' + str(i) + '_' + str(j) + '_original.png',
                    np.transpose(srgb_img.cpu().numpy(), (1, 2, 0)))

print('save data done')
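The key step in the script above is multiplying the 1-channel reflectance prediction by the input's chromaticity to get a colour albedo. Here is a minimal NumPy sketch of why that can recover the image. It assumes the reflectance output is the log of the per-pixel channel sum and that chromaticity means each channel divided by that sum. `chromaticity_of` and `recompose` are hypothetical names for illustration, not functions from the repository.

```python
import numpy as np

def chromaticity_of(rgb, eps=1e-6):
    """Per-pixel channel proportions: each channel divided by the channel sum.
    Assumed definition for this sketch; rgb is an (H, W, 3) linear image."""
    return rgb / (rgb.sum(axis=-1, keepdims=True) + eps)

def recompose(log_r, log_s, chroma):
    """Rebuild an (H, W, 3) linear image from (H, W, 1) log-reflectance
    intensity, (H, W, 1) log-shading and (H, W, 3) chromaticity."""
    albedo = chroma * np.exp(log_r)   # colour albedo, like p_R in the script
    shading = np.exp(log_s)           # grayscale shading, like prediction_S
    return albedo * shading, albedo, shading

# Synthetic check: build an image from a known albedo and shading.
rng = np.random.default_rng(0)
true_albedo = rng.uniform(0.1, 1.0, size=(4, 4, 3))
true_shading = rng.uniform(0.2, 2.0, size=(4, 4, 1))
image = true_albedo * true_shading

chroma = chromaticity_of(image)  # grayscale shading cancels in the ratio
log_r = np.log(true_albedo.sum(axis=-1, keepdims=True))
log_s = np.log(true_shading)

rebuilt, albedo, shading = recompose(log_r, log_s, chroma)
```

Because the shading is grayscale, it cancels when each channel is divided by the channel sum, so the chromaticity of the input image equals the chromaticity of the albedo. That is why the colour can be taken from the input rather than predicted.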
For better understanding, here is the additional code for the test_minc function and the data loader.
Here's the test_minc function, which I added to intrinsic_model.py:
def test_minc(self, input_):
    prediction_R, prediction_S = self.netG.forward(input_)
    return prediction_R, prediction_S
Here's the dataset class and loader function, which I added to data_loader:
import cv2
import numpy as np
import torch
from torch.utils.data import Dataset


class TestDataSet(Dataset):
    """
    Custom Test DataSet class
    """
    def __init__(self, transform=None):
        x_test = []
        y_test = []
        test_file_name = 'D:/labels/test2.txt'  # image list
        with open(test_file_name, 'r') as test_file:
            test_list = test_file.readlines()

        print('start loading test data')
        for i in range(len(test_list)):
            img = cv2.imread('D:/MINC/minc-2500/minc-2500/' + test_list[i].rstrip('\n'))  # load image
            img = cv2.resize(img, (256, 256))  # cv2 loads images in BGR format
            img = np.transpose(img, (2, 0, 1))  # HWC -> CHW
            x_test.append(img)
            # this project does not need image labels, so every image gets a dummy label of 1
            y_test.append(1)  #convert_class(label[1])) # ex) 'brick'
        print('loading test data done')

        x = np.asarray(x_test)
        y = np.asarray(y_test)
        #y_test = np_utils.to_categorical(y_test[:len(test_list)], num_classes)
        self.len = len(y)
        #self.transform = transform
        self.x_data = torch.from_numpy(x)
        self.y_data = torch.from_numpy(y)

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len


def get_minc_loader(batch_size,
                    shuffle=True,
                    show_sample=False,
                    num_workers=2,
                    pin_memory=True):
    # shuffle, show_sample and num_workers are accepted but unused:
    # the loader below never shuffles and uses the default number of workers.
    # load dataset
    test_dataset = TestDataSet()
    test_loader = torch.utils.data.DataLoader(
        test_dataset, batch_size=batch_size, shuffle=False,
        pin_memory=pin_memory)
    return test_loader
Thanks for your great project. However, I have trouble visualizing the predictions. The images look very strange. Could you please tell me how to visualize them correctly?
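One common reason decomposition outputs look odd is that log-domain or linear-light values get written to an 8-bit file without being exponentiated, clipped, or gamma-encoded. This is offered as a guess, not a diagnosis of this specific case. The sketch below shows one way to prepare a 1-channel prediction for saving. `to_display_uint8` is a hypothetical helper, and the percentile normalisation and the standard sRGB transfer curve are assumptions, not the repository's method.

```python
import numpy as np

def linear_to_srgb(x):
    """Standard sRGB transfer curve; input is clipped to [0, 1] first."""
    x = np.clip(x, 0.0, 1.0)
    return np.where(x <= 0.0031308,
                    12.92 * x,
                    1.055 * np.power(x, 1.0 / 2.4) - 0.055)

def to_display_uint8(log_pred, percentile=99.0):
    """Map a log-domain prediction to an 8-bit image for viewing."""
    linear = np.exp(log_pred)                   # undo the log
    scale = np.percentile(linear, percentile)   # robust white point
    linear = linear / max(scale, 1e-8)          # brightest regions land near 1
    return np.round(linear_to_srgb(linear) * 255.0).astype(np.uint8)

# Example: a synthetic log-shading ramp covering a wide dynamic range.
log_shading = np.log(np.linspace(0.01, 4.0, 64).reshape(8, 8))
img8 = to_display_uint8(log_shading)
```

The result can be passed directly to an image writer such as cv2.imwrite. Normalising by a percentile rather than the maximum keeps a few very bright pixels from darkening the rest of the image.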