ydhongHIT / DDRNet

The official implementation of "Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes"
MIT License

Pretrained model does not work for me (ddrnet23 slim) #14

Closed by dasmehdix 3 years ago

dasmehdix commented 3 years ago

@ydhongHIT I downloaded the pretrained model for ddrnet23 (Cityscapes) and want to try it on two images. As you can see in the code below, I resized the images to (1024,2048,3) as mentioned in the paper. get_seg_model is defined elsewhere, so I did not copy it here.

import matplotlib.pyplot as plt
import cv2
import torch
import numpy as np
img = cv2.imread('E:/deneme_alani_iha/uavid_data/npy_deneme/ddrNet_city/ss.png')
img2 = cv2.imread('E:/deneme_alani_iha/uavid_data/npy_deneme/ddrNet_city/ss2.png')
img = cv2.resize(img,(2048,1024))
img2 = cv2.resize(img2,(2048,1024))
img = img.reshape(3,2048,1024)
img2 = img2.reshape(3,2048,1024)
data = [img,img2]
data = np.array(data)
data = torch.Tensor(data)

#%%
model = get_seg_model(1)
with torch.no_grad():
    output = model(data)
#%%
plt.imshow(output[0].cpu().detach().numpy().reshape(128,256,19)[:,:,0])
#%%
def eight2twoconverter(labels, size):
    colors = [  (150,120,90), (153,153,153), (153,153,153), (250,170,30), (220,220,0), (107,142,35), (152,251,152), ( 70,130,180), (220,20,60), (255,0,0), (  0,0,142), (  0,0,70), (  0,60,100), (  0,0,90), (  0,0,110), (  0,80,100), (  0,0,230), (119,11,32), (  0,0,142)]

    b = np.zeros(( size, 256,3))
    for i in range(b.shape[0]):
        for j in range(b.shape[1]):
            indexx = np.argmax(labels[i,j,:])
            b[i,j]  = colors[indexx]
    return b
#%%
out = eight2twoconverter(output[0].cpu().detach().numpy().reshape(128,256,19), 128)
plt.imshow(out)

I passed the correct path and num_classes parameter to this function.

from collections import OrderedDict

def DualResNet_imagenet(pretrained=True):
    model = DualResNet(BasicBlock, [2, 2, 2, 2], num_classes=19, planes=32, spp_planes=128, head_planes=64, augment=False)
    if pretrained:
        checkpoint = torch.load("E:/deneme_alani_iha/uavid_data/npy_deneme/best_val_smaller.pth", map_location='cpu')

        # Drop the first 7 characters of every key (e.g. a "module." prefix left by nn.DataParallel).
        new_state_dict = OrderedDict()
        for k, v in checkpoint['state_dict'].items():
            name = k[7:]
            new_state_dict[name] = v

        # Start from the model's own state dict, then overwrite it with the checkpoint weights.
        model_dict = model.state_dict()
        model_dict.update(new_state_dict)
        model.load_state_dict(model_dict)
    return model

[Image: raw input image] [Image: network output]

As you can see above, the output of the network looks like a random picture. How can I solve this?

ydhongHIT commented 3 years ago

Do you get the correct validation IoU? You can use 'HRNet-Semantic-Segmentation-pytorch-v1.1' to evaluate and visualize the results.

Fritskee commented 3 years ago

I'm pretty sure you're looking at the network activations instead of the actual segmentation map

dasmehdix commented 3 years ago

Yes, the problem was inside my converter function.

aaj22 commented 3 years ago

@dasmehdix Could you provide the corrected converter function? Thanks!

dasmehdix commented 3 years ago

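I was using numpy.reshape to change the axis order of the images. I realized that reshape does not reorder axes: it keeps the flat element order and only reinterprets the shape, so the values end up in the wrong positions. I switched to numpy.moveaxis and that solved my problem.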
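
For readers hitting the same issue, here is a minimal sketch of the difference between the two calls. The tiny 2x2 array is a made-up stand-in for an OpenCV image in (height, width, channel) layout; it is not taken from the original code.

```python
import numpy as np

# Stand-in for an image in OpenCV layout: height 2, width 2, 3 channels (H, W, C).
hwc = np.arange(12).reshape(2, 2, 3)

# moveaxis reorders the axes, so chw[c, y, x] == hwc[y, x, c].
chw = np.moveaxis(hwc, -1, 0)

# reshape keeps the flat element order and only relabels the shape,
# so each "channel plane" ends up holding a mix of channels.
scrambled = hwc.reshape(3, 2, 2)

print(chw[0])        # channel 0 of every pixel
print(scrambled[0])  # the first four flat values, channels mixed together
```

The same reasoning applies in the other direction: to bring a (classes, H, W) score array to (H, W, classes), np.moveaxis(scores, 0, -1) preserves the per-pixel scores, while reshape(H, W, classes) does not.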

I generally use this line to visualize outputs with matplotlib: plt.imshow(outputs[0].argmax(0).cpu())
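
If a color-coded map is wanted instead of the raw class indices, the per-pixel loop in the converter can be replaced by an argmax followed by a palette lookup. The following is only a sketch under stated assumptions: colorize_logits is a hypothetical helper name, the three-color palette is made up for illustration (a real one needs one RGB row per class), and the scores are fake NumPy data rather than real network output.

```python
import numpy as np

def colorize_logits(logits, palette):
    """Map per-class scores of shape (C, H, W) to an RGB image of shape (H, W, 3)."""
    class_map = logits.argmax(axis=0)   # (H, W) array of integer class ids
    return palette[class_map]           # index the palette once per pixel

# Hypothetical 3-class palette, one RGB row per class, stored as uint8.
palette = np.array([[128, 64, 128], [220, 20, 60], [0, 0, 142]], dtype=np.uint8)

# Fake scores for 3 classes on a 2x2 image.
logits = np.zeros((3, 2, 2), dtype=np.float32)
logits[1, 0, 0] = 5.0   # pixel (0, 0) scores highest for class 1
logits[2, 1, 1] = 5.0   # pixel (1, 1) scores highest for class 2

rgb = colorize_logits(logits, palette)
```

With a PyTorch output, the input would presumably be something like output[0].cpu().numpy(), and the result can be passed to plt.imshow(rgb). Keeping the palette as uint8 matters because imshow interprets floating-point RGB data on a 0 to 1 scale.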

aaj22 commented 3 years ago

@dasmehdix Could you share the code as you did above? I am not able to get the required output.

songbingyue commented 9 months ago

Excuse me, have you solved this problem now? I also want to do visualization. Can you share your visualization code?