SSL92 / hyperIQA

Source code for the CVPR'20 paper "Blindly Assess Image Quality in the Wild Guided by A Self-Adaptive Hyper Network"
MIT License

About inference code #8

KleinXin opened this issue 4 years ago

KleinXin commented 4 years ago

I used some of my own data to train the model and want to compute the score of images one by one.

During training, I added some code to compute the L1 norm, such as the code below:

```python
for img, label in data:
    img = img.cuda().clone().detach()
    # `async` is a reserved keyword on Python 3.7+; use non_blocking instead
    label = label.cuda(non_blocking=True).clone().detach()

    # the hyper network generates the target network's parameters per image
    paras = self.model_hyper(img)
    model_target = models.TargetNet(paras).cuda()
    model_target.train(False)
    pred = model_target(paras['target_in_vec'])

    pred_scores.append(float(pred.item()))
    gt_scores = gt_scores + label.cpu().tolist()

pred_scores_np = np.array(pred_scores, dtype=np.float32)
gt_scores_np = np.array(gt_scores, dtype=np.float32)
l1_norm_test = np.absolute(pred_scores_np - gt_scores_np)
l1_norm_test = np.sum(l1_norm_test) / len(l1_norm_test)
```
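The metric above is simply the mean absolute error between predicted and ground-truth scores; it can be written in one vectorized line (hypothetical score values shown only to make the sketch self-contained):

```python
import numpy as np

# Hypothetical stand-ins for the pred_scores / gt_scores lists
pred_scores_np = np.array([55.2, 60.1, 48.7], dtype=np.float32)
gt_scores_np = np.array([52.0, 63.0, 50.0], dtype=np.float32)

# Mean absolute error, equivalent to the sum/len computation above
l1_norm_test = np.mean(np.abs(pred_scores_np - gt_scores_np))
```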

Then I wrote the inference code as below:

```python
mean_RGB = [123.675, 116.28, 103.53]
std_RGB = [58.395, 57.12, 57.375]

model_hyper = models.HyperNet(16, 112, 224, 112, 56, 28, 14, 7).cuda()
model_hyper.load_state_dict(torch.load(args.pretrained_model_name_hyper))
model_hyper.train(False)

I = Image.open(imgName)
I = I.convert("RGB")
I_ = I.resize((224, 224))

I_np = np.asarray(I_, dtype=np.float32).copy()

# per-channel normalization
I_np[:, :, 0] = (I_np[:, :, 0] - mean_RGB[0]) / std_RGB[0]
I_np[:, :, 1] = (I_np[:, :, 1] - mean_RGB[1]) / std_RGB[1]
I_np[:, :, 2] = (I_np[:, :, 2] - mean_RGB[2]) / std_RGB[2]

I_np = I_np.transpose(2, 0, 1)  # HWC -> CHW

with torch.no_grad():
    # Variable(..., volatile=True) is deprecated; torch.no_grad() already
    # disables autograd for everything in this block
    input_var = torch.from_numpy(I_np).unsqueeze(0).float().cuda(0)

    paras = model_hyper(input_var)

    model_target = models.TargetNet(paras).cuda()
    model_target.load_state_dict(torch.load(args.pretrained_model_name_target))
    model_target.train(False)

    pred = model_target(paras['target_in_vec']).cpu()
```

Here, pretrained_model_name_hyper and pretrained_model_name_target are the hyper and target models saved during training.

During training, I got a minimum average L1 norm of 2.88, but at test time I got 11.13. Is there anything wrong with the code?

I suspect there may be a problem in how the pretrained target model is loaded.

SSL92 commented 4 years ago

At the inference stage, you should only load the parameters of the hyper network and should not load the target network's parameters, i.e. this line

model_target.load_state_dict(torch.load(args.pretrained_model_name_target))

should be deleted. Since the parameters of the target network are generated adaptively by the hyper network, the line `model_target = models.TargetNet(paras).cuda()` has already built a target network that carries its own self-adaptive weight parameters.
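The point about generated parameters can be sketched with a toy hyper/target pair in plain NumPy (hypothetical names and shapes; the real HyperNet emits full convolutional and fully-connected weight tensors):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the hyper network: it maps an image feature vector
# to the *weights* of a tiny one-layer target network.
def hyper_net(feat, W_h, b_h):
    params = W_h @ feat + b_h  # target-net weights generated per image
    return {"target_w": params, "target_in_vec": feat}

# The target network has no stored parameters of its own: it simply
# applies whatever weights the hyper network produced for this image,
# which is why there is no checkpoint to load for it.
def target_net(paras):
    return float(paras["target_w"] @ paras["target_in_vec"])

feat = rng.standard_normal(8)   # hypothetical image feature
W_h = rng.standard_normal((8, 8))
b_h = rng.standard_normal(8)

paras = hyper_net(feat, W_h, b_h)
score = target_net(paras)       # no load_state_dict needed
```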

Also, I noticed you resized all the testing images to 224x224. If you also trained with images resized to 224x224, that is fine; but if training followed the configuration in our original code, i.e. randomly cropping 224x224 patches, it is better to use the same configuration for the testing images, since scale consistency also influences model performance.
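A minimal NumPy sketch of the crop-based alternative (hypothetical helper name; at test time one would average the model's predictions over the sampled patches rather than resizing the whole image):

```python
import numpy as np

def random_crops(img, size=224, n_crops=10, seed=0):
    """Sample n_crops random size x size patches from an HxWxC array,
    mirroring a training-time random-crop configuration."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    crops = []
    for _ in range(n_crops):
        top = rng.integers(0, h - size + 1)
        left = rng.integers(0, w - size + 1)
        crops.append(img[top:top + size, left:left + size])
    return crops

img = np.zeros((384, 512, 3), dtype=np.float32)  # hypothetical image
patches = random_crops(img)
# the final score would be the mean of the model's predictions over `patches`
```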

Hope this will help you : )