Open CheungYooo opened 5 years ago
Do you understand it now? I'm also confused about these values.
You should not use those values, since the author uses pretrained weights instead of training from scratch. However, if you train from scratch, you will need values like these.
It seems he is taking the mean pixel value of each channel (R, G, B) and then subtracting it from the image for the sake of normalization.
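For anyone who does need to reproduce such values when training from scratch, here is a rough sketch (not the author's code; `train_paths` and its glob pattern are placeholders, adjust them to your dataset layout) of computing per-channel means over a training set and doing the subtraction:

```python
# Sketch: compute per-channel mean pixel values over a training set,
# then subtract them from an image (channels-first, as in the snippet below).
import glob
import numpy as np
from PIL import Image

train_paths = glob.glob('train_data/images/*.jpg')  # hypothetical location

channel_sum = np.zeros(3, dtype=np.float64)
pixel_count = 0
for path in train_paths:
    arr = np.asarray(Image.open(path).convert('RGB'), dtype=np.float64)  # H x W x 3, values in [0, 255]
    channel_sum += arr.reshape(-1, 3).sum(axis=0)
    pixel_count += arr.shape[0] * arr.shape[1]
channel_mean = channel_sum / pixel_count  # one mean per channel

# Mean subtraction on a single image
img = np.asarray(Image.open(train_paths[0]).convert('RGB'), dtype=np.float32).transpose(2, 0, 1)
for c in range(3):
    img[c, :, :] -= channel_mean[c]
```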
@CheungYooo I think you should use the following code instead:
```python
import h5py
import numpy as np
from PIL import Image
from torchvision import datasets, transforms

# Normalize with the standard ImageNet statistics expected by the pretrained model
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# `model` and `img_paths` are defined earlier in val.ipynb
mae = 0
for i in range(len(img_paths)):
    img = transform(Image.open(img_paths[i]).convert('RGB')).cuda()
    gt_file = h5py.File(img_paths[i].replace('.jpg', '.h5').replace('images', 'ground_truth'), 'r')
    groundtruth = np.asarray(gt_file['density'])
    output = model(img.unsqueeze(0))
    mae += abs(output.detach().cpu().sum().numpy() - np.sum(groundtruth))
    print(i, mae)
print(mae / len(img_paths))
```
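For context: `transforms.ToTensor()` scales pixels to [0, 1], and `mean=[0.485, 0.456, 0.406]` / `std=[0.229, 0.224, 0.225]` are the standard ImageNet statistics used by torchvision's pretrained models, which matches the pretrained VGG-16 front end that CSRNet builds on. The hard-coded values being asked about (92.82..., 95.27..., 104.87...) look like per-channel means on the raw 0-255 scale instead.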
```python
img[0,:,:] = img[0,:,:] - 92.8207477031
img[1,:,:] = img[1,:,:] - 95.2757037428
img[2,:,:] = img[2,:,:] - 104.877445883
```
@leeyeehoo Thank you for sharing. When you validate CSRNet with val.ipynb, you use the code above. My question is: why do you subtract those specific values (92.8207477031, 95.2757037428, 104.877445883)? What is the meaning of these values? Why can't they be other values?