HadXu / machine-learning

my machine-learning tutorial

Question about the code #3

Closed xiaoerlaigeid closed 7 months ago

xiaoerlaigeid commented 7 years ago
import cv2
import numpy as np

def load_image(imageurl):
    # cv2.imread returns an (H, W, C) array with channels in BGR order
    im = cv2.resize(cv2.imread(imageurl),(224,224)).astype(np.float32)
    im[:,:,0] -= 103.939
    im[:,:,1] -= 116.779
    im[:,:,2] -= 123.68
    im = im.transpose((2,0,1))
    im = np.expand_dims(im,axis=0)
    return im

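Why are these values subtracted from the three channels? And what is the purpose of the transpose afterwards? I don't quite follow it and would appreciate an explanation. Thanks!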

HadXu commented 7 years ago

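The tf (TensorFlow) and th (Theano) backends lay out image channels differently: tf is channels_last, while th is channels_first, so the image's channel axis has to be moved.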
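
To make the layout difference concrete, here is a minimal sketch with NumPy. The zero-filled array is only a stand-in for an image returned by `cv2.imread`, and the variable names are illustrative. It shows what `transpose((2,0,1))` and `np.expand_dims` do to the array shape.

```python
import numpy as np

# Hypothetical stand-in for a loaded image: height x width x channels.
im_hwc = np.zeros((224, 224, 3), dtype=np.float32)

# channels_last (H, W, C) -> channels_first (C, H, W)
im_chw = im_hwc.transpose((2, 0, 1))

# Add a leading batch axis so the model receives a batch of one image.
batch = np.expand_dims(im_chw, axis=0)

print(im_hwc.shape, im_chw.shape, batch.shape)
```

With a channels_last backend the transpose would be skipped and only the batch axis added.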

xiaoerlaigeid commented 7 years ago

Thanks for the answer. What do the subtracted values mean, then?

HadXu commented 7 years ago
    if data_format == 'channels_first':
        # 'RGB'->'BGR'
        x = x[:, ::-1, :, :]
        # Zero-center by mean pixel
        x[:, 0, :, :] -= 103.939
        x[:, 1, :, :] -= 116.779
        x[:, 2, :, :] -= 123.68
    else:
        # 'RGB'->'BGR'
        x = x[:, :, :, ::-1]
        # Zero-center by mean pixel
        x[:, :, :, 0] -= 103.939
        x[:, :, :, 1] -= 116.779
        x[:, :, :, 2] -= 123.68
    return x
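
A small worked example may help here. The sketch below applies the channels_last branch to a single made-up RGB pixel; the pixel values and variable names are assumptions for illustration only. Note that `cv2.imread` already returns BGR, which is presumably why `load_image` in the question subtracts the means without flipping the channels first.

```python
import numpy as np

# One hypothetical RGB pixel, shaped (batch, height, width, channels).
x = np.array([[[[255.0, 128.0, 0.0]]]], dtype=np.float32)

# 'RGB' -> 'BGR': reverse the last (channel) axis; copy so x is left untouched.
bgr = x[..., ::-1].copy()

# Subtract the fixed per-channel means, listed in B, G, R order.
mean_bgr = np.array([103.939, 116.779, 123.68], dtype=np.float32)
bgr -= mean_bgr  # broadcasts across the channel axis

print(bgr[0, 0, 0])
```

The array must be floating point before the subtraction, which is why the question's code calls `astype(np.float32)` first.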
xiaoerlaigeid commented 7 years ago

I think I understand a little now, but shouldn't the mean be different for every image? How were these values computed?

HadXu commented 7 years ago

I'm not sure. That is how it was defined when the VGG16 model was introduced. If you're interested, you can read the VGG16 paper.
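
For what it's worth, these three numbers are usually described as the per-channel means of the ImageNet training set in BGR order, computed once over the whole dataset rather than per image. The sketch below shows how such a dataset-wide mean could be computed; the random arrays stand in for real training images, and the helper name `dataset_channel_mean` is hypothetical.

```python
import numpy as np

def dataset_channel_mean(images):
    """Mean of each channel over every pixel of every (H, W, C) image."""
    total = np.zeros(3, dtype=np.float64)
    count = 0
    for im in images:
        total += im.reshape(-1, 3).sum(axis=0)  # sum each channel over all pixels
        count += im.shape[0] * im.shape[1]      # number of pixels in this image
    return total / count

# Toy stand-in dataset: ten random 4x5 images with values in [0, 255].
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(4, 5, 3)).astype(np.float64) for _ in range(10)]

mean_per_channel = dataset_channel_mean(images)

# The same fixed mean is then subtracted from every image.
centered = [im - mean_per_channel for im in images]
print(mean_per_channel)
```

Using one fixed mean keeps preprocessing identical at training and inference time, so a new image is shifted exactly as the training images were.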

xiaoerlaigeid commented 7 years ago

OK, thank you!

371148606 commented 7 months ago

Hello, this is Xue Wenyi. I have received your email and will read it shortly. Thank you!