ramprs / grad-cam

[ICCV 2017] Torch code for Grad-CAM
https://arxiv.org/abs/1610.02391

Is there a model pre-trained on MSCOCO with 80 classes? #2

Closed BAILOOL closed 7 years ago

ramprs commented 7 years ago

Hi @BAILOOL, I don't have a VGG model fine-tuned on COCO. For my experiments I have been using the GoogLeNet model fine-tuned on COCO from http://www.cs.bu.edu/groups/ivc/data/ExcitationBP/COCO/.

BAILOOL commented 7 years ago

@ramprs thanks for the quick reply. What layer_name did you use with GoogLeNet?

ramprs commented 7 years ago

As far as I know, loadcaffe can't load Inception-type modules trained in Caffe, so I used the following Caffe (Python) snippet to get Grad-CAM visualizations from the above model.

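import numpy as np
import caffe
from skimage import transform

# NOTE: imgScale (the target length of the shorter image side) and Normalize
# (a helper that rescales the map) are not defined in this snippet and must
# be provided by the surrounding module.

def GradCAM(net, img, classID):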

    topBlobName = 'loss3/classifier'
    topLayerName = 'loss3/classifier'
    outputLayerName = 'inception_5b/output'
    outputBlobName = 'inception_5b/output'

    # rescale the image so that its shorter side is imgScale pixels
    minDim = min(img.shape[:2])
    newSize = (int(img.shape[0]*imgScale/float(minDim)), int(img.shape[1]*imgScale/float(minDim)))
    imgS = transform.resize(img, newSize)

    # reshape net
    net.blobs['data'].reshape(1,3,newSize[0],newSize[1])
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_mean('data', np.array([103.939, 116.779, 123.68]))
    transformer.set_transpose('data', (2,0,1))
    transformer.set_channel_swap('data', (2,1,0))
    transformer.set_raw_scale('data', 255.0)
    caffe.set_mode_gpu()

    # forward pass
    net.blobs['data'].data[...] = transformer.preprocess('data', imgS)
    out = net.forward(end = topLayerName)

    # create grad Input
    net.blobs[topBlobName].diff[0][...] = 0
    net.blobs[topBlobName].diff[0][classID] = 1

    # get feature maps (channels x height x width) from the forward pass
    fprop_maps = net.blobs[outputBlobName].data[0]

    # backward pass down to the last conv layer (inception_5b/output)
    net.backward(start = topLayerName, end = outputLayerName)

    # per-channel weights: gradients summed over the spatial dimensions
    # (a constant factor away from the global average pooling in the paper)
    map_weights = net.blobs[outputBlobName].diff[0].sum(1).sum(1)

    # weighted sum of the feature maps over channels (weights broadcast over H x W)
    gradCAM_beforeReLU = np.multiply(fprop_maps, map_weights[:, np.newaxis, np.newaxis]).sum(0)

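    # pass through ReLU
    gradCAM = np.maximum(gradCAM_beforeReLU, 0)
    # upsample to the input image size with bicubic interpolation (order = 3);
    # 'edge' is the scikit-image name for the padding mode formerly called 'nearest'
    gradCAM = transform.resize(Normalize(gradCAM), img.shape[:2], order = 3, mode = 'edge')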

    return gradCAM
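
For readers without a Caffe install, the weighting step in the function above can be checked on synthetic arrays. The following is only a minimal sketch: `normalize` is a hypothetical stand-in for the `Normalize` helper, which is not shown in this thread (a min-max rescale to [0, 1] is assumed), and `grad_cam_from_arrays` is an illustrative name, not part of the repository.

```python
import numpy as np


def normalize(cam):
    """Hypothetical stand-in for `Normalize`: min-max rescale to [0, 1]."""
    cam = cam - cam.min()
    peak = cam.max()
    # A constant map (e.g. all zeros after the ReLU) comes back as all zeros.
    return cam / peak if peak > 0 else cam


def grad_cam_from_arrays(feature_maps, gradients):
    """Combine (C, H, W) activations and gradients into an (H, W) map."""
    # One weight per channel: gradients summed over the spatial axes,
    # as in the snippet above. The paper averages instead, which only
    # changes the scale and is removed by the normalization.
    weights = gradients.sum(axis=(1, 2))
    # Weighted sum of the feature maps over channels, then ReLU.
    cam = np.tensordot(weights, feature_maps, axes=1)
    return normalize(np.maximum(cam, 0))


# Synthetic check: channel 0 fires top-left and has a positive gradient,
# channel 1 fires bottom-right and has a negative gradient.
maps = np.zeros((2, 2, 2))
maps[0, 0, 0] = 1.0
maps[1, 1, 1] = 1.0
grads = np.stack([np.full((2, 2), 0.5), np.full((2, 2), -0.5)])
cam = grad_cam_from_arrays(maps, grads)
print(cam)
```

Only the top-left location should survive, since the ReLU discards the region whose channel has a negative weight for the chosen class.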

Please let me know if you face any issues. Also, I should be able to get a VGG-16 network fine-tuned on COCO in a week's time if you would like to stick to Torch.