vinthony / ghost-free-shadow-removal

[AAAI 2020] Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN
https://arxiv.org/abs/1911.08718

the network structure of shadow detection #23

Closed: nachifur closed this issue 3 years ago

nachifur commented 3 years ago

The work is great! I can't find more information about the structure of the detection network; the paper seems to contain only the following description.

Shadow Synthesis for Detection: Our synthesized dataset also benefits shadow detection. In detail, we modify our network structure to match the detection by removing the output of attention loss. Thus, we only use the mask as output for back-propagation.

"Removing the output of attention loss"? I don't understand what needs to be modified for detection compared with the shadow-removal network. Could you provide more information to resolve my doubts? Thank you so much!

vinthony commented 3 years ago

Hi, thanks for your question.

It seems I only removed the mask branch by deleting L191 and setting the output channels of the layer at L190 to 1. https://github.com/vinthony/ghost-free-shadow-removal/blob/426a91026bd53cc23e8f64ed87de4fe3b4dda527/networks.py#L190-L191
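
Roughly, the change looks like the sketch below; the variable and scope names are only illustrative (not the exact ones in networks.py), assuming L190 is the 3-channel image head and L191 is the mask head:

    # joint shadow-removal network: two output heads (illustrative names)
    img  = slim.conv2d(feat,3,[1,1],rate=1,activation_fn=None,scope='g_conv_img')   # roughly L190: 3-channel image output
    mask = slim.conv2d(feat,1,[1,1],rate=1,activation_fn=None,scope='g_conv_mask')  # roughly L191: mask branch, removed for detection

    # detection-only variant: keep a single head with 1 output channel,
    # so only the mask is used for back-propagation
    mask = slim.conv2d(feat,1,[1,1],rate=1,activation_fn=None,scope='g_conv_img')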

As for the exact model I used, I don't think I will have time to find it before the CVPR deadline.

But I will handle your issues (this one and #22) at the end of November.

Also, please feel free to ask if you have more questions^^.

nachifur commented 3 years ago

Thank you very much! I wish you success with your CVPR submission. Looking forward to your new work!

vinthony commented 3 years ago

Sorry for the late reply.

Here is the (almost) original version of our shadow detection model. Note, however, that you may need to pass the path of the VGG19 weights to build_vgg19 to match the function in the open-sourced version of the codebase.

def build_detection_only(input,channel=64,reuse=False):

    print("[i] Hypercolumn ON, building hypercolumn features ... ")
    vgg19_features=build_vgg19(input[:,:,:,0:3]*255.0)
    for layer_id in range(1,6):
        vgg19_f = vgg19_features['conv%d_2'%layer_id]
        input = tf.concat([tf.image.resize_bilinear(vgg19_f,(tf.shape(input)[1],tf.shape(input)[2]))/255.0,input], axis=3)

    # 1x1 conv to compress the hypercolumn stack down to `channel` feature maps
    sf=slim.conv2d(input,channel,[1,1],rate=1,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_sf')

    net0=slim.conv2d(sf,channel,[1,1],rate=1,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv0')
    net1=slim.conv2d(net0,channel,[3,3],rate=1,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv1')

    #agg: concatenate the features, reweight them with an SE block, then fuse with a 3x3 conv
    net = tf.concat([net0,net1],axis=3)
    net=se_block(net,'g_att0')
    net1=slim.conv2d(net,channel,[3,3],rate=1,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_agg0')

    # dilated convolutions with increasing rates (2 to 64 across the following blocks) enlarge the receptive field
    net2=slim.conv2d(net1,channel,[3,3],rate=2,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv2')
    net3=slim.conv2d(net2,channel,[3,3],rate=4,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv3')

    #agg
    net = tf.concat([net1,net3,net2],axis=3)
    net=se_block(net,'g_att2')
    net3=slim.conv2d(net,channel,[3,3],rate=1,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_agg2')

    net4=slim.conv2d(net3,channel,[3,3],rate=8,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv4')
    net5=slim.conv2d(net4,channel,[3,3],rate=16,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv5')

    #agg
    net = tf.concat([net3,net5,net4],axis=3)
    net=se_block(net,'g_att4')
    net5=slim.conv2d(net,channel,[3,3],rate=1,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_agg4')

    net6=slim.conv2d(net5,channel,[3,3],rate=32,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv6')
    net7=slim.conv2d(net6,channel,[3,3],rate=64,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv7')

    #agg
    net = tf.concat([net3,net5,net6,net7],axis=3)
    net=se_block(net,'g_att7')  # SE attention over the concatenated features, as in the earlier aggregation blocks
    net=slim.conv2d(net,channel,[3,3],rate=1,activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_conv9')

    # here we build the pooling stack: average-pool at several scales,
    # compress each with a 1x1 conv, then upsample back and concatenate
    net_2 = tf.layers.average_pooling2d(net,pool_size=4,strides=4,padding='same')
    net_2 = slim.conv2d(net_2,channel,[1,1],activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_pool2')

    net_8 = tf.layers.average_pooling2d(net,pool_size=8,strides=8,padding='same')
    net_8 = slim.conv2d(net_8,channel,[1,1],activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_pool8')

    net_16 = tf.layers.average_pooling2d(net,pool_size=16,strides=16,padding='same')
    net_16 = slim.conv2d(net_16,channel,[1,1],activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_pool16')

    net_32 = tf.layers.average_pooling2d(net,pool_size=32,strides=32,padding='same')
    net_32 = slim.conv2d(net_32,channel,[1,1],activation_fn=lrelu,normalizer_fn=nm,weights_initializer=identity_initializer(),scope='g_pool32')

    net = tf.concat([
      tf.image.resize_bilinear(net_2,(tf.shape(input)[1],tf.shape(input)[2])),
      tf.image.resize_bilinear(net_8,(tf.shape(input)[1],tf.shape(input)[2])),
      tf.image.resize_bilinear(net_16,(tf.shape(input)[1],tf.shape(input)[2])),
      tf.image.resize_bilinear(net_32,(tf.shape(input)[1],tf.shape(input)[2])),
      net],axis=3)

    # final 1x1 conv: single-channel shadow-mask logits (no activation), since only the mask is used as output
    net=slim.conv2d(net,1,[1,1],rate=1,activation_fn=None,scope='g_conv_last')

    return net
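
For completeness, here is a minimal, untested sketch of how this function could be wired up for training; the placeholders, the import location, and the per-pixel sigmoid cross-entropy loss are assumptions for illustration, not necessarily the exact setup we used:

# training sketch (assumes build_detection_only and its helpers, e.g. build_vgg19,
# lrelu, nm, identity_initializer and se_block, are available from networks.py)
import tensorflow as tf
from networks import build_detection_only  # hypothetical import location

shadow_img = tf.placeholder(tf.float32, [None, None, None, 3])  # shadow image in [0,1]
gt_mask    = tf.placeholder(tf.float32, [None, None, None, 1])  # binary shadow mask

mask_logits = build_detection_only(shadow_img, channel=64)

# assumed loss: per-pixel sigmoid cross-entropy between the predicted logits and the ground-truth mask
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=gt_mask, logits=mask_logits))
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

pred_mask = tf.nn.sigmoid(mask_logits)  # mask probabilities at inference time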