Closed — Wisgon closed this issue 6 years ago
hi @Wisgon,
`num_pixels = (pixel_height, pixel_width)`, where `pixel_height` or `pixel_width` is the number of pixels you want to pad your output dimension with after 2 downsamplings (2 VGG-16 max-poolings). In my experiments I upscale the feature maps rather than zero-padding them. But you can skip `MyUpSampling2D` and use Keras `ZeroPadding2D` instead.
More detail: if your input dimension is, say, 240x320, then after 2 downsamplings by the encoder your output will be 60x80, and after 2 upsamplings by the decoder your output will be 240x320 again. In this case you don't need `ZeroPadding` or `MyUpSampling`. But if the output from the encoder is, say, 60x79, you need to pad 1 width-pixel, e.g. `x = MyUpSampling2D(num_pixels=(0, 1), method_name=self.method_name)(x)`
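One way to work out `num_pixels` for an arbitrary input size is to simulate the two max-poolings (each floors the dimension when dividing by 2) and compare against what two 2x upsamplings restore. This is a hypothetical helper, not part of FgSegNet; it assumes `num_pixels` is simply the per-dimension deficit between your input size and the decoder's restored size:

```python
def pad_deficit(size, num_poolings=2):
    """Pixels lost in one dimension after `num_poolings` max-poolings
    followed by the same number of 2x upsamplings."""
    down = size
    for _ in range(num_poolings):
        down //= 2                      # each max-pooling floors the division
    restored = down * 2 ** num_poolings # size the decoder gives back
    return size - restored

# num_pixels for a given (height, width) input
def num_pixels(height, width):
    return pad_deficit(height), pad_deficit(width)

print(num_pixels(240, 320))  # (0, 0) -> no padding needed
print(num_pixels(240, 317))  # (0, 1) -> pad 1 width-pixel
```

If both values are zero, you can drop `MyUpSampling2D` (or `ZeroPadding2D`) entirely for that sequence.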
OK, I will try it later, thank you very much.
Hello @lim-anggun, thanks for sharing this great work. I have a question about your architecture / problem definition: do you use features across time (temporal features) for segmenting the background and the foreground? From your paper, your problem definition looks like semantic segmentation to me. Thank you very much. Best regards, albert christianto
As far as I know, if I want to train with my own video sequence, I should manually configure FgSegNetModule.py. But I'm a newbie to Keras and even to deep learning. I found that I should modify the code below to fit my input video:
But I don't know what `num_pixels` I should pass to it... How can I know which `num_pixels` corresponds to my video sequence? And in what situations should I use `Cropping2D()`? And is there anything else I should modify? Thank you very much for replying.