nicolov / segmentation_keras

DilatedNet in Keras for image segmentation
MIT License
301 stars · 92 forks

regarding loading_weights #6

Closed wenouyang closed 7 years ago

wenouyang commented 7 years ago

Hi nicolov,

In train.py, you have included the function load_weights(model, weights_path). My understanding is that it loads a pretrained VGG model. If I do not want to use this pretrained model, because the problem I am working on may belong to a totally different domain, should I just skip calling this load_weights function? Or is using a pretrained model always preferable? I am somewhat confused about this.

In the notes, you mentioned that "The training code is currently limited to the frontend module, and thus only outputs 16x16 segmentation maps." If I would like to use this code for my own dataset, what modifications do I have to make? Do I still have to load the weights?

Thank you very much!

nicolov commented 7 years ago

Generally speaking, it makes sense to start from the pretrained model. The weights you're loading are only for the first few convolutional layers that generalize well across domains. Training from scratch is pretty tricky, and I'm not sure you'd want to go there.
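For illustration, a minimal sketch of what this could look like in Keras (the helper name and `weights_path` are illustrative, not the repo's exact train.py): loading by layer name copies the pretrained convolutional weights into matching layers and leaves everything else randomly initialized.

```python
# Illustrative sketch, not the repo's exact code: load only the pretrained
# convolutional weights into layers whose names match entries in the HDF5
# file, leaving any new layers (e.g. the segmentation head) randomly
# initialized.
def load_pretrained_frontend(model, weights_path):
    # by_name=True matches layers by name, so layers not present in the
    # weights file keep their fresh random initialization.
    model.load_weights(weights_path, by_name=True)
    return model

# Skipping this step means training every layer from scratch, which typically
# needs far more data and more careful hyper-parameter tuning.
```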

wenouyang commented 7 years ago

Hi nicolov, thanks for the reply. Generally, if a network architecture is built on top of a pre-existing network, such as a VGG model trained on ImageNet, is it always desirable to initialize the weights from that pre-existing model? Is that a common rule for training DL models? And what are the tricky issues when training from scratch, and how does one handle them?

Besides, with respect to dilated convolution, it seems to me that the referenced paper designs a specific context module around it. Can I use dilated convolution more generically, by simply replacing a normal convolution layer with a dilated one? Is that a reasonable approach? Thanks.

nicolov commented 7 years ago

Actually, the reasoning goes the other way around: people build their architectures on top of VGG so that they're able to use the pretrained weights. Training from scratch requires ImageNet-scale data and is sensitive to hyper-parameters.

For your second question, the answer is yes. Just pay attention to the image size at each layer.
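As an illustration (assuming the Keras 2 API, with an example input size), a dilated convolution is a drop-in replacement for a regular one via the `dilation_rate` argument; with `padding='same'` the spatial size stays the same while the receptive field grows.

```python
# Minimal sketch (assumed Keras 2 API, example shapes): swapping a regular
# convolution for a dilated one only changes the dilation_rate argument.
from keras.layers import Input, Conv2D
from keras.models import Model

inputs = Input(shape=(256, 256, 3))                                             # example input size
x = Conv2D(64, (3, 3), padding='same', activation='relu')(inputs)               # regular conv
x = Conv2D(64, (3, 3), dilation_rate=2, padding='same', activation='relu')(x)   # dilated conv
model = Model(inputs, x)
model.summary()  # check the spatial dimensions at each layer
```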

wenouyang commented 7 years ago

Thanks a lot!