ha5463 opened this issue 6 years ago
@ha5463 You should use a MobileNet base and initialize it with ImageNet weights, which are freely available. MobileNet is built from depthwise-separable convolutions (SeparableConv2D-style layers).
@JonathanCMitchell Thank you for the reply. Since we don't have a MobileNet base here, if you know of any repository that provides one, it would be great if you could point us to it.
@ha5463 It is available in keras.applications here
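For reference, loading it looks roughly like this (a sketch assuming the Keras 2.x `keras.applications` API):

```python
from keras.applications.mobilenet import MobileNet

# Load MobileNet pre-trained on ImageNet, without the classification head,
# so it can serve as a feature-extraction backbone.
base = MobileNet(weights='imagenet', include_top=False,
                 input_shape=(224, 224, 3))
base.summary()  # shows the depthwise-separable conv blocks
```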
@JonathanCMitchell, as you suggested I looked into MobileNet. The network is quite good, but we still need a ResNet built with SeparableConv2D. After replacing some layers in model.py (in this repo itself), I got the error: `Layer #18 (named "res2b_branch2a") expects 3 weight(s), but the saved weights have 2 element(s).`
Can you please help me with this?
For reference, the traceback begins: `ValueError Traceback (most recent call last)`
If I understand correctly, you're trying to use the same ResNet50 network but change the regular convolutions to separable convolutions. I don't think you can use the provided weights in this case. Separable convolutions have a different structure, so the weights from the regular convolutions wouldn't be that useful anyway. I think your best option is to train from scratch.
@waleedka Sir, thank you for understanding and pinpointing the exact problem we are trying to overcome: training the new model (ResNet50 with SeparableConv2D layers) from random weights. We have not been able to initialize this new model with random weights and start training from scratch. If you know of any resource we can refer to, it would be very helpful.
Thank you
@ha5463 Did you comment out line L470, which has `model.load_weights`? I would change it to `model.load_weights(<path>)` instead of `model.load_weights(<path>, by_name=True)`, because the `by_name` argument will fail when you try to load custom layers. You should also keep track of the actual layer names in `layer_regex` inside model.py. Your SeparableConv2D layers should follow the same layer-naming pattern as ResNet, but to be sure they are loading, put a breakpoint before `fit_generator` runs and check the `trainable` property on your layers.
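A quick way to do that check (a sketch; it assumes the Matterport `MaskRCNN` wrapper, which exposes the underlying Keras model as `keras_model`):

```python
# Print every layer's name, type, and trainable flag right before training,
# to confirm the SeparableConv2D layers match the resnet naming pattern
# and are actually set up to train.
for layer in model.keras_model.layers:
    print(layer.name, type(layer).__name__, layer.trainable)
```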
You should also use a trainable `BatchNormalization` layer instead of the repo's `BatchNorm` layer, which overrides `keras.layers.BatchNormalization`.
As of the latest update (a day or two ago), there is a config setting, `TRAIN_BN`, that makes it easy to enable/disable batch-normalization training without changing the code.
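For example (a sketch; the import path assumes the current `mrcnn` package layout, so adjust it to match your checkout):

```python
from mrcnn.config import Config

class ScratchConfig(Config):
    NAME = "scratch"   # hypothetical experiment name
    TRAIN_BN = True    # train the BatchNorm layers; the default (False)
                       # freezes them, which suits fine-tuning, not
                       # training from random weights
```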
@ha5463 If you want to start with random weights, then simply comment out the line that loads the weights (@JonathanCMitchell links to it in the comment above). By default, if you don't load any weights, you're starting with random weights.
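Concretely, the training script would look something like this (a sketch following the repo's sample scripts; `config`, `MODEL_DIR`, `dataset_train`, and `dataset_val` are assumed to be defined as in those samples):

```python
import mrcnn.model as modellib

# Build the model in training mode; every layer starts from its Keras
# default initializer, i.e. random weights.
model = modellib.MaskRCNN(mode="training", config=config, model_dir=MODEL_DIR)

# Deliberately NOT calling model.load_weights(...), so nothing overwrites
# the random initialization:
# model.load_weights(COCO_MODEL_PATH, by_name=True)

model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=10, layers='all')
```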
@waleedka Is calling an instance of `BatchNorm` with `training=False` the same as constructing `KL.BatchNormalization` with the `trainable=False` parameter? The Keras documentation doesn't mention `training=False`, but it does mention `trainable=False` for freezing layers.
If they differ, how do the two parameters interact? Does calling the layer with `training=...` always override what it was constructed with?
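To illustrate the two knobs as I understand them (worth verifying against the Keras version this repo targets): `trainable=False` freezes the layer's weights so the optimizer never updates them, while `training=False` at call time makes the layer normalize with its moving statistics instead of batch statistics.

```python
import keras.layers as KL
from keras.layers import Input

x = Input(shape=(16,))
bn = KL.BatchNormalization()

# Attribute / constructor side: freezes the layer's weights (gamma, beta),
# so the optimizer never updates them.
bn.trainable = False

# Call side: the `training` argument picks the computation mode per call.
y_infer = bn(x, training=False)  # use moving mean/variance (inference mode)
y_batch = bn(x, training=True)   # use the current batch's statistics
```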
We would appreciate it if anyone can guide us on how to initialize weights for training from scratch. We are planning to replace the Conv2D layers with SeparableConv2D layers, so we can't use the previous ".h5" file for this purpose.
Thank you
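For reference, random initialization is Keras's default: any layer you construct and never load weights into starts from its initializer. A sketch of a ResNet-style identity block built from `SeparableConv2D`, mimicking model.py's naming conventions (an illustration, not the repo's code):

```python
import keras.layers as KL

def separable_identity_block(x, filters, stage, block, train_bn=True):
    """ResNet-style identity block built from SeparableConv2D layers.
    All weights start from Keras's default initializers, i.e. random.
    Note: x must already have filters[2] channels for the residual Add."""
    f1, f2, f3 = filters
    name = 'res{}{}_branch'.format(stage, block)    # e.g. 'res2a_branch'
    bn_name = 'bn{}{}_branch'.format(stage, block)  # e.g. 'bn2a_branch'

    shortcut = x
    x = KL.SeparableConv2D(f1, (1, 1), padding='same', name=name + '2a')(x)
    x = KL.BatchNormalization(name=bn_name + '2a')(x, training=train_bn)
    x = KL.Activation('relu')(x)

    x = KL.SeparableConv2D(f2, (3, 3), padding='same', name=name + '2b')(x)
    x = KL.BatchNormalization(name=bn_name + '2b')(x, training=train_bn)
    x = KL.Activation('relu')(x)

    x = KL.SeparableConv2D(f3, (1, 1), padding='same', name=name + '2c')(x)
    x = KL.BatchNormalization(name=bn_name + '2c')(x, training=train_bn)

    x = KL.Add()([x, shortcut])
    return KL.Activation('relu')(x)
```

Build the backbone out of blocks like this and simply never call `load_weights`; the initializers give you the random start you're after.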