Closed YellowKyu closed 5 years ago
Hey @YellowKyu, I did try mobilenet_v1, but I experienced NaN loss very early in training. I tried reducing the learning rate, but it didn't help. I also tried experimenting with the Xception-like network mentioned in the paper and faced the same issue. Let me know if you find something.
@karansomaiah Hi there! Which feature maps are you feeding to the RPN and to the large separable convolution? I get a high loss (around 10~15) with ShuffleNet, but not NaN...
Have you solved it? @YellowKyu
These are the blocks:
blocks = [
    resnet_utils.Block('block1', bottleneck,
                       [(144, 24, 2, 1)] + [(144, 24, 1, 1)] * 3),
    resnet_utils.Block('block2', bottleneck,
                       [(288, 144, 2, 1)] + [(288, 144, 1, 1)] * 7),
    resnet_utils.Block('block3', bottleneck,
                       [(576, 288, 1, 1)] + [(576, 288, 1, 1)] * 3)
]
And I was passing block2 features to the RPN. Also, digging into the PSAlign code, I suspect the loss is high because the spatial scale is hard-coded for resnet101 in the original code. Scaling it appropriately for the reduced resolution of the feature maps should fix the issue.
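To make the spatial-scale point concrete, here is a minimal sketch (helper names are hypothetical, not from the Light-Head R-CNN code) of deriving PSAlign's scale from the backbone's cumulative stride instead of hard-coding the ResNet-101 value:

```python
# Hypothetical helper: compute the total downsampling factor of the backbone
# by multiplying per-layer strides, then derive PSAlign's spatial_scale from
# it, rather than assuming ResNet-101's fixed 1/16.
def cumulative_stride(strides):
    """Multiply per-layer/block strides to get the total downsampling factor."""
    total = 1
    for s in strides:
        total *= s
    return total

# Assuming a stride-2 stem conv, then block1 and block2 each stride 2 and
# block3 stride 1 (per the block list above): total stride 8.
feature_stride = cumulative_stride([2, 2, 2, 1])   # 8
spatial_scale = 1.0 / feature_stride               # 1/8, not ResNet-101's 1/16
```

The scale maps RoI coordinates in input pixels onto feature-map cells, so if block2 features are only downsampled 8x, a hard-coded 1/16 places every RoI on the wrong cells, which would explain a persistently high loss.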
hi @karansomaiah ,
For MobileNet, I fed Conv8_pointwise to the RPN and Conv11_pointwise to the large separable conv, and it converged nicely. For ShuffleNet, I also succeeded in making it converge, but I only used Stage3 for both the RPN and the large separable conv. I noticed it is related to the resolution of my feature maps, which matches what you discovered with PSAlign. Did you try to modify PSAlign?
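One way to see why those two endpoints work together: in the standard MobileNet v1 definition, the stride-2 layers are Conv2d_0 and the depthwise convs at indices 2, 4, 6, and 12, so both Conv2d_8 and Conv2d_11 sit at the same 16x-downsampled resolution. A small sketch (the helper is illustrative, not part of either codebase):

```python
# Cumulative input stride of a MobileNet v1 endpoint, assuming the standard
# architecture: Conv2d_0 has stride 2, and the depthwise convs at layer
# indices 2, 4, 6, and 12 each halve the resolution again.
def endpoint_stride(layer_index):
    stride = 2                       # Conv2d_0 (stem)
    for dw_index in (2, 4, 6, 12):
        if layer_index >= dw_index:
            stride *= 2
    return stride

rpn_stride = endpoint_stride(8)      # Conv2d_8_pointwise  -> 16
head_stride = endpoint_stride(11)    # Conv2d_11_pointwise -> 16
```

Since both feature maps share stride 16, PSAlign's ResNet-101 spatial scale of 1/16 happens to be correct for this pairing, which is consistent with it converging without modification.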
Hey guys,
Has anyone tried replacing the backbone with something like ShuffleNet or MobileNet? Since the Xception model is not released, it could be a good alternative to improve inference speed! I'm trying to add the architecture.py from https://github.com/TropComplique/shufflenet-v2-tensorflow to network_desp.py, but during training the rpn_cls_loss oscillates between 0.5 and 0.9 without decreasing further... Thanks for your help!