facebookresearch / Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

VGG16 with FPN: Workspace blob fc6_w with shape (4096, 50176) doesn't match weights file shape (4096, 25088) #908

Open lunaQi opened 5 years ago

lunaQi commented 5 years ago

I have used Detectron to train a Faster R-CNN model with a VGG16 backbone and FPN. I set the FPN dimension to 256, and the RoI transform output resolution is 14 by default. Meanwhile, the fully connected head in Fast R-CNN is the 2-layer MLP head with 4096 dimensions per layer, so I need an fc6_w blob of shape (4096, 14*14*256 = 50176). However, when the pretrained VGG16 weights file is loaded, its fc6_w blob has shape (4096, 25088), because the original network's fc6 was trained on a 7*7*512 input.
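
To make the mismatch concrete, here is my back-of-the-envelope arithmetic for the two fc6 input sizes (just an illustration, not Detectron code; the variable names are my own):

# Illustration only: compare the flattened RoI feature sizes that feed fc6
fpn_dim, roi_resolution = 256, 14    # FPN head: 256-d features, 14x14 RoI transform output
vgg_dim, vgg_resolution = 512, 7     # original VGG16: 512-d conv5 features, 7x7 pooling

fc6_in_fpn = fpn_dim * roi_resolution * roi_resolution   # 256 * 14 * 14 = 50176
fc6_in_vgg = vgg_dim * vgg_resolution * vgg_resolution   # 512 * 7 * 7   = 25088

print('workspace fc6_w shape:   ', (4096, fc6_in_fpn))   # (4096, 50176)
print('weights file fc6_w shape:', (4096, fc6_in_vgg))   # (4096, 25088)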
Therefore, I would like to replace the original fc layers with new fc layers whose workspace blobs have the correct shape, just like what is done for the ResNet backbone in Faster R-CNN. But when I read the initialize_gpu_from_weights_file() function in utils/net.py, I couldn't figure out which part I need to change so that it stops loading the fc layers from the weights file (a rough sketch of the kind of change I have in mind is after the code below).

Here is the part of utils/net.py initialize_gpu_from_weights_file() that I read:

ws_blobs = workspace.Blobs()
src_blobs = load_object(weights_file)

if 'cfg' in src_blobs:
    saved_cfg = load_cfg(src_blobs['cfg'])
    configure_bbox_reg_weights(model, saved_cfg)
if 'blobs' in src_blobs:
    # Backwards compat--dictionary used to be only blobs, now they are
    # stored under the 'blobs' key
    src_blobs = src_blobs['blobs']
# Initialize weights on GPU gpu_id only
unscoped_param_names = OrderedDict()  # Print these out in model order
for blob in model.params:
    unscoped_param_names[c2_utils.UnscopeName(str(blob))] = True
with c2_utils.NamedCudaScope(gpu_id):
    for unscoped_param_name in unscoped_param_names.keys():
        if (unscoped_param_name.find(']_') >= 0 and
                unscoped_param_name not in src_blobs):
            # Special case for sharing initialization from a pretrained
            # model:
            # If a blob named '_[xyz]_foo' is in model.params and not in
            # the initialization blob dictionary, then load source blob
            # 'foo' into destination blob '_[xyz]_foo'
            src_name = unscoped_param_name[
                unscoped_param_name.find(']_') + 2:]
        else:
            src_name = unscoped_param_name
        if src_name not in src_blobs:
            logger.info('{:s} not found'.format(src_name))
            continue
        dst_name = core.ScopedName(unscoped_param_name)
        has_momentum = src_name + '_momentum' in src_blobs
        has_momentum_str = ' [+ momentum]' if has_momentum else ''
        logger.info(
            '{:s}{:} loaded from weights file into {:s}: {}'.format(
                src_name, has_momentum_str, dst_name, src_blobs[src_name]
                .shape
            )
        )
        if dst_name in ws_blobs:
            # If the blob is already in the workspace, make sure that it
            # matches the shape of the loaded blob
            ws_blob = workspace.FetchBlob(dst_name)
            assert ws_blob.shape == src_blobs[src_name].shape, \
                ('Workspace blob {} with shape {} does not match '
                 'weights file shape {}').format(
                    src_name,
                    ws_blob.shape,
                    src_blobs[src_name].shape)

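What I imagine (though I am not sure it is correct) is adding a skip check inside the loop above, so the new fc blobs keep the model's own initialization instead of being overwritten from the weights file. A rough sketch, assuming the blob names are fc6_w, fc6_b, fc7_w and fc7_b:

# Hypothetical change (not part of Detectron): inside the loop above, right
# after the "if src_name not in src_blobs" check, add something like this so
# the fc blobs keep the model's own (correctly shaped) initialization:

        if src_name in ('fc6_w', 'fc6_b', 'fc7_w', 'fc7_b'):  # assumed blob names
            logger.info('{:s} skipped, not loaded from weights file'.format(src_name))
            continue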
Can anybody help me understand what each line of this code is actually doing, or just tell me how to change the code so that it does not load the specified layer weights? Many thanks~~~