aim-uofa / AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
https://git.io/AdelaiDet

training blendmask unmatched keys #544

Closed an99990 closed 2 years ago

an99990 commented 2 years ago

Hi, I am trying to train BlendMask (DLA_34_4x) using the provided weights, and I get the following warning:

The checkpoint state_dict contains keys that are not used by the model:
  basis_module.seg_head.0.weight
  basis_module.seg_head.1.{bias, num_batches_tracked, running_mean, running_var, weight}
  basis_module.seg_head.3.weight
  basis_module.seg_head.4.{bias, num_batches_tracked, running_mean, running_var, weight}
  basis_module.seg_head.6.{bias, weight}

This is how I load the model and the weights:

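import torch.distributed as dist
from adet.config import get_cfg
from detectron2.utils import comm
from detectron2.utils.logger import setup_logger
# Trainer is assumed to be the class defined in AdelaiDet's tools/train_net.py

# Build the config from the BlendMask DLA-34 file, then apply overrides
cfg = get_cfg()
cfg.merge_from_file("/models/AdelaiDet/configs/BlendMask/DLA_34_4x.yaml")

cfg.merge_from_list(["MODEL.WEIGHTS","/models_weights/DLA_34_4x.pth"])
cfg.SOLVER.IMS_PER_BATCH = 8
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 15
cfg.OUTPUT_DIR = "/opt/le/train2"
# Turn off the basis module loss
cfg.MODEL.BASIS_MODULE.LOSS_ON = False
cfg.freeze()
rank = comm.get_rank()
setup_logger(cfg.OUTPUT_DIR, distributed_rank=rank, name="adet")

# Single-process group (rank 0, world size 1)
dist.init_process_group(backend='nccl', init_method='tcp://localhost:23456', rank=0, world_size=1)
# print(cfg)
trainer = Trainer(cfg)
print(trainer.model)
# Load MODEL.WEIGHTS, or resume from the last checkpoint in OUTPUT_DIR if one exists
trainer.resume_or_load()
train_results = trainer.train()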

When I tried using the given .pth file in the model, I got the following:

<Some model parameters or buffers are not found in the checkpoint:
basis_module.seg_head.0.weight
basis_module.seg_head.1.{bias, running_mean, running_var, weight}
basis_module.seg_head.3.weight
basis_module.seg_head.4.{bias, running_mean, running_var, weight}
basis_module.seg_head.6.{bias, weight}>

So I tried other models and got the same issue: the basis_module.seg_head keys either cannot be loaded or are reported as missing. @Yuliang-Liu, do you have any idea why? Thank you for any help.

an99990 commented 2 years ago

I found the issue: I had set cfg.MODEL.BASIS_MODULE.LOSS_ON = False.
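
For anyone hitting the same warning, here is a rough sketch of why a config flag can produce it. It assumes (this thread does not confirm it) that the basis module only builds seg_head when LOSS_ON is true, so a checkpoint saved with the flag on carries keys that a model built with the flag off has nowhere to put. The helpers toy_model_keys and diff_keys and the key basis_module.tower.0.weight are hypothetical placeholders, not AdelaiDet or detectron2 APIs; only the seg_head key names come from the warning above.

```python
def seg_head_keys():
    """seg_head keys from the first warning, expanded from the grouped form."""
    bn = ["bias", "num_batches_tracked", "running_mean", "running_var", "weight"]
    keys = {"basis_module.seg_head.0.weight", "basis_module.seg_head.3.weight"}
    keys |= {f"basis_module.seg_head.1.{s}" for s in bn}
    keys |= {f"basis_module.seg_head.4.{s}" for s in bn}
    keys |= {f"basis_module.seg_head.6.{s}" for s in ("bias", "weight")}
    return keys


def toy_model_keys(loss_on):
    """Toy stand-in for a model's state_dict keys (assumption, not the real model)."""
    # Placeholder for layers that exist regardless of the flag.
    keys = {"basis_module.tower.0.weight"}
    if loss_on:
        # Assumed behaviour: seg_head is only created when the loss is enabled.
        keys |= seg_head_keys()
    return keys


def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unused): keys the model wants but the checkpoint lacks,
    and keys the checkpoint has but the model does not use."""
    missing = sorted(model_keys - checkpoint_keys)
    unused = sorted(checkpoint_keys - model_keys)
    return missing, unused


if __name__ == "__main__":
    checkpoint = toy_model_keys(loss_on=True)   # weights saved with the loss on
    model = toy_model_keys(loss_on=False)       # model built with the loss off
    missing, unused = diff_keys(model, checkpoint)
    print("missing from checkpoint:", missing)
    print("unused by model:", unused)
```

Swapping the two flags gives the opposite report (keys missing from the checkpoint), which looks like the second message in the original post. With a real model, the same set comparison can be run on the keys of model.state_dict() and the loaded checkpoint dict.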