lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Apache License 2.0

How does fine-tuning work? #167

Open eumentis-madhurzanwar opened 8 months ago

eumentis-madhurzanwar commented 8 months ago
```python
def load_tuning_state(self, path):
    """Load a pretrained checkpoint, keeping only parameters whose name and shape match."""
    if 'http' in path:
        state = torch.hub.load_state_dict_from_url(path, map_location='cpu')
    else:
        state = torch.load(path, map_location='cpu')

    module = dist.de_parallel(self.model)

    # TODO hard code
    if 'ema' in state:
        stat, infos = self._matched_state(module.state_dict(), state['ema']['module'])
    else:
        stat, infos = self._matched_state(module.state_dict(), state['model'])

    module.load_state_dict(stat, strict=False)
    print(f'Load model.state_dict, {infos}')
```

In the above code, while loading a weights file trained on the COCO dataset, how does it adjust the last layer? Say my architecture defines `num_classes = 1`, so my last layer should have a single output, but the COCO weights have 80. How exactly does this happen?

I also wanted to understand: if I give an altogether different weights file (trained on a different architecture), will it still load the weights for the matched state? If not, why not?
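For context on the first question, the behavior of `load_state_dict(strict=False)` itself is worth noting: it tolerates missing or unexpected *keys*, but a key present in both state dicts with a different *shape* still raises a `RuntimeError`. A minimal sketch in plain PyTorch (the `nn.Linear` here is just a stand-in for a detection head, not RT-DETR's actual module):

```python
import torch
import torch.nn as nn

model = nn.Linear(256, 1, bias=False)        # 1-class head: weight shape (1, 256)
coco_head = {'weight': torch.ones(80, 256)}  # 80-class COCO head

# strict=False does NOT help with a shape mismatch on a shared key.
try:
    model.load_state_dict(coco_head, strict=False)
except RuntimeError as err:
    print('shape mismatch rejected:', type(err).__name__)

# Dropping the mismatched tensor first (as the filtering step does) succeeds;
# the 1-class head simply keeps its randomly initialized weights.
model.load_state_dict({}, strict=False)
```

This is why mismatched tensors have to be filtered out *before* the `load_state_dict` call.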

lyuwenyu commented 8 months ago

`_matched_state` will filter out weights that are mismatched in shape or name.
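In other words, only tensors whose key exists in the current model *and* whose shape matches are kept; everything else (such as the 80-class COCO classification head when `num_classes = 1`) is skipped and left at its initialization. A minimal sketch of that filtering (a hypothetical reimplementation for illustration; `matched_state` and the tensor names/shapes below are assumptions, not the repository's actual code):

```python
import torch

def matched_state(model_state: dict, ckpt_state: dict):
    """Keep only checkpoint tensors whose key exists in the model
    and whose shape matches; report what was kept vs. skipped."""
    matched, missed = {}, []
    for name, tensor in model_state.items():
        if name in ckpt_state and ckpt_state[name].shape == tensor.shape:
            matched[name] = ckpt_state[name]
        else:
            missed.append(name)
    return matched, {'matched': len(matched), 'missed': missed}

# Example: a backbone tensor that matches vs. a 1-class head against an 80-class COCO head.
model_state = {
    'backbone.weight': torch.zeros(16, 3),
    'head.class_embed.weight': torch.zeros(1, 256),   # num_classes = 1
}
ckpt_state = {
    'backbone.weight': torch.ones(16, 3),             # name and shape match -> loaded
    'head.class_embed.weight': torch.ones(80, 256),   # 80-class COCO head -> skipped
}
stat, infos = matched_state(model_state, ckpt_state)
```

The filtered `stat` is then safe to pass to `load_state_dict(..., strict=False)`: the matched backbone weights transfer, while the mismatched head stays randomly initialized and is learned during fine-tuning. The same logic answers the second question: weights from a different architecture load only where names and shapes happen to coincide.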