facebookresearch / deepcluster

Deep Clustering for Unsupervised Learning of Visual Features

Resuming from checkpoint error : RuntimeError: OrderedDict mutated during iteration #47

Closed AhmadM-DL closed 4 years ago

AhmadM-DL commented 4 years ago

Hello there,

I get the following error when resuming from a checkpoint: RuntimeError: OrderedDict mutated during iteration.

It turns out that one can't delete entries from a dictionary while iterating over it, according to a Stack Overflow answer.

This happens in main.py, in the checkpoint-resuming block:

for key in checkpoint['state_dict']:           # Line 101
    if 'top_layer' in key:                     # Line 102
        del checkpoint['state_dict'][key]      # Line 103

This can be fixed by iterating over a copy of the state_dict rather than the original, as follows:

for key in checkpoint['state_dict'].copy():   # <------------
    if 'top_layer' in key:
        del checkpoint['state_dict'][key]

Please correct me if I am wrong.
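
For anyone who wants to reproduce this without a real checkpoint, here is a minimal sketch using a plain OrderedDict. The make_state_dict helper and its keys and values are hypothetical stand-ins for checkpoint['state_dict'], not taken from main.py.

```python
from collections import OrderedDict


def make_state_dict():
    # Hypothetical stand-in for checkpoint['state_dict']; the keys and
    # values are illustrative only.
    return OrderedDict([
        ("features.0.weight", 1),
        ("top_layer.weight", 2),
        ("top_layer.bias", 3),
    ])


# Deleting entries while iterating over the dict itself fails in Python 3.
state_dict = make_state_dict()
error_message = None
try:
    for key in state_dict:
        if "top_layer" in key:
            del state_dict[key]
except RuntimeError as exc:
    error_message = str(exc)

# Iterating over a snapshot of the keys leaves the live dict free to change.
fixed = make_state_dict()
for key in list(fixed):
    if "top_layer" in key:
        del fixed[key]
```

Here list(fixed) copies only the keys, which is enough to make the deletion safe; .copy() on the dict works as well.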

mathildecaron31 commented 4 years ago

Hi, this error doesn't appear with the Python and PyTorch versions I used for my experiments. However, if you use Python 3, the following code should work:

# remove top_layer parameters from checkpoint
state_dict = checkpoint['state_dict'].copy()
to_delete = []
for key in state_dict:
    if 'top_layer' in key:
        to_delete.append(key)
for key in to_delete:
    del state_dict[key]
model.load_state_dict(state_dict)
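
A possible alternative, sketched here under assumptions, is to build a new filtered dictionary instead of deleting keys, so nothing is mutated during iteration. The strip_top_layer helper and the example keys are hypothetical and are not part of the deepcluster code.

```python
from collections import OrderedDict


def strip_top_layer(state_dict, marker="top_layer"):
    """Return a new OrderedDict without the keys that contain `marker`.

    Hypothetical helper for illustration; the input dict is left unchanged.
    """
    return OrderedDict(
        (key, value) for key, value in state_dict.items() if marker not in key
    )


# Illustrative checkpoint; a real one would hold tensors rather than ints.
checkpoint = {
    "state_dict": OrderedDict([
        ("features.0.weight", 1),
        ("top_layer.weight", 2),
        ("top_layer.bias", 3),
    ])
}

filtered = strip_top_layer(checkpoint["state_dict"])
```

The filtered result could then be passed to model.load_state_dict in place of the manually pruned copy.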