zhmiao / OpenLongTailRecognition-OLTR

PyTorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)
BSD 3-Clause "New" or "Revised" License

Many_shot_accuracy_top1: nan on my own dataset #64

Closed — rajanieprabha closed this issue 4 years ago

rajanieprabha commented 4 years ago

Hi,

I am using your code on my custom dataset. However, I am getting nan for Many_shot_accuracy_top1 on the val set. I tried changing the learning rate too, but it doesn't seem to help.

{'criterions': {'PerformanceLoss': {'def_file': 'loss/SoftmaxLoss.py',
                                    'loss_params': {},
                                    'optim_params': None,
                                    'weight': 1.0}},
 'memory': {'centroids': False, 'init_centroids': False},
 'networks': {'classifier': {'def_file': 'models/DotProductClassifier.py',
                             'optim_params': {'lr': 1,
                                              'momentum': 0.9,
                                              'weight_decay': 0},
                             'params': {'dataset': 'ImageNet_LT',
                                        'in_dim': 512,
                                        'num_classes': 201,
                                        'stage1_weights': False}},
              'feat_model': {'def_file': 'models/ResNet10Feature.py',
                             'fix': False,
                             'optim_params': {'lr': 1,
                                              'momentum': 0.1,
                                              'weight_decay': 0},
                             'params': {'caffe': False,
                                        'dataset': 'ImageNet_LT',
                                        'dropout': None,
                                        'stage1_weights': False,
                                        'use_fc': False,
                                        'use_modulatedatt': False}}},
 'training_opt': {'batch_size': 128,
                  'dataset': 'ImageNet_LT',
                  'display_step': 10,
                  'feature_dim': 512,
                  'log_dir': 'logs/ImageNet_LT/stage1',
                  'num_classes': 201,
                  'num_epochs': 500,
                  'num_workers': 4,
                  'open_threshold': 0.01,
                  'sampler': None,
                  'scheduler_params': {'gamma': 0.1, 'step_size': 1000}}}
Loading data from /pylon5/pscstaff/rajanie/MLStamps/long-tail/OpenLongTailRecognition-OLTR/OLTRDataset/OLTRDataset_1/train.txt
Use data transformation: Compose(
    RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR)
    RandomHorizontalFlip(p=0.5)
    ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
No sampler.
Shuffle is True.
Loading data from /pylon5/pscstaff/rajanie/MLStamps/long-tail/OpenLongTailRecognition-OLTR/OLTRDataset/OLTRDataset_1/val.txt
Use data transformation: Compose(
    Resize(size=256, interpolation=PIL.Image.BILINEAR)
    CenterCrop(size=(224, 224))
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
No sampler.
Shuffle is True.
Using 1 GPUs.
Loading Scratch ResNet 10 Feature Model.
No Pretrained Weights For Feature Model.
Loading Dot Product Classifier.
Random initialized classifier weights.
Using steps for training.
Initializing model optimizer.
Loading Softmax Loss.
 Phase: val 

 Evaluation_accuracy_micro_top1: 0.312 
 Averaged F-measure: 0.100 
 Many_shot_accuracy_top1: nan Median_shot_accuracy_top1: 0.630 Low_shot_accuracy_top1: 0.096 

Epoch: [72/500] Step: 1  Minibatch_loss_performance: 2.645 Minibatch_accuracy_micro: 0.344
Epoch: [72/500] Step: 2  Minibatch_loss_performance: 2.499 Minibatch_accuracy_micro: 0.344
Epoch: [72/500] Step: 3  Minibatch_loss_performance: 2.736 Minibatch_accuracy_micro: 0.305
Epoch: [72/500] Step: 4  Minibatch_loss_performance: 2.587 Minibatch_accuracy_micro: 0.336
Epoch: [72/500] Step: 5  Minibatch_loss_performance: 2.884 Minibatch_accuracy_micro: 0.289
Epoch: [72/500] Step: 6  Minibatch_loss_performance: 2.863 Minibatch_accuracy_micro: 0.281
Epoch: [72/500] Step: 7  Minibatch_loss_performance: 2.513 Minibatch_accuracy_micro: 0.367
Epoch: [72/500] Step: 8  Minibatch_loss_performance: 2.918 Minibatch_accuracy_micro: 0.312
Epoch: [72/500] Step: 9  Minibatch_loss_performance: 2.623 Minibatch_accuracy_micro: 0.352

Can you give me any pointers to debug this? I didn't make any changes to the code.

rajanieprabha commented 4 years ago

Ohh, solved it. I had to fix the thresholds in the utils file.
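
For anyone hitting the same nan: the many/median/low split is computed from per-class training counts compared against fixed thresholds in the utils file (roughly, many-shot above ~100 training images and low-shot below ~20 in the released code, going by my reading of it). If no class in a custom dataset clears the many-shot threshold, that bucket is empty and its mean comes out as nan. Below is a minimal sketch of the logic with the thresholds exposed so they can be lowered for a smaller dataset; shot_acc_sketch and the exact default values are my approximation, not the repo's verbatim code.

# Sketch of the shot-accuracy split (assumed defaults; check your copy of utils.py).
import numpy as np

def shot_acc_sketch(preds, labels, train_labels,
                    many_shot_thr=100, low_shot_thr=20):
    # Count how often each class appears in the *training* split.
    train_class_count = np.bincount(train_labels)
    many, median, low = [], [], []
    for c in np.unique(labels):
        mask = labels == c
        class_acc = (preds[mask] == c).mean()
        n_train = train_class_count[c] if c < len(train_class_count) else 0
        if n_train > many_shot_thr:
            many.append(class_acc)
        elif n_train < low_shot_thr:
            low.append(class_acc)
        else:
            median.append(class_acc)
    # np.mean([]) is nan, which is what shows up as
    # "Many_shot_accuracy_top1: nan" when the many-shot bucket is empty.
    return np.mean(many), np.mean(median), np.mean(low)

Lowering many_shot_thr (or raising low_shot_thr) to match the class sizes in your own dataset makes all three buckets non-empty, which is effectively what fixing the thresholds in the utils file does.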