RuntimeError: invalid argument 5: k not in range for dimension

WuChannn commented 4 years ago

@Duankaiwen hello, kaiwen

When I run my own trained model on my test dataset, I hit this problem. Here is my log:

cfg_file: config/CenterNet-52.json
loading all datasets...
split: test
loading from cache file: cache/ks_test.pkl
loading annotations into memory...
Done (t=1.83s)
creating index...
index created!
system config...
{'batch_size': 3, 'cache_dir': 'cache', 'chunk_sizes': [3], 'config_dir': 'config', 'data_dir': './data', 'data_rng': <mtrand.RandomState object at 0x7ff5fa1b58b8>, 'dataset': 'KS', 'decay_rate': 10, 'display': 50, 'learning_rate': 0.00025, 'max_iter': 480000, 'nnet_rng': <mtrand.RandomState object at 0x7ff5fa1b5900>, 'opt_algo': 'adam', 'prefetch_size': 6, 'pretrain': None, 'result_dir': 'results', 'sampling_function': 'kp_detection', 'snapshot': 5000, 'snapshot_name': 'CenterNet-52', 'stepsize': 450000, 'test_split': 'test', 'train_split': 'train', 'val_iter': 100, 'val_split': 'test', 'weight_decay': False, 'weight_decay_rate': 1e-05, 'weight_decay_type': 'l2'}
db config...
{'ae_threshold': 0.5, 'border': 128, 'categories': 6, 'data_aug': True, 'gaussian_bump': True, 'gaussian_iou': 0.7, 'gaussian_radius': -1, 'input_size': [511, 511], 'kp_categories': 1, 'lighting': True, 'max_per_image': 100, 'merge_bbox': False, 'nms_algorithm': 'exp_soft_nms', 'nms_kernel': 3, 'nms_threshold': 0.5, 'output_sizes': [[128, 128]], 'rand_color': True, 'rand_crop': True, 'rand_pushes': False, 'rand_samples': False, 'rand_scale_max': 1.4, 'rand_scale_min': 0.6, 'rand_scale_step': 0.1, 'rand_scales': array([0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2, 1.3]), 'special_crop': False, 'test_scales': [0.5], 'top_k': 5, 'weight_exp': 8}
loading parameters at iteration: 5000
building neural network...
module_file: models.CenterNet-52
total parameters: 104787098
loading parameters...
loading model from cache/nnet/CenterNet-52/CenterNet-52_5000.pkl
locating kps: 0%| | 0/2772 [00:00<?, ?it/s]
/root/data/anaconda2/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")

Traceback (most recent call last):
File "test.py", line 94, in <module>
test(testing_db, args.split, args.testiter, args.debug, args.suffix)
File "test.py", line 61, in test
testing(db, nnet, result_dir, debug=debug)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 321, in testing
return globals()[system_configs.sampling_function](db, nnet, result_dir, debug=debug)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 129, in kp_detection
dets, center = decode_func(nnet, images, K, ae_threshold=ae_threshold, kernel=nms_kernel)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 54, in kp_decode
detections, center = nnet.test([images], ae_threshold=ae_threshold, K=K, kernel=kernel)
File "/root/data/ks/code/CenterNet_duan/nnet/py_factory.py", line 114, in test
return self.model(*xs, **kwargs)
File "/root/data/anaconda2/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/root/data/ks/code/CenterNet_duan/nnet/py_factory.py", line 32, in forward
return self.module(*xs, **kwargs)
File "/root/data/anaconda2/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/root/data/ks/code/CenterNet_duan/models/py_utils/kp.py", line 290, in forward
return self._test(*xs, **kwargs)
File "/root/data/ks/code/CenterNet_duan/models/py_utils/kp.py", line 285, in _test
return self._decode(*outs[-8:], **kwargs)
File "/root/data/ks/code/CenterNet_duan/models/py_utils/kp_utils.py", line 148, in _decode
scores, inds = torch.topk(scores, num_dets)
RuntimeError: invalid argument 5: k not in range for dimension at /opt/conda/conda-bld/pytorch_1532581333611/work/aten/src/THC/generic/THCTensorTopK.cu:21

I'm looking forward to your reply. Thank you.

Duankaiwen commented 4 years ago

@WuChannn 'num_dets' should be less than or equal to 'top_k' * 'top_k'
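
To make the constraint concrete, here is a minimal pure-Python sketch. It assumes the decoder pairs each of the `top_k` top-left corners with each of the `top_k` bottom-right corners, so only `top_k * top_k` candidate scores exist per image. `select_top_pairs` and the score averaging are hypothetical stand-ins for illustration, not the repository's `_decode`.

```python
import heapq
import itertools


def select_top_pairs(tl_scores, br_scores, num_dets):
    """Keep the num_dets best corner pairs (hypothetical stand-in for _decode)."""
    # Every top-left corner is paired with every bottom-right corner,
    # so there are len(tl_scores) * len(br_scores) candidates in total.
    pair_scores = [(tl + br) / 2 for tl, br in itertools.product(tl_scores, br_scores)]
    if num_dets > len(pair_scores):
        # A top-k selection cannot return more items than exist.
        raise ValueError(
            f"num_dets={num_dets} exceeds the {len(pair_scores)} available pairs"
        )
    return heapq.nlargest(num_dets, pair_scores)


# Illustrative scores only; top_k = 5 gives 5 * 5 = 25 candidate pairs.
tl_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
br_scores = [0.9, 0.7, 0.5, 0.3, 0.1]

best = select_top_pairs(tl_scores, br_scores, num_dets=8)  # fine: 8 <= 25
try:
    select_top_pairs(tl_scores, br_scores, num_dets=26)  # 26 > 25
    too_many_failed = False
except ValueError:
    too_many_failed = True
```

With `top_k = 5`, any `num_dets` above 25 would fail in this sketch, which matches the rule stated above.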

WuChannn commented 4 years ago

@Duankaiwen Hello, kaiwen. I changed top_k to 6, and num_dets in the following code is 8; however, the number of predicted center points is 0.

def _decode(tl_heat, br_heat, tl_tag, br_tag, tl_regr, br_regr, ct_heat, ct_regr, K=100, kernel=1, ae_threshold=1, num_dets=8):

and the full log is:

cfg_file: config/CenterNet-52.json
loading all datasets...
split: test
loading from cache file: cache/ks_test.pkl
loading annotations into memory...
Done (t=1.75s)
creating index...
index created!
system config...
{'batch_size': 3, 'cache_dir': 'cache', 'chunk_sizes': [3], 'config_dir': 'config', 'data_dir': './data', 'data_rng': <mtrand.RandomState object at 0x7fec9843c8b8>, 'dataset': 'KS', 'decay_rate': 10, 'display': 50, 'learning_rate': 0.00025, 'max_iter': 480000, 'nnet_rng': <mtrand.RandomState object at 0x7fec9843c900>, 'opt_algo': 'adam', 'prefetch_size': 6, 'pretrain': None, 'result_dir': 'results', 'sampling_function': 'kp_detection', 'snapshot': 5000, 'snapshot_name': 'CenterNet-52', 'stepsize': 450000, 'test_split': 'test', 'train_split': 'train', 'val_iter': 100, 'val_split': 'test', 'weight_decay': False, 'weight_decay_rate': 1e-05, 'weight_decay_type': 'l2'}
db config...
{'ae_threshold': 0.5, 'border': 128, 'categories': 6, 'data_aug': True, 'gaussian_bump': True, 'gaussian_iou': 0.7, 'gaussian_radius': -1, 'input_size': [511, 511], 'kp_categories': 1, 'lighting': True, 'max_per_image': 100, 'merge_bbox': False, 'nms_algorithm': 'exp_soft_nms', 'nms_kernel': 3, 'nms_threshold': 0.5, 'output_sizes': [[128, 128]], 'rand_color': True, 'rand_crop': True, 'rand_pushes': False, 'rand_samples': False, 'rand_scale_max': 1.4, 'rand_scale_min': 0.6, 'rand_scale_step': 0.1, 'rand_scales': array([0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2, 1.3]), 'special_crop': False, 'test_scales': [0.5], 'top_k': 6, 'weight_exp': 8}
loading parameters at iteration: 5000
building neural network...
module_file: models.CenterNet-52
total parameters: 104787098
loading parameters...
loading model from cache/nnet/CenterNet-52/CenterNet-52_5000.pkl
locating kps: 0%| | 0/2772 [00:00<?, ?it/s]
/root/data/anaconda2/envs/CenterNet/lib/python3.6/site-packages/torch/nn/modules/upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")

Traceback (most recent call last):
File "test.py", line 94, in <module>
test(testing_db, args.split, args.testiter, args.debug, args.suffix)
File "test.py", line 61, in test
testing(db, nnet, result_dir, debug=debug)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 321, in testing
return globals()[system_configs.sampling_function](db, nnet, result_dir, debug=debug)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 152, in kp_detection
center_points = np.concatenate(center_points, axis=1)
ValueError: need at least one array to concatenate

Duankaiwen commented 4 years ago

@WuChannn Comment out line 147 in test/coco.py

WuChannn commented 4 years ago

@Duankaiwen Thanks a lot. I hope this helps others.

WuChannn commented 4 years ago

@Duankaiwen Hello, kaiwen.

When I test my trained model on my own images, the saved images and results.json are always empty. I printed the output of decode_func() at line 129 of test/coco.py and got:

dets: [[[ 3.2349873e+01  6.6429825e+01  1.5968109e+02  2.2367534e+02
    4.6260887e-01  4.7123554e-01  4.5398217e-01  3.0000000e+00]
  [ 3.2349873e+01  6.6429825e+01  1.3143454e+02  2.2368355e+02
    4.1034997e-01  4.7123554e-01  3.4946439e-01  3.0000000e+00]
  [ 3.2349873e+01  6.6429825e+01  1.5440904e+02  2.2367509e+02
    4.0282422e-01  4.7123554e-01  3.3441293e-01  3.0000000e+00]
  [ 8.8468765e+01  6.0515965e+01  1.5968109e+02  2.2367534e+02
   -1.0000000e+00  4.9183124e-01  4.5398217e-01  3.0000000e+00]
  [ 8.8468765e+01  6.0515965e+01  5.4405136e+01  2.2369923e+02
   -1.0000000e+00  4.9183124e-01  3.7416297e-01  3.0000000e+00]
  [ 8.8468765e+01  6.0515965e+01  1.3143454e+02  2.2368355e+02
   -1.0000000e+00  4.9183124e-01  3.4946439e-01  3.0000000e+00]
  [ 8.8468765e+01  6.0515965e+01  1.5968109e+02  2.2367534e+02
   -1.0000000e+00  4.9183124e-01  4.1429698e-01  3.0000000e+00]
  [ 8.8468765e+01  6.0515965e+01  4.5412842e+01  2.2369164e+02
   -1.0000000e+00  4.9183124e-01  3.3460695e-01  3.0000000e+00]]

 [[ 1.8584467e-03  7.0376465e+01  7.1446548e+01  2.2367661e+02
    4.0953285e-01  3.7950289e-01  4.3956283e-01  3.0000000e+00]
  [ 6.4745412e+00  5.5450581e+01  1.5968164e+02  2.2367366e+02
    3.9658430e-01  3.5217959e-01  4.4098902e-01  3.0000000e+00]
  [ 6.4745412e+00  5.5450581e+01  7.1446548e+01  2.2367661e+02
    3.9587122e-01  3.5217959e-01  4.3956283e-01  3.0000000e+00]
  [ 2.7386642e+01  7.0421577e+01  1.5968164e+02  2.2367366e+02
    3.9371496e-01  3.4644091e-01  4.4098902e-01  3.0000000e+00]
  [ 2.7386642e+01  7.0421577e+01  7.1446548e+01  2.2367661e+02
    3.9300185e-01  3.4644091e-01  4.3956283e-01  3.0000000e+00]
  [ 1.8584467e-03  7.0376465e+01  6.2447205e+01  2.2368639e+02
    3.8339013e-01  3.7950289e-01  3.8727733e-01  3.0000000e+00]
  [ 1.8584467e-03  7.0376465e+01  8.8428604e+01  2.2368767e+02
    3.8202363e-01  3.7950289e-01  3.8454437e-01  3.0000000e+00]
  [ 1.8584467e-03  7.0376465e+01  1.2541402e+02  2.2366917e+02
    3.8006711e-01  3.7950289e-01  3.8063130e-01  3.0000000e+00]]]
center: [[[1.2947668e+02 1.1746306e+02 3.0000000e+00 1.6965330e-01]
  [1.2948651e+02 1.2747198e+02 3.0000000e+00 1.6731572e-01]
  [1.2948158e+02 1.2247352e+02 3.0000000e+00 1.5981433e-01]
  [1.2949446e+02 1.4548326e+02 3.0000000e+00 1.5187018e-01]
  [1.2950827e+02 1.0249621e+02 3.0000000e+00 1.5074971e-01]
  [1.2948874e+02 1.6947191e+02 3.0000000e+00 1.4942038e-01]]

 [[1.3146968e+02 1.2945517e+02 3.0000000e+00 1.7543328e-01]
  [1.3248714e+02 1.4547208e+02 3.0000000e+00 1.7366149e-01]
  [1.3247891e+02 1.3347208e+02 3.0000000e+00 1.7068261e-01]
  [1.3147018e+02 1.0245686e+02 3.0000000e+00 1.6493557e-01]
  [1.3248189e+02 1.1647341e+02 3.0000000e+00 1.6448633e-01]
  [1.3147867e+02 9.6464920e+01 3.0000000e+00 1.5871526e-01]]]

Could you please briefly explain each element of these vectors? I would also like to know why the last element in dets and the third element in center are always 3.

I also printed detections just before valid_ind = detections[:,4]> -1. The fifth element is always -1, so valid_detections is empty after that filter. As a result, nothing is written to the results and the traceback below ends in an IndexError.

detections:
[[ 1.22197296e+02 2.34422729e+02 6.24000000e+02 8.32000000e+02 -1.00000000e+00 4.71235543e-01 4.53982174e-01 3.00000000e+00]
 [ 1.22197296e+02 2.34422729e+02 5.17916687e+02 8.32000000e+02 -1.00000000e+00 4.71235543e-01 3.49464387e-01 3.00000000e+00]
 [ 1.22197296e+02 2.34422729e+02 6.09671082e+02 8.32000000e+02 -1.00000000e+00 4.71235543e-01 3.34412932e-01 3.00000000e+00]
 [ 3.46322113e+02 2.10793686e+02 6.24000000e+02 8.32000000e+02 -1.00000000e+00 4.91831243e-01 4.53982174e-01 3.00000000e+00]
 [ 3.46322113e+02 2.10793686e+02 2.10280502e+02 8.32000000e+02 -1.00000000e+00 4.91831243e-01 3.74162972e-01 3.00000000e+00]
 [ 3.46322113e+02 2.10793686e+02 5.17916687e+02 8.32000000e+02 -1.00000000e+00 4.91831243e-01 3.49464387e-01 3.00000000e+00]
 [ 3.46322113e+02 2.10793686e+02 6.24000000e+02 8.32000000e+02 -1.00000000e+00 4.91831243e-01 4.14296985e-01 3.00000000e+00]
 [ 3.46322113e+02 2.10793686e+02 1.74367523e+02 8.32000000e+02 -1.00000000e+00 4.91831243e-01 3.34606946e-01 3.00000000e+00]
 [ 3.46660339e+02 2.50191681e+02 6.24000000e+02 8.32000000e+02 -1.00000000e+00 3.79502892e-01 4.39562827e-01 3.00000000e+00]
 [ 0.00000000e+00 1.90554764e+02 6.06142273e+02 8.32000000e+02 -1.00000000e+00 3.52179587e-01 4.40989017e-01 3.00000000e+00]
 [ 3.46660339e+02 1.90554764e+02 6.06142273e+02 8.32000000e+02 -1.00000000e+00 3.52179587e-01 4.39562827e-01 3.00000000e+00]
 [ 0.00000000e+00 2.50371918e+02 5.22624573e+02 8.32000000e+02 -1.00000000e+00 3.46440911e-01 4.40989017e-01 3.00000000e+00]
 [ 3.46660339e+02 2.50371918e+02 5.22624573e+02 8.32000000e+02 -1.00000000e+00 3.46440911e-01 4.39562827e-01 3.00000000e+00]
 [ 3.82601471e+02 2.50191681e+02 6.24000000e+02 8.32000000e+02 -1.00000000e+00 3.79502892e-01 3.87277335e-01 3.00000000e+00]
 [ 2.78838257e+02 2.50191681e+02 6.24000000e+02 8.32000000e+02 -1.00000000e+00 3.79502892e-01 3.84544373e-01 3.00000000e+00]
 [ 1.31127762e+02 2.50191681e+02 6.24000000e+02 8.32000000e+02 -1.00000000e+00 3.79502892e-01 3.80631298e-01 3.00000000e+00]]

Traceback (most recent call last):
File "test.py", line 94, in <module>
test(testing_db, args.split, args.testiter, args.debug, args.suffix)
File "test.py", line 61, in test
testing(db, nnet, result_dir, debug=debug)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 327, in testing
return globals()[system_configs.sampling_function](db, nnet, result_dir, debug=debug)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 323, in kp_detection
db.evaluate(result_json, cls_ids, image_ids)
File "/root/data/ks/code/CenterNet_duan/db/ks.py", line 173, in evaluate
coco_dets = coco.loadRes(result_json)
File "data/coco/PythonAPI/pycocotools/coco.py", line 318, in loadRes
if 'caption' in anns[0]:
IndexError: list index out of range
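
The failure chain described above (the fifth element of every row is -1, the filter keeps nothing, and the evaluator then indexes into an empty list) can be reproduced with a small sketch. The rows are shortened copies of the printed detections. `keep_valid` and `safe_to_evaluate` are hypothetical helpers, and the guard is only a suggestion, not something the repository is known to do.

```python
def keep_valid(detections, column=4):
    """List-based equivalent of `detections[detections[:, 4] > -1]`."""
    return [row for row in detections if row[column] > -1]


def safe_to_evaluate(results):
    """Hypothetical guard: an empty result list cannot be evaluated."""
    return len(results) > 0


# Two rows from the printed output; the fifth element is -1 in both.
detections = [
    [122.197296, 234.422729, 624.0, 832.0, -1.0, 0.471235543, 0.453982174, 3.0],
    [346.322113, 210.793686, 624.0, 832.0, -1.0, 0.491831243, 0.453982174, 3.0],
]

valid = keep_valid(detections)
if not safe_to_evaluate(valid):
    print("no valid detections; skipping evaluation instead of failing on anns[0]")
```

Checking for an empty list before calling the evaluator turns the IndexError into a readable message, which makes the underlying cause easier to spot.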

Looking forward to your reply. Thanks a lot.

WuChannn commented 4 years ago

This happened mainly because of the wrong ground-truth box format: (x, y, w, h), not (x1, y1, x2, y2)
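
For anyone hitting the same mismatch, the two box conventions convert as shown in this generic sketch. The helper names are hypothetical, and which direction is needed depends on what the annotation file stores and what the dataset loader expects.

```python
def xywh_to_xyxy(box):
    """(x, y, w, h) -> (x1, y1, x2, y2), with (x, y) the top-left corner."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xyxy_to_xywh(box):
    """(x1, y1, x2, y2) -> (x, y, w, h)."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


corners = xywh_to_xyxy([10, 20, 30, 40])
round_trip = xyxy_to_xywh(corners)
```

A quick sanity check on a few annotations (for example, that x2 > x1 and y2 > y1 after loading) can reveal this kind of mix-up before training.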

Duankaiwen commented 4 years ago

@WuChannn
1. 'top_k' may be too small.
2. How many iterations has your model been trained for?
3. The class numbers in your own dataset should start from 1.
4. Modify the code at line 48 of db/coco.py to adapt it to your own dataset.

Duankaiwen commented 4 years ago

@WuChannn Oh, I see.

WuChannn commented 4 years ago

@Duankaiwen Hello, kaiwen

I would like to know whether 'kp_detection' means key point detection, and why "kp_categories" is set to 1 in CenterNet-xx.json.

Also, where is 'db.class_name', as used in 'cat_name = db.class_name(j)' in test/coco.py, defined?

Thanks a lot.

Duankaiwen commented 4 years ago

@WuChannn 'kp_detection' is just a function name in sample/coco.py. 'kp_categories' is not used, so you can delete it. 'db.class_name' is defined in db/coco.py; you can step through the code line by line with pdb.
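
As a complement to stepping through with pdb, Python can also report where a method is defined. The sketch below uses a stand-in class (`FakeDB` is hypothetical, not the repository's dataset class); in the real test script the same helper could presumably be called on `db.class_name`.

```python
def where_defined(func):
    """Return (file name, first line number) of a function or bound method."""
    # Bound methods keep the underlying function in __func__.
    code = getattr(func, "__func__", func).__code__
    return code.co_filename, code.co_firstlineno


class FakeDB:
    """Hypothetical stand-in for the dataset object."""

    def class_name(self, cid):
        return f"class_{cid}"


db = FakeDB()
filename, lineno = where_defined(db.class_name)
print(f"class_name is defined in {filename}, line {lineno}")

# To step through interactively instead, place this line just before the call
# of interest and use `s` (step) and `n` (next) at the prompt:
# import pdb; pdb.set_trace()
```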

WuChannn commented 4 years ago

@Duankaiwen Hello, kaiwen:

I came across a strange problem: when I set test_scales to [1], nothing is saved to results.json in debug mode, and I get this error:

Traceback (most recent call last):
File "test.py", line 94, in <module>
test(testing_db, args.split, args.testiter, args.debug, args.suffix)
File "test.py", line 61, in test
testing(db, nnet, result_dir, debug=debug)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 327, in testing
return globals()[system_configs.sampling_function](db, nnet, result_dir, debug=debug)
File "/root/data/ks/code/CenterNet_duan/test/ks.py", line 323, in kp_detection
db.evaluate(result_json, cls_ids, image_ids)
File "/root/data/ks/code/CenterNet_duan/db/ks.py", line 173, in evaluate
coco_dets = coco.loadRes(result_json)
File "data/coco/PythonAPI/pycocotools/coco.py", line 318, in loadRes
if 'caption' in anns[0]:
IndexError: list index out of range

However, when I set test_scales to [0.1], results.json does contain results in debug mode, although they are wrong.

I can't figure out why, so I'm asking for your help. Looking forward to your reply.

Duankaiwen / CenterNet

RuntimeError: invalid argument 5: k not in range for dimension #131