facebookresearch / AVID-CMA

Audio Visual Instance Discrimination with Cross-Modal Agreement

Error in eval-action-recg-linear.py #13

Open HrkAsnm opened 2 years ago

HrkAsnm commented 2 years ago

Hello. Thank you for your excellent work and release of the code.

When I run python eval-action-recg-linear.py configs/benchmark/kinetics/8x224x224-linear.yaml configs/main/avid/kinetics/Cross-N1024.yaml, an error occurs inside the "run_phase" function shown below.

Below is the code where the error occurs, followed by its output. The print statements were added by me just before and after the point where the error happens.

error code

def run_phase(phase, loader, model, optimizer, epoch, args, cfg, logger):

    .
    .
    .

        total_loss = 0.
        for ft in feature_names:
            if phase == 'test_dense':
                confidence = softmax(logits[ft]).view(batch_size, clips_per_sample, -1).mean(1)
                target_tiled = target.unsqueeze(1).repeat(1, clips_per_sample).view(-1)
                loss = criterion(logits[ft], target_tiled)
            else:
                confidence = softmax(logits[ft])
                print(logits[ft])
                print(confidence)
                print(target)
                print('---------------')
                loss = criterion(logits[ft], target)
                print(loss)
            total_loss += loss

            with torch.no_grad():
                acc1, acc5 = metrics_utils.accuracy(confidence, target, topk=(1, 5))
                loss_meters[ft].update(loss.item(), target.size(0))
                top1_meters[ft].update(acc1[0].item(), target.size(0))
                top5_meters[ft].update(acc5[0].item(), target.size(0))

output

==============================   Test DB   ==============================
/home/haruka-asanuma/.pyenv/versions/anaconda3-2022.05/lib/python3.9/site-packages/torch/utils/data/dataloader.py:557: UserWarning: This DataLoader will create 32 worker processes in total. Our suggested max number of worker in current system is 16, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
Kinetics dataset
 - Root: /home/haruka-asanuma/kinetics-datasets//val
 - Subset: val
 - Num videos: 19881
 - Num samples: 497025
 - Example video: /home/haruka-asanuma/kinetics-datasets//val/EzAPYkvFxO0_000355_000365.mp4

==============================   Dense DB   ==============================
Kinetics dataset
 - Root: /home/haruka-asanuma/kinetics-datasets//val
 - Subset: val
 - Num videos: 19881
 - Num samples: 198810
 - Example video: /home/haruka-asanuma/kinetics-datasets//val/EzAPYkvFxO0_000355_000365.mp4

test: Epoch 12
tensor([[-0.4370, -2.4042, -0.4472,  ..., -0.2500, -2.4469,  0.0503],
        [-1.2650, -3.2401, -0.3476,  ...,  0.3754, -3.4689,  1.7811],
        [-1.1321, -4.3601, -1.2206,  ..., -1.4128, -2.2114,  1.1148],
        ...,
        [-0.7892, -3.8467, -1.6175,  ..., -0.8994, -1.7921,  0.7191],
        [-0.6565, -2.5375,  1.0547,  ..., -0.4171, -2.0549,  1.8172],
        [ 0.4295, -2.2780, -1.2851,  ..., -1.4298, -0.6527,  0.6132]],
       device='cuda:0')
tensor([[4.6641e-04, 6.5228e-05, 4.6168e-04,  ..., 5.6230e-04, 6.2498e-05,
         7.5932e-04],
        [1.8080e-04, 2.5084e-05, 4.5248e-04,  ..., 9.3242e-04, 1.9954e-05,
         3.8025e-03],
        [1.2092e-04, 4.7928e-06, 1.1068e-04,  ..., 9.1322e-05, 4.1092e-05,
         1.1437e-03],
        ...,
        [9.6051e-05, 4.5151e-06, 4.1955e-05,  ..., 8.6029e-05, 3.5234e-05,
         4.3407e-04],
        [3.6984e-04, 5.6375e-05, 2.0471e-03,  ..., 4.6987e-04, 9.1348e-05,
         4.3887e-03],
        [8.1401e-04, 5.4299e-05, 1.4655e-04,  ..., 1.2680e-04, 2.7584e-04,
         9.7820e-04]], device='cuda:0')
tensor([ 5289, 12931,  6744,  1889,  8891, 12426, 18293,  7740,  7090,   737,
         3557, 14430, 10911, 16911,  6179,  6593,     3,  3685,  2417,  7936,
         1608, 12762,  2093,  9390, 14598,  8638, 18721, 12621, 16732,  2105,
        17101,   970, 19066,  8943, 14840,  3280, 14341, 17823, 10266, 15639,
         5584, 16445, 18098, 12328,  1673, 14808,  8710, 15231,  1479,  5327,
        13081,   289, 18751,  7479,  4119,  5000, 18878,  2995,  7328,  1425,
         2274, 19566, 13748, 11803,  7019,  2386,  9278, 17079,  4871, 15091,
         7056,  9649,  3191, 14395,  7064, 14885,  6459,  5782, 14693, 16240,
         1542,  8260,   362, 10696, 15122,  8446, 17733,  8491,  1144,  1154,
        13995, 19726,  5870, 15542, 17442, 13262,  3836, 11575, 14413,  3346,
         3094,  4787,  9883, 16552, 17225, 14576, 16234, 15394, 19573, 16467,
         3717, 14414,  1112,  7343,   154,  1852, 12860, 15433,  7661,  6824,
         5161, 17000, 12098,  1200, 19550, 16283,  2416, 16018],
       device='cuda:0')
---------------

/opt/conda/conda-bld/pytorch_1656352660876/work/aten/src/ATen/native/cuda/Loss.cu:271: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1656352660876/work/aten/src/ATen/native/cuda/Loss.cu:271: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [1,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1656352660876/work/aten/src/ATen/native/cuda/Loss.cu:271: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [2,0,0] Assertion `t >= 0 && t < n_classes` failed.
/opt/conda/conda-bld/pytorch_1656352660876/work/aten/src/ATen/native/cuda/Loss.cu:271: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [3,0,0] Assertion `t >= 0 && t < n_classes` failed.

This suggests the function is not working correctly. Why does this happen? How did you obtain the scores reported in the README?
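
As a possible clue, the printed targets go up to 19726, which is close to the number of validation videos (19881) rather than the number of Kinetics action classes, so the labels passed to criterion may be sample indices instead of class ids; that would explain the t >= 0 && t < n_classes assertion. A minimal sanity check (the check_targets helper below is just my own sketch, not part of the repository code) placed right before loss = criterion(logits[ft], target) should confirm this:

import torch

def check_targets(logits, target):
    # Hypothetical helper, not part of the repository code.
    # nn.CrossEntropyLoss / NLLLoss require 0 <= target[i] < logits.size(1)
    # for every element, which is exactly what the CUDA assertion checks.
    n_classes = logits.size(1)
    bad = (target < 0) | (target >= n_classes)
    if bad.any():
        raise ValueError(
            f"{int(bad.sum())} targets fall outside [0, {n_classes}): "
            f"min={int(target.min())}, max={int(target.max())}")

If this check fails, it would point to how the dataloader builds the labels rather than to run_phase itself.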

Thank you in advance.