Couldn't Evaluate the Predictions Generated

Apologies for the long post and my ignorance.

Generating Predictions: I downloaded the audio files using the scripts from the mentioned GitHub directory. After that I generated the predictions using the following command.

python3 --summaries "./Weights/vggsound_avgpool.pth.tar" --pool "avgpool"  --batch_size=1

While generating the predictions I got the following error.

  File "/usr/local/lib/python3.6/dist-packages/scipy/signal/spectral.py", line 1757, in _spectral_helper
    raise ValueError('noverlap must be less than nperseg.')

I got past this error by letting noverlap fall back to its default value (nperseg // 8), but then got the following warning.

UserWarning: nperseg = 256 is greater than input length  = 20, using nperseg = 20
  .format(nperseg, input_length))
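One possible reading of "input length  = 20" is that the .wav files are stereo. The sketch below assumes the data loader repeats short clips with something like np.tile(samples, 10)[:N]; that call, the nperseg=512 / noverlap=353 values, and the random stand-in audio are assumptions made for illustration, not code taken from the repository. With a 2-channel array, tiling widens the last axis to 20, and scipy then computes the spectrogram along that 20-element axis.

```python
import warnings

import numpy as np
from scipy import signal

fs = 16000
n = 16000  # 1 s of audio to keep the sketch fast (the post's clips appear to be 160000 samples)

rng = np.random.default_rng(0)
stereo = rng.uniform(-1.0, 1.0, size=(n, 2))  # (frames, channels), the layout of a 2-channel file

# Assumed loader behaviour: tile to cover short clips, then cut to a fixed length.
# For a 2-D array np.tile repeats along the LAST axis, so the result is (n, 20).
tiled = np.tile(stereo, 10)[:n]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Defaults: nperseg=256 is reduced to the axis length (20), noverlap = 20 // 8 = 2
    freqs, times, spec = signal.spectrogram(tiled, fs)

print(tiled.shape)  # (16000, 20)
print(spec.shape)   # (16000, 11, 1)
print([str(w.message) for w in caught])

# Downmixing to mono first gives a 1-D signal and a normal (freq, time) spectrogram.
mono = stereo.mean(axis=1)
_, _, spec_mono = signal.spectrogram(np.tile(mono, 10)[:n], fs, nperseg=512, noverlap=353)
print(spec_mono.shape)
```

If this is what is happening, an explicit noverlap larger than 20 would also explain the original "noverlap must be less than nperseg" error, since scipy shrinks nperseg to the input length before checking noverlap.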

I also had to change line 108 of test.py as follows.

# Original
aud_o = model(spec.unsqueeze(1).float())
# Changed to
aud_o = model(spec.unsqueeze(1).squeeze(-1).float())

Otherwise it was giving the following error.

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 1, 7, 7], but got 5-dimensional input of size [1, 1, 160000, 11, 1] instead
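To make the shapes in that message easier to follow, here is a small bookkeeping sketch. It uses NumPy as a stand-in for the torch calls (np.expand_dims for unsqueeze, np.squeeze for squeeze) and 1600 rows instead of 160000 to stay light; both substitutions are assumptions for illustration only.

```python
import numpy as np

# Stand-in for one batch as in the error message: (batch, samples, freq_bins, segments)
spec = np.zeros((1, 1600, 11, 1), dtype=np.float32)

with_channel = np.expand_dims(spec, 1)        # like spec.unsqueeze(1)
squeezed = np.squeeze(with_channel, axis=-1)  # like .squeeze(-1)

# A 2-D convolution over a batch expects (batch, channels, height, width).
print(with_channel.shape, with_channel.ndim)  # (1, 1, 1600, 11, 1) 5
print(squeezed.shape, squeezed.ndim)          # (1, 1, 1600, 11) 4
```

The squeeze makes the input 4-dimensional, so the convolution accepts it, but the "image" it sees is samples by 11 bins rather than a frequency by time spectrogram. That may mean the extra dimension is a symptom of a problem earlier in the pipeline rather than something to squeeze away.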

Evaluating: While evaluating the predictions using the script eval.py it showed the following errors.

RuntimeWarning: invalid value encountered in true_divide
  recall = tps / tps[-1]
Traceback (most recent call last):
  File "eval.py", line 71, in <module>
    main()
  File "eval.py", line 64, in main
    mAUC = np.mean([stat['auc'] for stat in stats])
  File "eval.py", line 64, in <listcomp>
    mAUC = np.mean([stat['auc'] for stat in stats])
KeyError: 'auc'
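The RuntimeWarning on recall = tps / tps[-1] is what a 0/0 division looks like, which can happen when a class has no positive examples in the test set. The sketch below reproduces that with toy labels and shows one way to find the classes that can be scored at all. The helper name classes_with_positives and the toy targets are hypothetical; how eval.py actually builds its per-class stats is not shown in this post, so treat this as a guess at the cause.

```python
import numpy as np


def classes_with_positives(targets):
    """Indices of classes that have at least one positive and one negative label."""
    targets = np.asarray(targets)
    positives = targets.sum(axis=0)
    return np.flatnonzero((positives > 0) & (positives < targets.shape[0]))


# Toy one-hot targets: 4 clips, 3 classes; class 2 never occurs in this subset.
targets = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
])

valid = classes_with_positives(targets)
print(valid)  # [0 1]

# For the missing class the cumulative true-positive count stays at zero,
# so tps / tps[-1] is 0 / 0 and every recall value becomes NaN.
tps = np.cumsum(targets[:, 2]).astype(float)
with np.errstate(invalid="ignore"):
    recall = tps / tps[-1]
print(recall)  # [nan nan nan nan]
```

If some classes are missing from a partial download, averaging only over the classes returned by a check like this would avoid the NaN values, at the cost of reporting metrics on fewer classes than the full benchmark.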

I didn't download the whole dataset. I downloaded only part of it (2231 files) and generated my own _mytest.csv from the downloaded files. The audio files were downloaded in .flac format and then converted to .wav.

Could you please tell me what I am doing wrong? I am new to deep learning research, so please pardon my ignorance.

hche11 / VGGSound