VGGish smoke test failure with embedding mismatch

2017csz0006 commented 4 years ago

I am following the demo Google Colab tutorial provided at https://github.com/tensorflow/models/tree/master/research/audioset/vggish to extract embeddings. I am getting the following error on cell 32:

Traceback (most recent call last):
  File "", line 3, in
    np.testing.assert_allclose(
  File "/home/pratibha/anaconda3/envs/vggish_env/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 1527, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/home/pratibha/anaconda3/envs/vggish_env/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 840, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=0.1, atol=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 18.37231736
Max relative difference: 0.24496423
 x: array([121.96875 , 93.372317])
 y: array([123., 75.])

jvishnuvardhan commented 2 years ago

@2017csz0006 Sorry for the late response. Is this still an issue for you? I ran the Colab and didn't notice any error. Please check the gist here.

Please close the issue if this was already resolved for you. Thanks!

google-ml-butler[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] commented 2 years ago

Closing as stale. Please reopen if you'd like to work on this further.

mbotler commented 2 years ago

@jvishnuvardhan The issue still persists. I just ran the notebook you linked and the AssertionError occurred.

dpwe commented 2 years ago

Can you send the exact error message you're getting now? It looks like the target values have been modified to 122 and 93, which are closer to the 121.96 and 93.37 of your original error report.

This feels like a numerical issue. Can you give us the details of your hardware/platform? My guess is that the system is working, it's just that the smoke test is being too strict. What happens if you continue with the colab?

Thanks,

Dan.

mbotler commented 2 years ago

cell that causes the error:

# Run the test, which also loads all the necessary functions.
from vggish_smoke_test import *

error message:

Testing your install of VGGish

Log Mel Spectrogram example:  [[-4.48303472 -4.2711199  -4.17038671 ... -4.59048271 -4.56833283
  -4.53160164]
 [-4.48303714 -4.27112599 -4.1703935  ... -4.59316649 -4.59210697
  -4.58855613]
 [-4.48303714 -4.27112599 -4.1703935  ... -4.59316649 -4.59210697
  -4.58855613]
 ...
 [-4.48303714 -4.27112599 -4.1703935  ... -4.59316649 -4.59210697
  -4.58855613]
 [-4.48303714 -4.27112599 -4.1703935  ... -4.59316649 -4.59210697
  -4.58855613]
 [-4.48303714 -4.27112599 -4.1703935  ... -4.59316649 -4.59210697
  -4.58855613]]
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer_v1.py:1694: UserWarning: `layer.apply` is deprecated and will be removed in a future version. Please use `layer.__call__` method instead.
  warnings.warn('`layer.apply` is deprecated and '
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/legacy_tf_layers/core.py:332: UserWarning: `tf.layers.flatten` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Flatten` instead.
  warnings.warn('`tf.layers.flatten` is deprecated and '
INFO:tensorflow:Restoring parameters from vggish_model.ckpt
VGGish embedding:  [-0.27491876 -0.18145198  0.0535036  -0.14074636 -0.10010143 -0.49875602
 -0.17573859  0.42225772 -0.8224995  -0.2217744  -0.1122438  -0.66742486
  0.14193921 -0.14426535  0.01138712 -0.08733074 -0.18629983  0.5975964
 -0.34129012 -0.06088498 -0.1677085   0.04011205 -0.25935286 -0.24020532
  0.17768645  0.30280727  0.10527742 -0.44472352  0.12082972 -0.30148435
 -0.55684716  0.50770473  0.20743224  0.8840761   0.9006089  -0.21035735
 -0.03214248  0.13514583 -0.22882228  0.1122238   0.59652936 -0.47610348
  0.22845872  0.1544656   0.1654641   0.7211305   1.2400259   0.5628969
  0.27394098  0.02865019  0.210366   -0.6117542  -0.31886205  0.17764254
 -0.08788419 -0.42890146  0.31507537 -0.15670405  0.33398992  0.12844163
  0.16779727  0.03175645 -0.15631482 -0.42661047 -0.26787913 -0.15854102
  0.40249115 -0.25080627 -0.02530902  0.00664307  0.2965419   0.34862638
 -0.10260306  0.08962867  0.12351537 -0.33453903 -0.25491843  0.5126411
  0.3997065   0.17673504 -0.07992037  0.04661459 -0.20129469 -0.29366192
  0.3723822   0.45837146  0.5399544  -0.01809365 -0.06044082  0.41752416
 -0.19358039 -0.5389073  -0.18006337  0.3857254   0.39503643  0.32292122
 -0.04439406 -0.14262633 -0.45319733 -0.10542554 -0.22397852  0.35027808
 -0.25616506  0.3333591  -0.72753274 -0.2566383   0.35339624 -0.31669623
  0.31042522  0.11925487 -0.0173979  -0.40217176 -0.5142518  -0.27508244
 -0.26930487  0.222587    0.1059244   0.130522   -0.12705472 -0.1923312
  0.00311379  0.20225978 -0.1039684   0.03045627 -0.34081817 -0.22932461
 -0.24878944 -0.12488788]
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-10-0ade3c97aabd> in <module>()
      1 # Run the test, which also loads all the necessary functions.
----> 2 from vggish_smoke_test import *

2 frames
/usr/local/lib/python3.7/dist-packages/numpy/testing/_private/utils.py in assert_array_compare(comparison, x, y, err_msg, verbose, header, precision, equal_nan, equal_inf)
    842                                 verbose=verbose, header=header,
    843                                 names=('x', 'y'), precision=precision)
--> 844             raise AssertionError(msg)
    845     except ValueError:
    846         import traceback

AssertionError: 
Not equal to tolerance rtol=0.1, atol=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 0.03673259
Max relative difference: 1.01349135
 x: array([0.000449, 0.343267], dtype=float32)
 y: array([-0.0333,  0.38  ])

platform: regular Google Colab instance (see link above). The same error occurs when I run python vggish_smoke_test.py on my Azure compute instance (Standard_NC6).

When I continue to run the notebook, after some minor fixes (missing imports and a variable declaration) it runs without throwing errors.

plakal commented 2 years ago

From https://github.com/tensorflow/models/issues/9248#issuecomment-1179522997, @jvishnuvardhan ran the smoke test successfully on 9 July 2022 (just a month ago) as per the output in the gist https://colab.sandbox.google.com/gist/jvishnuvardhan/27eba9cbc05fd88ac371f366b553c70b/vggish-audio-embedding-colab.ipynb

I tried re-running it and got an error https://colab.sandbox.google.com/drive/1CT1-VfjWI6oAca2eGjqD7T1xj0QmRFoP

The difference seems to be in the log mel spectrogram computation. It's possible there have been updates to numpy, scipy, etc. in the last month that are affecting the precise values of the features. We could try relaxing the tolerance in the smoke test. Currently the differences in (non-post-processed) embedding summaries are actual 0.000449 vs expected -0.0333 for mean (~101% relative, matching the reported max relative difference of 1.0135) and actual 0.343267 vs expected 0.38 for stddev (~10% relative). The relative error for stddev does seem high.
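
To see which of the two summary statistics trips the assertion, the comparison can be reproduced by hand. The sketch below assumes NumPy's documented criterion for assert_allclose, |actual - desired| <= atol + rtol * |desired|, and reimplements it in plain Python. The helper name allclose_report is made up for illustration, and the inputs are the rounded values printed in the error message, so the results are approximate.

```python
def allclose_report(actual, desired, rtol=0.1, atol=0.0):
    """Per-element check mirroring the assert_allclose criterion (illustrative helper)."""
    rows = []
    for a, d in zip(actual, desired):
        abs_diff = abs(a - d)
        rel_diff = abs_diff / abs(d)  # relative to the expected value
        passes = abs_diff <= atol + rtol * abs(d)
        rows.append((abs_diff, rel_diff, passes))
    return rows


# Rounded values copied from the error message: [mean, stddev].
actual = [0.000449, 0.343267]
expected = [-0.0333, 0.38]

report = allclose_report(actual, expected)
for name, (abs_diff, rel_diff, passes) in zip(("mean", "stddev"), report):
    print(f"{name}: abs diff {abs_diff:.6f}, rel diff {rel_diff:.4f}, within tolerance: {passes}")
```

With these inputs the mean is the element that fails: its expected value is close to zero, so even a small absolute drift becomes a large relative one. The stddev sits just inside the 10% tolerance.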

Darius-H commented 1 year ago

I got the same problem, and my result is the same as the one @plakal reported above.

HemalathaRamanujam2022 commented 1 year ago

Hi,

I also got the same error. Is there any solution to this?

Paulkie99 commented 1 year ago

Same problem here.

tclewis29 commented 1 year ago

I am getting the same issue running the example Google colab.

chevalierNoir commented 1 year ago

Same issue here.

tclewis29 commented 1 year ago

It would be great to get some guidance on this. It seems others are having the same issue. Could it be down to compatibility issues or the model checkpoint file?



Log Mel Spectrogram example:  [[-4.48303478 -4.27111968 -4.17038687 ... -4.59048818 -4.56833732
  -4.53160421]
 [-4.4830372  -4.27112577 -4.17039366 ... -4.59317326 -4.59211412
  -4.58856604]
 [-4.4830372  -4.27112577 -4.17039366 ... -4.59317326 -4.59211412
  -4.58856604]
 ...
 [-4.4830372  -4.27112577 -4.17039366 ... -4.59317326 -4.59211412
  -4.58856604]
 [-4.4830372  -4.27112577 -4.17039366 ... -4.59317326 -4.59211412
  -4.58856604]
 [-4.4830372  -4.27112577 -4.17039366 ... -4.59317326 -4.59211412
  -4.58856604]]
/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/base_layer_v1.py:1694: UserWarning: `layer.apply` is deprecated and will be removed in a future version. Please use `layer.__call__` method instead.
  warnings.warn('`layer.apply` is deprecated and '
/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/legacy_tf_layers/core.py:332: UserWarning: `tf.layers.flatten` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Flatten` instead.
  warnings.warn('`tf.layers.flatten` is deprecated and '
VGGish embedding:  [-0.2749162  -0.18145064  0.05350201 -0.14074847 -0.10010482 -0.4987534
 -0.17573784  0.4222586  -0.8224983  -0.22177017 -0.11224633 -0.6674257
  0.14194113 -0.14426494  0.01138521 -0.08732964 -0.18629846  0.59759563
 -0.34129012 -0.06088188 -0.1677068   0.0401158  -0.25934938 -0.24020132
  0.17768957  0.30280647  0.10528167 -0.44472587  0.12083055 -0.30148232
 -0.55684674  0.5077039   0.2074306   0.88407654  0.9006107  -0.21035789
 -0.03214258  0.13514815 -0.22882096  0.11222668  0.5965284  -0.47610325
  0.22845702  0.15446544  0.16546398  0.72113     1.2400258   0.5628949
  0.27394104  0.028651    0.21036614 -0.6117511  -0.31885916  0.17764187
 -0.08788496 -0.42890045  0.3150736  -0.1567034   0.33398592  0.12844217
  0.16779633  0.03175677 -0.15631175 -0.42661184 -0.26788002 -0.15854067
  0.4024899  -0.25080925 -0.02531082  0.00664489  0.29654282  0.3486275
 -0.10260604  0.08962876  0.12351848 -0.3345395  -0.25491843  0.51263684
  0.3997049   0.17673685 -0.07992119  0.04661596 -0.2012937  -0.29366407
  0.3723788   0.45836967  0.53995204 -0.01809558 -0.06044187  0.41752285
 -0.19357675 -0.53890526 -0.18006015  0.38572246  0.395039    0.32292023
 -0.044392   -0.14262386 -0.45319805 -0.10542658 -0.22397754  0.35027942
 -0.25616768  0.33335587 -0.72753346 -0.25663686  0.35339847 -0.3166949
  0.31042632  0.11925843 -0.0173996  -0.40216926 -0.5142505  -0.2750821
 -0.26930374  0.22258827  0.10592782  0.13052128 -0.12705311 -0.1923324
  0.00311245  0.20226285 -0.10396719  0.03045544 -0.3408158  -0.22932032
 -0.24878775 -0.12488335]
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-10-0ade3c97aabd> in <module>
      1 # Run the test, which also loads all the necessary functions.
----> 2 from vggish_smoke_test import *

2 frames
/content/vggish_smoke_test.py in <module>
     79   expected_embedding_mean = -0.0333
     80   expected_embedding_std = 0.380
---> 81   np.testing.assert_allclose(
     82       [np.mean(embedding_batch), np.std(embedding_batch)],
     83       [expected_embedding_mean, expected_embedding_std],

/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py in assert_allclose(actual, desired, rtol, atol, equal_nan, err_msg, verbose)
   1528     actual, desired = np.asanyarray(actual), np.asanyarray(desired)
   1529     header = f'Not equal to tolerance rtol={rtol:g}, atol={atol:g}'
-> 1530     assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
   1531                          verbose=verbose, header=header, equal_nan=equal_nan)
   1532 

/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py in assert_array_compare(comparison, x, y, err_msg, verbose, header, precision, equal_nan, equal_inf)
    842                                 verbose=verbose, header=header,
    843                                 names=('x', 'y'), precision=precision)
--> 844             raise AssertionError(msg)
    845     except ValueError:
    846         import traceback

AssertionError: 
Not equal to tolerance rtol=0.1, atol=0

Mismatched elements: 1 / 2 (50%)
Max absolute difference: 0.03673304
Max relative difference: 1.01350297
 x: array([0.00045 , 0.343267], dtype=float32)
 y: array([-0.0333,  0.38  ])

JakeNewmanUEA commented 1 year ago

Same here. Even knowing which versions of numpy etc. don't give this error would be useful.

youthHan commented 1 year ago

Yes, the versions of the installed packages seem to be the cause. I encountered the same error and reinstalled the packages following the gist from @jvishnuvardhan. Finally, the magic message!

Looks Good To Me!

The install script: pip install numpy==1.21.6 resampy==0.2.2 tensorflow==2.8.2 tf_slim==1.1.0 six soundfile
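
Before re-running the smoke test, it can help to confirm that the environment really has those versions. This is a minimal sketch using the standard library's importlib.metadata (Python 3.8+). The PINNED table just repeats the pins from the install line above, and the check_pins helper is a made-up name. Distribution-name matching (for example tf_slim vs tf-slim) may behave differently across Python versions, so treat a "not installed" result with some care.

```python
from importlib import metadata

# Pins copied from the install line above.
PINNED = {
    "numpy": "1.21.6",
    "resampy": "0.2.2",
    "tensorflow": "2.8.2",
    "tf_slim": "1.1.0",
}


def check_pins(pinned, get_version=metadata.version):
    """Return {name: (wanted, found)} for every package that does not match its pin."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed (or its name did not resolve)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in check_pins(PINNED).items():
    print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

An empty result means every pinned package matches. The get_version parameter exists only so the helper can be exercised without touching the real environment.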

JakeNewmanUEA commented 1 year ago

I can confirm that the current MATLAB implementation of VGGish produces log mel spectrogram output identical to my Python version, which is different from the version in the original Colab tutorial. It seems most likely that something was fixed in one of the Python dependencies that @youthHan has highlighted, and this accounts for the differences from the expected smoke test output.

plakal commented 1 year ago

Apologies for the lack of updates on this issue!

We have not changed the model in a long time and the model relies on TF 1, which has also only seen maintenance updates recently, so I don't think this is an actual issue with the model per se. I expect that the problem is due to our input pipeline, which runs outside of TF using SciPy, Resampy and NumPy to read a wav file, resample, and compute the log mel spectrogram. Some combination of updates to these frameworks has made the computed features drift in a way that makes the expected mean/std-dev of the embeddings in the test stale. But even if the test fails, it is just a sanity check and not an indicator that the model is broken. You should still be able to use the model for your own tasks.

@youthHan thank you for identifying the versions of the dependencies that make the test pass! I will update the test and the notebook and docs to mention this issue.

When I get some time, I might try a bisection search to find which version of our dependencies caused the problem. But for the time being, please ignore the issue or use the exact versions listed above.
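
For anyone who wants to try such a bisection themselves, here is a rough sketch of the idea. It assumes the releases of one dependency are listed oldest to newest, that the oldest passes the smoke test, that the newest fails, and that the failure persists once introduced. The function name, the release labels and the stand-in predicate are all hypothetical; in practice test_passes would install the given version and run the smoke test.

```python
def first_bad_version(versions, test_passes):
    """Return the first version for which test_passes(version) is False.

    Assumes versions are sorted oldest to newest, the oldest passes, the
    newest fails, and the failure persists once introduced.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] passes, versions[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if test_passes(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Hypothetical release labels and a stand-in predicate, not real package data.
releases = ["r1", "r2", "r3", "r4", "r5", "r6"]
broken_from = releases.index("r4")
culprit = first_bad_version(releases, lambda v: releases.index(v) < broken_from)
print(culprit)
```

Each step halves the candidate range, so only a handful of install-and-test cycles are needed even for a long release history. With several dependencies changing at once, the search would have to be repeated per package while holding the others fixed.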

lostanlen commented 1 year ago

Same issue here. It would be nice to run VGGish with the latest NumPy without risk.

plakal commented 1 year ago

Some good news. After experimenting with several combinations of versions of dependencies, I narrowed down the problem to a change in our resampling library resampy from 0.2.2 to 0.3.0. Apparently the small changes to the default resampling filters in https://github.com/bmcfee/resampy/pull/98 changed the waveform just enough to make the VGGish embedding output drift more than our smoke test tolerance of 10% difference in mean and stddev.

I have prepared a fix to the smoke test that removes the resampling altogether, and verified that the feature calculation and model forward pass now produce unchanged outputs when I downgrade or upgrade versions of the dependencies. And to repeat what I said earlier, the smoke test is a quick and dirty check that the installation works and that we can load and run the model. VGGish itself is not affected in any way. We expect that VGGish embeddings, when used in a calibrated classifier, will continue to work as expected.
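
One way to take the resampler out of a test like this is to synthesize the test signal directly at the sample rate the model expects, so that no resampling step runs at all. The sketch below is not the actual change to vggish_smoke_test.py. The 16 kHz rate, the 1 kHz tone, the 3-second duration and the helper name sine_wave are all assumptions for illustration.

```python
import math


def sine_wave(freq_hz, duration_s, sample_rate_hz):
    """Return a list of samples of a unit-amplitude sine tone."""
    num_samples = int(duration_s * sample_rate_hz)
    return [
        math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(num_samples)
    ]


# Assumed target rate: generating at this rate means the input never
# passes through a resampling library.
TARGET_RATE_HZ = 16000
tone = sine_wave(freq_hz=1000.0, duration_s=3.0, sample_rate_hz=TARGET_RATE_HZ)
print(len(tone))
```

Because the samples come from a closed-form expression, the test input depends only on the math library and not on any filter design choices, which is what makes the expected values stable across dependency upgrades.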

For the curious, @dpwe has done some further analysis of the differences in resampled waveform in a Colab notebook https://colab.research.google.com/drive/1Xowi8D8quNSxo0LrPAegNDks-BKEyj0B
