axinc-ai / ailia-models

The collection of pre-trained, state-of-the-art AI models for ailia SDK

Torch audio error on crnn_audio_classification #640

Open kyakuno opened 2 years ago

kyakuno commented 2 years ago

It works with --ailia_audio, but an error occurs when torchaudio is used. torchaudio may have had a specification change.

C:\Users\kyakuno\Desktop\ailia-models\audio_processing\crnn_audio_classification>python crnn_audio_classification.py
 INFO utils.py (13) : Start!
 INFO utils.py (163) : env_id: 3
 INFO utils.py (166) : cuDNN-NVIDIA GeForce RTX 3080 (8.6, FP32)
 INFO model_utils.py (67) : ONNX file and Prototxt file are prepared!
 INFO crnn_audio_classification.py (90) : ================================================================================
 INFO crnn_audio_classification.py (91) : input: 24965__www-bonson-ca__bigdogbarking-02.wav
 INFO crnn_audio_classification.py (98) : Start inference...
C:\Users\kyakuno\AppData\Local\Programs\Python\Python39\lib\site-packages\torchaudio\transforms.py:917: UserWarning: torchaudio.transforms.ComplexNorm has been deprecated and will be removed from future release.Please convert the input Tensor to complex type with `torch.view_as_complex` then use `torch.abs` and `torch.angle`. Please refer to https://github.com/pytorch/audio/issues/1337 for more details about torchaudio's plan to migrate to native complex type.
  warnings.warn(
C:\Users\kyakuno\AppData\Local\Programs\Python\Python39\lib\site-packages\torchaudio\transforms.py:936: UserWarning: torchaudio.functional.functional.complex_norm has been deprecated and will be removed from 0.11 release. Please convert the input Tensor to complex type with `torch.view_as_complex` then use `torch.abs`. Please refer to https://github.com/pytorch/audio/issues/1337 for more details about torchaudio's plan to migrate to native complex type.
  return F.complex_norm(complex_tensor, self.power)
Traceback (most recent call last):
  File "C:\Users\kyakuno\Desktop\ailia-models\audio_processing\crnn_audio_classification\crnn_audio_classification.py", line 116, in <module>
    main()
  File "C:\Users\kyakuno\Desktop\ailia-models\audio_processing\crnn_audio_classification\crnn_audio_classification.py", line 107, in main
    label, conf = crnn(data, session)
  File "C:\Users\kyakuno\Desktop\ailia-models\audio_processing\crnn_audio_classification\crnn_audio_classification.py", line 72, in crnn
    xt, lengths = spec.forward(data)
  File "C:\Users\kyakuno\Desktop\ailia-models\audio_processing\crnn_audio_classification\crnn_audio_classification_util.py", line 51, in forward
    x = self.mst.mel_scale(x)
  File "C:\Users\kyakuno\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\kyakuno\AppData\Local\Programs\Python\Python39\lib\site-packages\torchaudio\transforms.py", line 386, in forward
    mel_specgram = torch.matmul(specgram.transpose(-1, -2), self.fb).transpose(-1, -2)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1025x1 and 1025x128)
kyakuno commented 2 years ago

The error is raised inside torchaudio, so it is difficult to work around.
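
To make the RuntimeError in the traceback easier to read, here is a minimal pure-Python sketch of the shape rule behind `torch.matmul(specgram.transpose(-1, -2), self.fb).transpose(-1, -2)`. The helper name `mel_scale_output_shape`, the frame count 431, and the input shape `(1, 1, 1025)` are assumptions for illustration only. The last of these is merely one shape consistent with the `1025x1` in the error message, not something the log confirms. The sketch does not call torchaudio and does not reproduce its exact error text.

```python
def mel_scale_output_shape(specgram_shape, fb_shape):
    """Return the shape produced by
    matmul(specgram.transpose(-1, -2), fb).transpose(-1, -2),
    or raise ValueError if the shapes are incompatible.

    specgram_shape: (..., n_freqs, n_frames)
    fb_shape:       (n_freqs, n_mels)
    """
    if len(specgram_shape) < 2 or len(fb_shape) != 2:
        raise ValueError("expected specgram (..., freq, time) and fb (freq, n_mels)")

    *batch, n_freqs, n_frames = specgram_shape
    fb_freqs, n_mels = fb_shape

    # After transpose(-1, -2) the spectrogram is (..., n_frames, n_freqs);
    # matmul needs its last axis to equal the first axis of fb.
    if n_freqs != fb_freqs:
        raise ValueError(
            f"transposed specgram is (..., {n_frames}, {n_freqs}) but fb is "
            f"({fb_freqs}, {n_mels}); the frequency axis must have size {fb_freqs}"
        )

    # matmul gives (..., n_frames, n_mels); the final transpose swaps them back.
    return tuple(batch) + (n_mels, n_frames)


# Expected layout: (batch, channel, freq, time). 431 frames is a made-up value.
print(mel_scale_output_shape((1, 1, 1025, 431), (1025, 128)))  # (1, 1, 128, 431)

# Hypothetical failing layout: the time axis is missing, so the frequency
# axis (1025) ends up where the time axis is expected.
try:
    mel_scale_output_shape((1, 1, 1025), (1025, 128))
except ValueError as exc:
    print("shape mismatch:", exc)
```

If the input to `mel_scale` really has lost an axis in this way, printing `x.shape` just before line 51 of `crnn_audio_classification_util.py` would confirm it. This is a suggestion for diagnosis, not a verified fix.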