JVass / background_sensitivity_of_CNNs

This repository will contain the assignment for the module "Deep Learning for Audio and Music". The objective is to implement two deep neural network architectures for two different kinds of tasks and to try to combine their inferences. I chose to implement a U-Net for spectrogram denoising and a DenseNet for environmental sound classification.

Environmental Sound Classification Setup #2

Closed by JVass 1 year ago

JVass commented 1 year ago

This will be the second section of the assignment: environmental sound classification using a DenseNet.

JVass commented 1 year ago

https://pytorch.org/hub/pytorch_vision_densenet/ says the input image has to be:

  1. Normalized with mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]
  2. Height and Width at least 224

Palanisamy et al. did neither of these: their UrbanSound8K input has H x W = (128, 250) and is not normalized.
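To make the mismatch concrete, here is a minimal, framework-free sketch that checks a spectrogram shape against the two requirements quoted from the hub page. The function names (`min_side_shortfall`, `normalize_pixel`) are hypothetical helpers written for illustration, not part of PyTorch or of this repository, and the sketch assumes pixel values are already scaled to [0, 1].

```python
# Values quoted from the PyTorch Hub DenseNet page.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)
MIN_SIDE = 224


def min_side_shortfall(height, width, min_side=MIN_SIDE):
    """Return how many rows and columns are missing to reach min_side."""
    return max(0, min_side - height), max(0, min_side - width)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply per-channel (value - mean) / std to one RGB pixel in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))


# UrbanSound8K spectrogram shape mentioned above: H x W = (128, 250).
missing_rows, missing_cols = min_side_shortfall(128, 250)
print(missing_rows, missing_cols)  # height is short of 224, width is not

# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(IMAGENET_MEAN))
```

So only the height falls below the recommended minimum; whether padding, resizing, or leaving the input as is works best for this task is an open choice that the sketch does not settle.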

JVass commented 1 year ago

Global params for classification will be based on Palanisamy et al.:

  1. No Early Stopping (I discarded the learning rate scheduler)
  2. EPOCHS = 70
  3. LR = 1e-4
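As a rough sketch of how these global params could be kept in one place, the snippet below stores them in a small config object and runs a fixed number of epochs with no early exit. `ClassifierConfig`, `run_training`, and the `train_one_epoch` callable are hypothetical names introduced here for illustration; the actual training code, optimizer, and loss are not specified in this thread.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ClassifierConfig:
    """Global params from the comment above (field names are assumptions)."""
    epochs: int = 70
    lr: float = 1e-4
    early_stopping: bool = False
    use_lr_scheduler: bool = False


def run_training(cfg: ClassifierConfig,
                 train_one_epoch: Callable[[int, float], float]) -> List[float]:
    """Run every epoch with a constant learning rate and no early stopping."""
    losses = []
    for epoch in range(cfg.epochs):
        # The learning rate stays fixed because no scheduler is used.
        losses.append(train_one_epoch(epoch, cfg.lr))
    return losses


cfg = ClassifierConfig()
# Dummy stand-in for a real epoch function; it returns a made-up loss value.
history = run_training(cfg, lambda epoch, lr: 1.0 / (epoch + 1))
print(len(history))
```

Keeping the flags explicit (`early_stopping`, `use_lr_scheduler`) documents what was deliberately left out, even though this loop never reads them.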

JVass commented 1 year ago

The results are not very promising, but this is as good as it will get for the assignment.