Sankalpsp21 / SoundConductor

2023 Atlas Madness Hackathon Project
https://sound-conductor-4bsnr75vpq-uc.a.run.app/

Create an audio processing neural network #1

Status: Open · opened by tobyloki 1 year ago

derek-williams00 commented 1 year ago

YAMNet Model: YAMNet is an audio event classifier model that uses the MobileNet v1 architecture. It can make independent predictions for each of 521 audio events from the AudioSet ontology. The model accepts a 1-D float32 Tensor or NumPy array of length 15600 containing a 0.975 second waveform represented as mono 16 kHz samples. The model returns a 2-D float32 Tensor of shape (1, 521) containing the predicted scores for each of the 521 classes supported by YAMNet. The column index (0-520) of the scores tensor is mapped to the corresponding AudioSet class name using the YAMNet Class Map.
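As a minimal sketch of the input contract described above, the helper below (a hypothetical function, not part of YAMNet itself) converts an arbitrary waveform into the 1-D float32 array of length 15600 (0.975 s of mono 16 kHz audio) that the model accepts. It uses NumPy only; in practice a proper resampler such as `librosa.resample` should replace the naive nearest-neighbour step.

```python
import numpy as np

SAMPLE_RATE = 16_000   # YAMNet expects mono 16 kHz samples
FRAME_LEN = 15_600     # 0.975 s * 16_000 samples/s

def prepare_waveform(audio: np.ndarray, orig_rate: int) -> np.ndarray:
    """Shape an arbitrary waveform into YAMNet's expected input frame."""
    # Mix stereo (or any multi-channel) audio down to mono.
    if audio.ndim == 2:
        audio = audio.mean(axis=1)
    # Naive nearest-neighbour resampling to 16 kHz (illustrative only;
    # use a real resampler for production audio).
    if orig_rate != SAMPLE_RATE:
        n_out = int(len(audio) * SAMPLE_RATE / orig_rate)
        idx = (np.arange(n_out) * orig_rate // SAMPLE_RATE)
        idx = idx.clip(max=len(audio) - 1)
        audio = audio[idx]
    # Pad with zeros or truncate to exactly one 0.975 s frame.
    frame = np.zeros(FRAME_LEN, dtype=np.float32)
    n = min(len(audio), FRAME_LEN)
    frame[:n] = audio[:n]
    return frame
```

Feeding such a frame to YAMNet yields the (1, 521) scores tensor; `np.argmax` over the last axis gives the column index to look up in the YAMNet Class Map.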

Customization: To classify sounds not included in YAMNet's 521 classes, we can use a technique known as transfer learning to re-train the model to recognize other classes. We will need a set of labelled training audio clips for each new class we wish to recognize. The recommended way to do this is to use TensorFlow Lite Model Maker.
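The core of that transfer-learning recipe is to freeze YAMNet as a feature extractor and train only a small classification head on its 1024-dim embeddings. The sketch below illustrates just that head with NumPy, using random arrays as stand-ins for real embeddings and labels; the class names, data, and dimensions other than the 1024-dim embedding size are illustrative assumptions, and Model Maker or Keras would handle this training loop in practice.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 1024    # YAMNet emits a 1024-dim embedding per frame
NUM_CLASSES = 3     # hypothetical new labels, e.g. whistle / clap / snap

# Stand-ins for embeddings extracted from labelled training clips.
X = rng.normal(size=(300, EMBED_DIM)).astype(np.float32)
y = rng.integers(0, NUM_CLASSES, size=300)
onehot = np.eye(NUM_CLASSES)[y]

# A single softmax layer trained on the frozen embeddings.
W = np.zeros((EMBED_DIM, NUM_CLASSES), dtype=np.float32)
b = np.zeros(NUM_CLASSES, dtype=np.float32)

for _ in range(200):  # plain gradient descent on cross-entropy loss
    logits = X @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = (p - onehot) / len(X)
    W -= 0.5 * (X.T @ grad)
    b -= 0.5 * grad.sum(axis=0)

train_acc = (np.argmax(X @ W + b, axis=1) == y).mean()
```

Because only `W` and `b` are trained, this needs far fewer labelled clips than training an audio model from scratch, which is the point of the transfer-learning approach.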

Links to resources:

- https://www.tensorflow.org/lite/examples/audio_classification/overview
- https://github.com/swittk/react-native-tensorflow-lite
- https://www.tensorflow.org/tutorials/audio/simple_audio
- https://www.tensorflow.org/tutorials/audio/transfer_learning_audio
- https://www.youtube.com/watch?v=ZLIPkmmDJAc
- https://github.com/nicknochnack/DeepAudioClassification/blob/main/AudioClassification.ipynb