Shubhabrata08 / AudioClassificationTFLite

This project aims to classify UrbanSound8K audio and deploy the model to a microcontroller

Continuous audio recording and MFCC generation #1

Open Shubhabrata08 opened 8 months ago

Shubhabrata08 commented 8 months ago

Details:

Create a Python script that continuously records live audio and generates an MFCC for every interval of t seconds.

Tools and Libraries recommended:

Librosa and Python

sa-paul commented 8 months ago

1. Completed the 5-day deep learning course by Krish Naik:

Deep Learning LINK

Problem faced: 1. Did not understand the code.

2. Learning MFCC Generation:

Completed:

  1. Created a GitHub branch named sayan.
  2. Learned audio recording in Python.
  3. Wrote the required code with help from ChatGPT and some YouTube videos.
  4. Pushed the code to the sayan branch.

Problem faced: 1. The sounddevice and soundfile libraries do not work on either Ubuntu 22.04 (local) or Google Colab.

Date: 12-01-2024 to 23-01-2024

sa-paul commented 8 months ago

Link to github branch sayan: sayan

Shubhabrata08 commented 8 months ago

Create a draft PR for this, @sa-paul. Also, add snippets of the errors if possible, and add screenshots of the outputs to the PR description.

sa-paul commented 8 months ago

I have opened a draft pull request: PR #4

sa-paul commented 8 months ago

Task Assigned:

Run mfccGen.py on a local machine that has a working audio driver for sound recording.

Tried running the initial version of the code (without the PyAudio library):


import sounddevice as sd
import soundfile as sf
import numpy as np
import librosa.display
import matplotlib.pyplot as plt

def record_audio(sample_rate, duration):
    audio_data = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, dtype='int16')
    sd.wait()
    return audio_data.flatten()

def extract_mfcc(audio_data, sample_rate, n_mfcc=13, hop_length=512):
    mfccs = librosa.feature.mfcc(y=audio_data, sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)
    return mfccs

# Record a short audio clip (adjust sample_rate and duration as needed)
sample_rate = 44100
duration = 5
audio_data = record_audio(sample_rate, duration)

# Extract MFCC features
mfccs = extract_mfcc(audio_data, sample_rate)

# Display MFCC features
plt.figure(figsize=(10, 4))
librosa.display.specshow(mfccs, x_axis='time')
plt.colorbar()
plt.title('MFCC')
plt.show()

Installed the required libraries on the local system (Mac M1):

/usr/bin/python3 /Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen.py
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % /usr/bin/python3 /Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen.py
Traceback (most recent call last):
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen.py", line 2, in <module>
    import sounddevice as sd
ModuleNotFoundError: No module named 'sounddevice'
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % pip --version
zsh: command not found: pip
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python --version
zsh: command not found: python
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 --version
Python 3.9.6
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 mfccGen.py 
Traceback (most recent call last):
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen.py", line 2, in <module>
    import sounddevice as sd
ModuleNotFoundError: No module named 'sounddevice'
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % pip install sounddeivce 
zsh: command not found: pip
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python get-pip.py
zsh: command not found: python
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 get-pip.py
/Library/Developer/CommandLineTools/usr/bin/python3: can't open file '/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/get-pip.py': [Errno 2] No such file or directory
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % curl -sSL https://bootstrap.pypa.io/get-pip.py -o get-pip.py
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 get-pip.py 
Defaulting to user installation because normal site-packages is not writeable
Collecting pip
  Downloading pip-24.0-py3-none-any.whl.metadata (3.6 kB)
Downloading pip-24.0-py3-none-any.whl (2.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 1.9 MB/s eta 0:00:00
Installing collected packages: pip
  WARNING: The scripts pip, pip3 and pip3.9 are installed in '/Users/sayanpaul/Library/Python/3.9/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-24.0

[notice] A new release of pip is available: 21.2.4 -> 24.0
[notice] To update, run: /Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % pip
zsh: command not found: pip
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % pip --version
zsh: command not found: pip
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 -m pip install
Defaulting to user installation because normal site-packages is not writeable
ERROR: You must give at least one requirement to install (see "pip help install")
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 -m pip install sounddevice
Defaulting to user installation because normal site-packages is not writeable
Collecting sounddevice
  Downloading sounddevice-0.4.6-py3-none-macosx_10_6_x86_64.macosx_10_6_universal2.whl (107 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 108.0/108.0 kB 1.7 MB/s eta 0:00:00
Collecting CFFI>=1.0 (from sounddevice)
  Downloading cffi-1.16.0-cp39-cp39-macosx_11_0_arm64.whl.metadata (1.5 kB)
Collecting pycparser (from CFFI>=1.0->sounddevice)
  Downloading pycparser-2.21-py2.py3-none-any.whl (118 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 118.7/118.7 kB 5.8 MB/s eta 0:00:00
Downloading cffi-1.16.0-cp39-cp39-macosx_11_0_arm64.whl (176 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 176.8/176.8 kB 11.8 MB/s eta 0:00:00
Installing collected packages: pycparser, CFFI, sounddevice
Successfully installed CFFI-1.16.0 pycparser-2.21 sounddevice-0.4.6
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % pip --version
zsh: command not found: pip
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 -m pip --version
pip 24.0 from /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/pip (python 3.9)
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 -m pip install soundfile
Defaulting to user installation because normal site-packages is not writeable
Collecting soundfile
  Downloading soundfile-0.12.1-py2.py3-none-macosx_11_0_arm64.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 2.2 MB/s eta 0:00:00
Requirement already satisfied: cffi>=1.0 in /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages (from soundfile) (1.16.0)
Requirement already satisfied: pycparser in /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages (from cffi>=1.0->soundfile) (2.21)
Installing collected packages: soundfile
Successfully installed soundfile-0.12.1
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 -m pip install numpy
Defaulting to user installation because normal site-packages is not writeable
Collecting numpy
  Downloading numpy-1.26.4-cp39-cp39-macosx_11_0_arm64.whl.metadata (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.1/61.1 kB 1.0 MB/s eta 0:00:00
Downloading numpy-1.26.4-cp39-cp39-macosx_11_0_arm64.whl (14.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.0/14.0 MB 1.8 MB/s eta 0:00:00
Installing collected packages: numpy
  WARNING: The script f2py is installed in '/Users/sayanpaul/Library/Python/3.9/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed numpy-1.26.4
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 -m pip install matplotlib
Defaulting to user installation because normal site-packages is not writeable
Collecting matplotlib
  Downloading matplotlib-3.8.2-cp39-cp39-macosx_11_0_arm64.whl.metadata (5.8 kB)
Collecting contourpy>=1.0.1 (from matplotlib)
  Downloading contourpy-1.2.0-cp39-cp39-macosx_11_0_arm64.whl.metadata (5.8 kB)
Collecting cycler>=0.10 (from matplotlib)
  Downloading cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib)
  Downloading fonttools-4.48.1-cp39-cp39-macosx_10_9_universal2.whl.metadata (158 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 158.9/158.9 kB 2.2 MB/s eta 0:00:00
Collecting kiwisolver>=1.3.1 (from matplotlib)
  Downloading kiwisolver-1.4.5-cp39-cp39-macosx_11_0_arm64.whl.metadata (6.4 kB)
Requirement already satisfied: numpy<2,>=1.21 in /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages (from matplotlib) (1.26.4)
Collecting packaging>=20.0 (from matplotlib)
  Downloading packaging-23.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pillow>=8 (from matplotlib)
  Downloading pillow-10.2.0-cp39-cp39-macosx_11_0_arm64.whl.metadata (9.7 kB)
Collecting pyparsing>=2.3.1 (from matplotlib)
  Downloading pyparsing-3.1.1-py3-none-any.whl.metadata (5.1 kB)
Collecting python-dateutil>=2.7 (from matplotlib)
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 247.7/247.7 kB 5.2 MB/s eta 0:00:00
Collecting importlib-resources>=3.2.0 (from matplotlib)
  Downloading importlib_resources-6.1.1-py3-none-any.whl.metadata (4.1 kB)
Collecting zipp>=3.1.0 (from importlib-resources>=3.2.0->matplotlib)
  Downloading zipp-3.17.0-py3-none-any.whl.metadata (3.7 kB)
Requirement already satisfied: six>=1.5 in /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/site-packages (from python-dateutil>=2.7->matplotlib) (1.15.0)
Downloading matplotlib-3.8.2-cp39-cp39-macosx_11_0_arm64.whl (7.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.5/7.5 MB 1.8 MB/s eta 0:00:00
Downloading contourpy-1.2.0-cp39-cp39-macosx_11_0_arm64.whl (242 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 242.4/242.4 kB 2.1 MB/s eta 0:00:00
Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
Downloading fonttools-4.48.1-cp39-cp39-macosx_10_9_universal2.whl (2.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.8/2.8 MB 1.9 MB/s eta 0:00:00
Downloading importlib_resources-6.1.1-py3-none-any.whl (33 kB)
Downloading kiwisolver-1.4.5-cp39-cp39-macosx_11_0_arm64.whl (66 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.2/66.2 kB 1.1 MB/s eta 0:00:00
Downloading packaging-23.2-py3-none-any.whl (53 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.0/53.0 kB 1.5 MB/s eta 0:00:00
Downloading pillow-10.2.0-cp39-cp39-macosx_11_0_arm64.whl (3.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 1.9 MB/s eta 0:00:00
Downloading pyparsing-3.1.1-py3-none-any.whl (103 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.1/103.1 kB 1.5 MB/s eta 0:00:00
Downloading zipp-3.17.0-py3-none-any.whl (7.4 kB)
Installing collected packages: zipp, python-dateutil, pyparsing, pillow, packaging, kiwisolver, fonttools, cycler, contourpy, importlib-resources, matplotlib
  WARNING: The scripts fonttools, pyftmerge, pyftsubset and ttx are installed in '/Users/sayanpaul/Library/Python/3.9/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed contourpy-1.2.0 cycler-0.12.1 fonttools-4.48.1 importlib-resources-6.1.1 kiwisolver-1.4.5 matplotlib-3.8.2 packaging-23.2 pillow-10.2.0 pyparsing-3.1.1 python-dateutil-2.8.2 zipp-3.17.0
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 -m pip install librosa   
Defaulting to user installation because normal site-packages is not writeable
Collecting librosa
  Downloading librosa-0.10.1-py3-none-any.whl.metadata (8.3 kB)
Collecting audioread>=2.1.9 (from librosa)
  Downloading audioread-3.0.1-py3-none-any.whl.metadata (8.4 kB)
Requirement already satisfied: numpy!=1.22.0,!=1.22.1,!=1.22.2,>=1.20.3 in /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages (from librosa) (1.26.4)
Collecting scipy>=1.2.0 (from librosa)
  Downloading scipy-1.12.0-cp39-cp39-macosx_12_0_arm64.whl.metadata (60 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.4/60.4 kB 1.1 MB/s eta 0:00:00
Collecting scikit-learn>=0.20.0 (from librosa)
  Downloading scikit_learn-1.4.0-1-cp39-cp39-macosx_12_0_arm64.whl.metadata (11 kB)
Collecting joblib>=0.14 (from librosa)
  Downloading joblib-1.3.2-py3-none-any.whl.metadata (5.4 kB)
Collecting decorator>=4.3.0 (from librosa)
  Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting numba>=0.51.0 (from librosa)
  Downloading numba-0.59.0-cp39-cp39-macosx_11_0_arm64.whl.metadata (2.7 kB)
Requirement already satisfied: soundfile>=0.12.1 in /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages (from librosa) (0.12.1)
Collecting pooch>=1.0 (from librosa)
  Downloading pooch-1.8.0-py3-none-any.whl.metadata (9.9 kB)
Collecting soxr>=0.3.2 (from librosa)
  Downloading soxr-0.3.7-cp39-cp39-macosx_11_0_arm64.whl.metadata (5.5 kB)
Collecting typing-extensions>=4.1.1 (from librosa)
  Downloading typing_extensions-4.9.0-py3-none-any.whl.metadata (3.0 kB)
Collecting lazy-loader>=0.1 (from librosa)
  Downloading lazy_loader-0.3-py3-none-any.whl.metadata (4.3 kB)
Collecting msgpack>=1.0 (from librosa)
  Downloading msgpack-1.0.7-cp39-cp39-macosx_11_0_arm64.whl.metadata (9.1 kB)
Collecting llvmlite<0.43,>=0.42.0dev0 (from numba>=0.51.0->librosa)
  Downloading llvmlite-0.42.0-cp39-cp39-macosx_11_0_arm64.whl.metadata (4.8 kB)
Collecting platformdirs>=2.5.0 (from pooch>=1.0->librosa)
  Downloading platformdirs-4.2.0-py3-none-any.whl.metadata (11 kB)
Requirement already satisfied: packaging>=20.0 in /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages (from pooch>=1.0->librosa) (23.2)
Collecting requests>=2.19.0 (from pooch>=1.0->librosa)
  Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting threadpoolctl>=2.0.0 (from scikit-learn>=0.20.0->librosa)
  Downloading threadpoolctl-3.2.0-py3-none-any.whl.metadata (10.0 kB)
Requirement already satisfied: cffi>=1.0 in /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages (from soundfile>=0.12.1->librosa) (1.16.0)
Requirement already satisfied: pycparser in /Users/sayanpaul/Library/Python/3.9/lib/python/site-packages (from cffi>=1.0->soundfile>=0.12.1->librosa) (2.21)
Collecting charset-normalizer<4,>=2 (from requests>=2.19.0->pooch>=1.0->librosa)
  Downloading charset_normalizer-3.3.2-cp39-cp39-macosx_11_0_arm64.whl.metadata (33 kB)
Collecting idna<4,>=2.5 (from requests>=2.19.0->pooch>=1.0->librosa)
  Downloading idna-3.6-py3-none-any.whl.metadata (9.9 kB)
Collecting urllib3<3,>=1.21.1 (from requests>=2.19.0->pooch>=1.0->librosa)
  Downloading urllib3-2.2.0-py3-none-any.whl.metadata (6.4 kB)
Collecting certifi>=2017.4.17 (from requests>=2.19.0->pooch>=1.0->librosa)
  Downloading certifi-2024.2.2-py3-none-any.whl.metadata (2.2 kB)
Downloading librosa-0.10.1-py3-none-any.whl (253 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 253.7/253.7 kB 4.3 MB/s eta 0:00:00
Downloading audioread-3.0.1-py3-none-any.whl (23 kB)
Downloading joblib-1.3.2-py3-none-any.whl (302 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 302.2/302.2 kB 1.2 MB/s eta 0:00:00
Downloading lazy_loader-0.3-py3-none-any.whl (9.1 kB)
Downloading msgpack-1.0.7-cp39-cp39-macosx_11_0_arm64.whl (232 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 232.1/232.1 kB 9.3 MB/s eta 0:00:00
Downloading numba-0.59.0-cp39-cp39-macosx_11_0_arm64.whl (2.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.6/2.6 MB 2.0 MB/s eta 0:00:00
Downloading pooch-1.8.0-py3-none-any.whl (62 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.7/62.7 kB 1.8 MB/s eta 0:00:00
Downloading scikit_learn-1.4.0-1-cp39-cp39-macosx_12_0_arm64.whl (10.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.7/10.7 MB 1.8 MB/s eta 0:00:00
Downloading scipy-1.12.0-cp39-cp39-macosx_12_0_arm64.whl (31.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 31.4/31.4 MB 1.8 MB/s eta 0:00:00
Downloading soxr-0.3.7-cp39-cp39-macosx_11_0_arm64.whl (390 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 390.1/390.1 kB 1.6 MB/s eta 0:00:00
Downloading typing_extensions-4.9.0-py3-none-any.whl (32 kB)
Downloading llvmlite-0.42.0-cp39-cp39-macosx_11_0_arm64.whl (28.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 28.8/28.8 MB 1.8 MB/s eta 0:00:00
Downloading platformdirs-4.2.0-py3-none-any.whl (17 kB)
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 1.7 MB/s eta 0:00:00
Downloading threadpoolctl-3.2.0-py3-none-any.whl (15 kB)
Downloading certifi-2024.2.2-py3-none-any.whl (163 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 1.7 MB/s eta 0:00:00
Downloading charset_normalizer-3.3.2-cp39-cp39-macosx_11_0_arm64.whl (120 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 120.4/120.4 kB 1.8 MB/s eta 0:00:00
Downloading idna-3.6-py3-none-any.whl (61 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.6/61.6 kB 2.2 MB/s eta 0:00:00
Downloading urllib3-2.2.0-py3-none-any.whl (120 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 120.9/120.9 kB 1.7 MB/s eta 0:00:00
Installing collected packages: urllib3, typing-extensions, threadpoolctl, soxr, scipy, platformdirs, msgpack, llvmlite, lazy-loader, joblib, idna, decorator, charset-normalizer, certifi, audioread, scikit-learn, requests, numba, pooch, librosa
  WARNING: The script normalizer is installed in '/Users/sayanpaul/Library/Python/3.9/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed audioread-3.0.1 certifi-2024.2.2 charset-normalizer-3.3.2 decorator-5.1.1 idna-3.6 joblib-1.3.2 lazy-loader-0.3 librosa-0.10.1 llvmlite-0.42.0 msgpack-1.0.7 numba-0.59.0 platformdirs-4.2.0 pooch-1.8.0 requests-2.31.0 scikit-learn-1.4.0 scipy-1.12.0 soxr-0.3.7 threadpoolctl-3.2.0 typing-extensions-4.9.0 urllib3-2.2.0
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % 
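
In the log above, the bare `pip` command is never found because its install directory is not on PATH, while `python3 -m pip` works every time. A minimal sketch of a pre-flight check that reports which interpreter is running and which of the script's dependencies it cannot import. The helper name `missing_modules` and the `REQUIRED` list are hypothetical, chosen only for illustration:

```python
import importlib.util
import sys

# Packages installed one by one in the log above.
REQUIRED = ["sounddevice", "soundfile", "numpy", "matplotlib", "librosa"]


def missing_modules(names):
    """Return the names that the current interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
print("Interpreter:", sys.executable)
if missing:
    # Installing through the same interpreter avoids the PATH problem with `pip`.
    print("Install with:", sys.executable, "-m pip install", " ".join(missing))
else:
    print("All required modules are importable.")
```

Running this with the same `python3` that runs mfccGen.py shows whether the packages landed in the interpreter that is actually being used.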

Errors when running the code on the same machine:

sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 mfccGen.py 
Traceback (most recent call last):
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen.py", line 23, in <module>
    mfccs = extract_mfcc(audio_data, sample_rate)
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen.py", line 14, in extract_mfcc
    mfccs = librosa.feature.mfcc(y=audio_data, sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/feature/spectral.py", line 1989, in mfcc
    S = power_to_db(melspectrogram(y=y, sr=sr, **kwargs))
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/feature/spectral.py", line 2130, in melspectrogram
    S, n_fft = _spectrogram(
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/core/spectrum.py", line 2822, in _spectrogram
    stft(
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/core/spectrum.py", line 230, in stft
    util.valid_audio(y, mono=False)
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/util/utils.py", line 298, in valid_audio
    raise ParameterError("Audio data must be floating-point")
librosa.util.exceptions.ParameterError: Audio data must be floating-point
^C
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 mfccGen.py
Traceback (most recent call last):
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen.py", line 23, in <module>
    mfccs = extract_mfcc(audio_data, sample_rate)
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen.py", line 14, in extract_mfcc
    mfccs = librosa.feature.mfcc(y=audio_data, sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/feature/spectral.py", line 1989, in mfcc
    S = power_to_db(melspectrogram(y=y, sr=sr, **kwargs))
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/feature/spectral.py", line 2130, in melspectrogram
    S, n_fft = _spectrogram(
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/core/spectrum.py", line 2822, in _spectrogram
    stft(
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/core/spectrum.py", line 230, in stft
    util.valid_audio(y, mono=False)
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/util/utils.py", line 298, in valid_audio
    raise ParameterError("Audio data must be floating-point")
librosa.util.exceptions.ParameterError: Audio data must be floating-point
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % 
sa-paul commented 8 months ago

Second version of the code:

import sounddevice as sd
import numpy as np
import librosa
import time

def record_audio(sample_rate, duration):
    audio_data = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, dtype='int16')
    sd.wait()
    return audio_data.flatten()

def extract_mfcc(audio_data, sample_rate, n_mfcc=13, hop_length=512):
    mfccs = librosa.feature.mfcc(y=audio_data, sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)
    return mfccs

def main():
    sample_rate = 44100  # Adjust according to your requirements
    duration = 5  # Adjust the duration of each recording
    interval = 2  # Adjust the interval for MFCC generation

    try:
        while True:
            print("Recording...")
            audio_data = record_audio(sample_rate, duration)
            print("Audio recorded.")

            print("Generating MFCC...")
            mfccs = extract_mfcc(audio_data, sample_rate)
            print("MFCC generated.")

            # Do something with the MFCC data, e.g., save to a file, analyze, etc.

            time.sleep(interval)

    except KeyboardInterrupt:
        print("\nRecording stopped.")

if __name__ == "__main__":
    main()
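
Note that the loop above records for `duration` seconds and then sleeps for `interval` seconds, so sound arriving during MFCC generation and the sleep is not captured. One possible way to get gap-free t-second windows is to collect small audio blocks as they arrive and cut them into fixed-length windows. The sketch below shows only the buffering step with simulated blocks; the class name `WindowBuffer` is hypothetical, and in real use the blocks would presumably come from a streaming source such as the callback of `sounddevice.InputStream`:

```python
import numpy as np


class WindowBuffer:
    """Collects incoming audio blocks and returns fixed-length windows."""

    def __init__(self, window_samples):
        if window_samples <= 0:
            raise ValueError("window_samples must be positive")
        self.window_samples = window_samples
        self._pending = np.zeros(0, dtype=np.float32)

    def push(self, block):
        """Append one block; return the complete windows, oldest first."""
        # Flatten so that mono blocks shaped (frames, 1) are accepted too.
        block = np.asarray(block, dtype=np.float32).reshape(-1)
        self._pending = np.concatenate([self._pending, block])
        windows = []
        while len(self._pending) >= self.window_samples:
            windows.append(self._pending[:self.window_samples].copy())
            self._pending = self._pending[self.window_samples:]
        return windows


# Simulated input: 10 blocks of 1024 samples, cut into 4096-sample windows.
buffer = WindowBuffer(window_samples=4096)
emitted = []
for i in range(10):
    block = np.full(1024, i, dtype=np.float32)
    emitted.extend(buffer.push(block))
print(len(emitted), "windows emitted,", len(buffer._pending), "samples pending")
```

With a real stream, `window_samples` would be `int(sample_rate * t)`, and each returned window could be passed to `extract_mfcc`.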

Errors when testing the second version of the code on the same local machine:

sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 mfccGen2.py 
Recording...
Audio recorded.
Generating MFCC...
Traceback (most recent call last):
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 38, in <module>
    main()
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 27, in main
    mfccs = extract_mfcc(audio_data, sample_rate)
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 12, in extract_mfcc
    mfccs = librosa.feature.mfcc(y=audio_data, sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/feature/spectral.py", line 1989, in mfcc
    S = power_to_db(melspectrogram(y=y, sr=sr, **kwargs))
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/feature/spectral.py", line 2130, in melspectrogram
    S, n_fft = _spectrogram(
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/core/spectrum.py", line 2822, in _spectrogram
    stft(
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/core/spectrum.py", line 230, in stft
    util.valid_audio(y, mono=False)
  File "/Users/sayanpaul/Library/Python/3.9/lib/python/site-packages/librosa/util/utils.py", line 298, in valid_audio
    raise ParameterError("Audio data must be floating-point")
librosa.util.exceptions.ParameterError: Audio data must be floating-point
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % 

Conclusion:

  1. The code runs and finds the audio device through the sounddevice library.
  2. The recording finishes.
  3. The problem is in the librosa call, which rejects the integer (int16) recording: librosa.util.exceptions.ParameterError: Audio data must be floating-point
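
The error in point 3 follows from recording with `dtype='int16'`: librosa only accepts floating-point samples. A minimal sketch of one common conversion, scaling the int16 range to floats in [-1.0, 1.0). The helper name `int16_to_float32` is hypothetical, and the sample values are made up for illustration:

```python
import numpy as np


def int16_to_float32(audio_int16):
    """Scale int16 PCM samples to float32 values in [-1.0, 1.0)."""
    audio_int16 = np.asarray(audio_int16)
    if audio_int16.dtype != np.int16:
        raise TypeError("expected int16 samples, got " + str(audio_int16.dtype))
    # 32768 is the magnitude of the most negative int16 value.
    return audio_int16.astype(np.float32) / 32768.0


# Made-up samples standing in for the output of record_audio().
samples = np.array([0, 16384, -32768, 32767], dtype=np.int16)
converted = int16_to_float32(samples)
print(converted.dtype, converted.min(), converted.max())
```

The converted array can then be passed as `y` to `librosa.feature.mfcc`. An alternative is to request floating-point samples from the recorder in the first place.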

Code change made in response to the error (attempt 1):

mfccs = librosa.feature.mfcc(y=float(audio_data), sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)

Output for modified code:

sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 mfccGen2.py
Recording...
Audio recorded.
Generating MFCC...
Traceback (most recent call last):
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 38, in <module>
    main()
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 27, in main
    mfccs = extract_mfcc(audio_data, sample_rate)
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 12, in extract_mfcc
    mfccs = librosa.feature.mfcc(y=float(audio_data), sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)
TypeError: only length-1 arrays can be converted to Python scalars
sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % 

Code change made in response to the error (attempt 2): reference

mfccs = librosa.feature.mfcc(y=float(np.asarray(audio_data))[0,len(audio_data)-1], sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)

Output for modified code:

sayanpaul@Sayans-MacBook-Air AudioClassificationTFLite % python3 mfccGen2.py
Recording...
Audio recorded.
Generating MFCC...
Traceback (most recent call last):
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 38, in <module>
    main()
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 27, in main
    mfccs = extract_mfcc(audio_data, sample_rate)
  File "/Users/sayanpaul/Desktop/dev24/finalYearProject/AudioClassificationTFLite/mfccGen2.py", line 12, in extract_mfcc
    mfccs = librosa.feature.mfcc(y=float(np.asarray(audio_data))[0,len(audio_data)-1], sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)
TypeError: only length-1 arrays can be converted to Python scalars

Decision (2):

  1. ChatGPT: image
  2. It suggested the initial code, which did not work.
  3. The reason for the following error is not understood:

    TypeError: only length-1 arrays can be converted to Python scalars
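
A likely explanation for this TypeError: Python's built-in `float()` converts a single value, so it cannot be applied to a whole array of samples, whereas NumPy's `astype` converts every element and returns a new array. A small sketch with made-up sample values:

```python
import numpy as np

# Made-up samples standing in for a recorded int16 clip.
audio_data = np.array([120, -45, 3000], dtype=np.int16)

# float() expects one scalar, so a multi-element array raises TypeError.
raised = False
try:
    float(audio_data)
except TypeError as exc:
    raised = True
    print("float() failed:", exc)

# astype() converts element by element and keeps the array shape.
as_float = audio_data.astype(np.float32)
print(as_float.dtype, as_float.shape)
```

The second attempt fails for the same reason, because `float(...)` is still applied to the whole array before the indexing is evaluated.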
sa-paul commented 8 months ago

Modified Code:


import sounddevice as sd
import soundfile as sf
import numpy as np
import librosa.display
import matplotlib.pyplot as plt

def record_audio(sample_rate, duration):
    audio_data = sd.rec(int(sample_rate * duration), samplerate=sample_rate, channels=1, dtype='float')
    sd.wait()
    return audio_data.flatten()

def extract_mfcc(audio_data, sample_rate, n_mfcc=13, hop_length=512):
    mfccs = librosa.feature.mfcc(y=audio_data, sr=sample_rate, n_mfcc=n_mfcc, hop_length=hop_length)
    return mfccs

# Record a short audio clip (adjust sample_rate and duration as needed)
sample_rate = 44100
duration = 5
audio_data = record_audio(sample_rate, duration)

# Extract MFCC features
mfccs = extract_mfcc(audio_data, sample_rate)

# Display MFCC features
plt.figure(figsize=(10, 4))
librosa.display.specshow(mfccs, x_axis='time')
plt.colorbar()
plt.title('MFCC')
plt.show()

Correct output after running on the local machine:

Recorded surrounding sound only. image
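
As a sanity check on the plotted result, the MFCC array has one row per coefficient and one column per analysis frame. The sketch below estimates the expected shape, assuming librosa's default centered framing, where the frame count is `1 + n_samples // hop_length`. The function name `expected_mfcc_shape` is hypothetical:

```python
def expected_mfcc_shape(sample_rate, duration, n_mfcc=13, hop_length=512):
    """Estimate the (n_mfcc, n_frames) shape, assuming centered framing."""
    n_samples = int(sample_rate * duration)
    n_frames = 1 + n_samples // hop_length
    return (n_mfcc, n_frames)


# Values used in the script above: 44100 Hz, 5 seconds, hop length 512.
shape = expected_mfcc_shape(44100, 5)
print(shape)
```

For the values used in this thread, this gives 13 coefficients by 431 frames, which can be compared against `mfccs.shape` in the script.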

sa-paul commented 8 months ago

I have opened a draft pull request: PR #7