LCAV / pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
https://pyroomacoustics.readthedocs.io
MIT License

Testing CNN model using sound generated from pyroomacoustics room simulation #331

kehinde-elelu closed this issue 4 months ago

kehinde-elelu commented 5 months ago

I have generated a large set of audio using Pyroomacoustics for room simulation, employing a circular microphone array and a single sound source.

I have successfully trained and tested a CRNN (Convolutional Recurrent Neural Network) model on this audio dataset to perform sound event detection and DOA estimation.

However, when using the trained model to analyze audio from a Respeaker v4 mic-array, the results are unsatisfactory, despite both setups having a similar mic-arrangement in the simulated scenarios.

Can I accurately estimate the Direction of Arrival (DOA) for a circular microphone array with a small radius, especially given that the Respeaker mic-array has microphones spaced less than 0.05cm apart from each other?

I've observed differences in the spectrograms between the WAV files generated from the Pyroomacoustics room simulation and the Respeaker audio. Is it possible to adjust the room simulation parameters to generate audio with a spectrogram more closely resembling that of the Respeaker?

coreeey commented 2 months ago

Have you solved this problem?

fakufaku commented 2 months ago

Hello, first of all sorry to @kehinde-elelu as I never replied 🙇

There is not yet a perfect way to match the simulation to specific hardware, or to make a model trained on simulated data generalize to real recordings. Enabling the randomized image source model (by setting use_rand_ism=True, see the doc) will help the model generalize in practice.

However, the simulation will still be missing the response of the microphone array you are using. If you have a way to measure it, you could try to include it in the simulation.
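One possible way to include such a measurement, assuming it is available as one impulse response per microphone at the simulation sampling rate, is to convolve each simulated channel with the response of the corresponding microphone. The helper name apply_mic_response and the data below are hypothetical and only illustrate the idea.

```python
import numpy as np

def apply_mic_response(signals, mic_irs):
    """Convolve each channel with the impulse response of its microphone.

    signals: array of shape (n_mics, n_samples), e.g. the simulated output.
    mic_irs: array of shape (n_mics, ir_len), hypothetical measured responses.
    Returns an array of shape (n_mics, n_samples + ir_len - 1).
    """
    signals = np.atleast_2d(np.asarray(signals, dtype=float))
    mic_irs = np.atleast_2d(np.asarray(mic_irs, dtype=float))
    if signals.shape[0] != mic_irs.shape[0]:
        raise ValueError("need one impulse response per microphone")
    return np.stack([np.convolve(s, h) for s, h in zip(signals, mic_irs)])

# Placeholder data: 4 channels of noise and trivial (unit impulse) responses.
rng = np.random.default_rng(0)
simulated_signals = rng.standard_normal((4, 1000))
unit_irs = np.zeros((4, 16))
unit_irs[:, 0] = 1.0

colored = apply_mic_response(simulated_signals, unit_irs)
```

With unit impulses the signals pass through unchanged apart from zero padding at the end; real measured responses would add the coloration and inter-channel differences of the hardware.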