AvioGG closed this issue 1 year ago
Hi @AvioGG , that's a great question.
Thus, in practice, the real-valued input signal is first transformed to the STFT domain, where the narrowband assumption can be applied within each subband. The DOA algorithm (MUSIC, SRP, etc.) is then run per frequency band, and the results are combined.
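To make the per-band idea concrete, here is a minimal NumPy sketch. It is not the code from doa/music.py: the function names `steering_vectors` and `music_per_band`, the far-field 2D geometry, and the way the bands are combined (normalize each band's pseudo-spectrum by its maximum, then sum) are all assumptions made for illustration.

```python
import numpy as np


def steering_vectors(mic_pos, freq, angles, c=343.0):
    """Far-field steering vectors of shape (M, n_angles).

    mic_pos: (2, M) microphone coordinates in metres (hypothetical layout).
    angles:  candidate azimuths in radians.
    """
    # Unit vectors pointing from the array toward each candidate direction.
    directions = np.stack([np.cos(angles), np.sin(angles)])  # (2, n_angles)
    # A mic displaced toward the source hears the wave earlier,
    # which shows up as a positive phase advance at frequency `freq`.
    return np.exp(2j * np.pi * freq * (mic_pos.T @ directions) / c)


def music_per_band(X, mic_pos, freqs, angles, num_src=1, c=343.0):
    """Run narrowband MUSIC in every frequency bin and combine the results.

    X:     STFT tensor of shape (M, F, T) = (mics, bins, frames).
    freqs: centre frequency in Hz of each of the F bins.
    """
    M, F, T = X.shape
    combined = np.zeros(len(angles))
    for k in range(F):
        Xk = X[:, k, :]                       # (M, T) snapshots for this bin
        R = Xk @ Xk.conj().T / T              # spatial covariance, (M, M)
        _, vecs = np.linalg.eigh(R)           # eigenvalues in ascending order
        En = vecs[:, : M - num_src]           # noise subspace
        A = steering_vectors(mic_pos, freqs[k], angles, c)
        denom = np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
        P = 1.0 / np.maximum(denom, 1e-12)    # MUSIC pseudo-spectrum
        combined += P / P.max()               # assumed combination rule
    return combined
```

The estimated direction is then the angle at which the combined pseudo-spectrum peaks, e.g. `angles[np.argmax(music_per_band(X, mic_pos, freqs, angles))]`. Other combination rules (for example, averaging the covariance-based spectra without normalization) are possible; the one above is just one simple choice.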
For a good reference on practical implementations of these algorithms, see I. Tashev, "Sound Capture and Processing", 2009.
Thanks, the implementation makes much more sense now. Also, huge thanks for the book you mentioned. Chapters 5 and 6 are indeed what I was looking for in terms of implementation and general theoretical background.
Fantastic! Happy to help! Please close the issue if that solved your problem! 😄
Hi fakufaku, first of all, thanks for the great library.
I am currently using the multiple signal classification (MUSIC) algorithm for DoA estimation and it works pretty well for both narrowband signals and even for wideband signals.
However, after looking into the source code, I have a few questions regarding the implementation. The code snippet below is taken from the doa/music.py file. The main issue I have is with the tensor denoted by X, which is defined as the STFT of the input signal with shape (M x n_fft/2+1 x num_frames), M being the number of microphones.
Why do we need this tensor X in the first place and why do we take the autocorrelation of STFT frames?
The MUSIC algorithm as presented in the original paper (and everywhere else, as far as I know) does not use any frequency representation as input. It only computes the autocorrelation matrix in the time domain, uses the steering vector, which is known from the geometry of the setup, and then performs an eigenvalue decomposition.
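For reference, the formulation I have in mind is the usual narrowband one, written out here in my own notation (the symbols are assumed, not taken from the library): $\mathbf{x}(t)$ is the vector of $M$ sensor signals, $\mathbf{a}(\theta)$ the steering vector, $N$ the number of snapshots, and $\mathbf{E}_n$ the eigenvectors of the noise subspace.

```latex
% Sample autocorrelation (covariance) matrix from N time-domain snapshots
\hat{\mathbf{R}} = \frac{1}{N} \sum_{n=1}^{N} \mathbf{x}(t_n)\,\mathbf{x}(t_n)^{H}

% Eigendecomposition split into signal and noise subspaces
\hat{\mathbf{R}} = \mathbf{E}_s \boldsymbol{\Lambda}_s \mathbf{E}_s^{H}
                 + \mathbf{E}_n \boldsymbol{\Lambda}_n \mathbf{E}_n^{H}

% MUSIC pseudo-spectrum, peaking where a(theta) is orthogonal to the noise subspace
P_{\mathrm{MUSIC}}(\theta) =
  \frac{1}{\mathbf{a}(\theta)^{H}\,\mathbf{E}_n \mathbf{E}_n^{H}\,\mathbf{a}(\theta)}
```

Note that $\mathbf{a}(\theta)$ is defined for a single frequency, so this form implicitly assumes a narrowband signal.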
After looking into various sources on MUSIC, including Wikipedia and the original paper, I see no FFT operations or spectrogram calculations. In addition, I have implemented the MUSIC algorithm for a uniform linear array using the description from the paper, and it also works properly.
So my questions are: