Open L0g4n opened 4 years ago
Ok, so I found out the above error that WAVES gave me disappeared when I changed the microphone array to be of size 6 instead of 4. Does that mean that the other algorithms pose some constraints on the (circular) microphone array size?
BTW: When using the DOA algorithms on real-world data, for example data sampled every 50 ms from a microphone array, the number of sources you need to pass to the algorithms is of course not known in advance. Does that mean that I either have to hardcode the number of sources to locate, or that a separate source-counting algorithm is needed before the whole process, @fakufaku (e.g. one that returns the number of sources, which is then passed to the algorithms as a parameter)?
If I am not mistaken, you are trying to recover 4 sources. I believe that WAVES, as well as MUSIC, have the limitation that they can only recover up to n_mics - 1 sources.
You mentioned you did not have that problem with MUSIC, that is a little bit surprising...
As to your other question regarding the number of sources, there are indeed a few ways to estimate it; however, I am not very familiar with them. You may try to look up the Akaike Information Criterion, for example. In practice, another possibility is to try to estimate as many sources as possible and then work out which ones are real by comparing their power.
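As a rough illustration of the Akaike Information Criterion idea, here is a minimal sketch of the commonly cited eigenvalue-based form of the criterion for counting sources. It is not something the thread confirms is part of pyroomacoustics; the function name `estimate_num_sources_aic` and the example eigenvalues are made up for illustration, and the input is assumed to be the (strictly positive) eigenvalues of the spatial covariance matrix.

```python
import math


def estimate_num_sources_aic(eigenvalues, n_snapshots):
    """Hypothetical helper: pick the source count k that minimises the AIC.

    eigenvalues: strictly positive eigenvalues of the n_mics x n_mics
                 spatial covariance matrix (any order).
    n_snapshots: number of snapshots used to estimate that matrix.
    """
    lam = sorted(eigenvalues, reverse=True)
    n_mics = len(lam)
    best_k, best_score = 0, math.inf
    # k can be at most n_mics - 1: at least one eigenvalue is left for noise.
    for k in range(n_mics):
        tail = lam[k:]  # eigenvalues assumed to belong to the noise subspace
        m = len(tail)
        # log(geometric mean / arithmetic mean) of the assumed noise eigenvalues;
        # it is 0 when they are all equal and negative otherwise.
        log_ratio = sum(math.log(x) for x in tail) / m - math.log(sum(tail) / m)
        score = -2.0 * n_snapshots * m * log_ratio + 2.0 * k * (2 * n_mics - k)
        if score < best_score:
            best_k, best_score = k, score
    return best_k


# Made-up eigenvalues: two strong ones, four equal "noise" ones.
print(estimate_num_sources_aic([10.0, 5.0, 1.0, 1.0, 1.0, 1.0], n_snapshots=100))  # 2
```

With real data the noise eigenvalues are never exactly equal, so the estimate can be off, particularly with few snapshots.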
Interesting, I did not know that WAVES and MUSIC have this limitation. I need to investigate this further and see if I can reproduce it. Do only these two have this limitation, or are the other algorithms also affected?
Thanks, I will look into the information criterion. About the other method: is the approach basically to compute the (normalized) power of all estimated signals and then assume that the ones above a certain threshold are real, or what else should be the criterion to detect whether a power value is reasonable?
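One possible reading of the power-threshold idea, sketched under stated assumptions: candidate directions come with a positive power estimate each, and a candidate is kept if it is within some number of dB of the strongest one. The function name `keep_strong_sources`, the 10 dB default and the example powers are all hypothetical; the thread does not say which threshold is appropriate.

```python
import math


def keep_strong_sources(azimuths_deg, powers, max_drop_db=10.0):
    """Hypothetical helper: keep candidates within max_drop_db of the strongest.

    azimuths_deg: candidate directions in degrees.
    powers:       one strictly positive power estimate per candidate.
    """
    strongest = max(powers)
    kept = []
    for azimuth, power in zip(azimuths_deg, powers):
        drop_db = 10.0 * math.log10(strongest / power)
        if drop_db <= max_drop_db:
            kept.append(azimuth)
    return kept


# Made-up powers: the third candidate is 20 dB below the strongest one.
print(keep_strong_sources([30.0, 120.0, 254.0], [1.0, 0.8, 0.01]))  # [30.0, 120.0]
```

A relative threshold like this avoids depending on the absolute signal level, but the right margin would have to be tuned on the actual recordings.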
All algorithms that are based on subspace decomposition have it. These algorithms compute the spatial covariance matrix that is n_mics x n_mics in dimension. Then, they use an eigenvalue decomposition to extract signal and noise subspaces. The dimensions of the two subspaces need to add to n_mics, and at least one dimension is needed for the noise. In addition, the signal subspace is expected to have the same number of dimensions as there are sources. Putting all this together means that you can have at most n_mics - 1 sources.
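The dimension counting described here can be sketched with NumPy. This is an illustration of the general technique under stated assumptions, not the library's actual implementation: `split_subspaces` is a hypothetical name, and the input is random complex data standing in for the snapshots of one frequency bin.

```python
import numpy as np


def split_subspaces(X, num_src):
    """Hypothetical helper: split eigenvectors into signal and noise subspaces.

    X: complex array of shape (n_mics, n_snapshots) for one frequency bin.
    """
    n_mics, n_snapshots = X.shape
    # Spatial covariance matrix, n_mics x n_mics.
    R = X @ X.conj().T / n_snapshots
    # eigh returns the eigenvalues of a Hermitian matrix in ascending order.
    w, v = np.linalg.eigh(R)
    n_noise = n_mics - num_src
    noise_vals, noise_vecs = w[:n_noise], v[:, :n_noise]
    signal_vals, signal_vecs = w[n_noise:], v[:, n_noise:]
    return signal_vals, signal_vecs, noise_vals, noise_vecs


rng = np.random.default_rng(0)
X = rng.standard_normal((4, 200)) + 1j * rng.standard_normal((4, 200))

# 4 microphones, 3 sources: one dimension is left for the noise.
_, Es, wn, En = split_subspaces(X, num_src=3)
print(Es.shape, En.shape)  # (4, 3) (4, 1)

# 4 microphones, 4 sources: nothing is left for the noise subspace.
_, _, wn_empty, En_empty = split_subspaces(X, num_src=4)
print(wn_empty.size, En_empty.shape)  # 0 (4, 0)
```

The second call shows how asking for as many sources as microphones leaves the noise eigenvalues empty, which is consistent with the limit of n_mics - 1 sources.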
There should be a check in the DOA constructor. I will rename the issue and try to add this soon.
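A check of this kind could look roughly like the following. This is only a sketch of the idea; `check_num_sources` is a hypothetical standalone function, and the thread does not show how (or whether) the check was eventually implemented in the DOA constructor.

```python
def check_num_sources(n_mics, num_src):
    """Hypothetical validation for subspace-based DOA algorithms."""
    if num_src < 1:
        raise ValueError("num_src must be at least 1.")
    if num_src > n_mics - 1:
        raise ValueError(
            f"Cannot recover {num_src} sources with {n_mics} microphones: "
            f"subspace methods support at most n_mics - 1 = {n_mics - 1} sources."
        )


check_num_sources(n_mics=6, num_src=4)  # passes silently

try:
    check_num_sources(n_mics=4, num_src=4)
except ValueError as err:
    print(err)
```

Failing early with a message like this would replace an obscure crash deep inside the eigenvalue handling with an explanation of the actual constraint.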
Ok, thanks for that information. BTW: Is it the desired behaviour that the FRIDA algorithm does not round the reconstructed DOA to the nearest integer (all other algorithms do exactly that)? Here are the results for two ultrasound sources (both noise-free sine waves, at 20 kHz and 21 kHz, respectively), with a frequency range of interest of 20-24 kHz and an inter-microphone spacing of 3 cm:
CSSM
  Recovered azimuth: [ 30. 120.] degrees
  Real azimuth: [ 30. 120.] degrees
FRIDA
  Recovered azimuth: [ 29.56244176 120.00692398] degrees
  Real azimuth: [ 30. 120.] degrees
MUSIC
  Recovered azimuth: [ 30. 120.] degrees
  Real azimuth: [ 30. 120.] degrees
SRP
  Recovered azimuth: [ 31. 100.] degrees
  Real azimuth: [ 30. 120.] degrees
TOPS
  Recovered azimuth: [254. 347.] degrees
  Real azimuth: [ 30. 120.] degrees
WAVES
  Recovered azimuth: [ 30. 120.] degrees
  Real azimuth: [ 30. 120.] degrees
It is also interesting to note that for naive ultrasonic sources all the subspace decomposition algorithms seem to do much better than, for example, SRP and TOPS. With speech samples, SRP-PHAT gave me the best results.
For FRIDA I also always get a lot of "ill-conditioned matrix, result may not be accurate" warnings:
C:\tools\Anaconda3\lib\site-packages\pyroomacoustics\doa\tools_fri_doa_plane.py:855: LinAlgWarning: Ill-conditioned matrix (rcond=8.85896e-23): result may not be accurate.
  c_ri_half = linalg.solve(mtx_loop, rhs, check_finite=False)[:sz_coef]
C:\tools\Anaconda3\lib\site-packages\pyroomacoustics\doa\tools_fri_doa_plane.py:855: LinAlgWarning: Ill-conditioned matrix (rcond=8.85898e-23): result may not be accurate.
  c_ri_half = linalg.solve(mtx_loop, rhs, check_finite=False)[:sz_coef
Yes, this is the expected behavior. Except for FRIDA, all algorithms rely on a search: you provide a list of candidate locations and they tell you which ones are the most likely. Since you provide only integer candidate locations, the result is an integer. There is no rounding. FRIDA, on the other hand, computes the locations directly without a search, which generally results in a non-integer value.
The ill-conditioned matrix is a known warning for FRIDA.
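The difference between a grid search and a direct estimate can be shown with a toy example. The "spectrum" below is a stand-in that simply peaks at the true direction; real search-based algorithms evaluate their own spatial spectrum on the grid, and the function name `grid_search_azimuth` is hypothetical.

```python
def grid_search_azimuth(true_azimuth_deg, grid_deg):
    """Hypothetical toy search: return the grid point with the highest score."""

    def toy_spectrum(candidate_deg):
        # Stand-in for a spatial spectrum: highest at the true direction,
        # decreasing with the wrapped angular distance from it.
        d = abs(candidate_deg - true_azimuth_deg) % 360.0
        return -min(d, 360.0 - d)

    # The result is always one of the candidates, never a value in between.
    return max(grid_deg, key=toy_spectrum)


integer_grid = list(range(360))  # candidate azimuths at 1 degree spacing
print(grid_search_azimuth(29.56244176, integer_grid))  # 30
```

A finer grid would bring the search result closer to the off-grid value, at the cost of evaluating more candidates.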
Hi,
sorry to bother you. SRP and MUSIC work with my setup, but when I tried to run the remaining algorithms on it, they crashed. When adding only the WAVES algorithm, I get this traceback:
which indicates that wn (the eigenvalues of the noise subspace, according to the source code) is empty. I suspect that this is also the reason why CSSM fails with my setup. This is my setup code (basically the same as before)