Open letnnn opened 3 months ago
Hello, I used two audio recordings for positioning, both played from the same location. For the first 20 sets of data, I used the first audio, and for the last 10 sets, I used the second audio. However, the positioning results are not the same, and the results from the second audio are much worse. Is this an issue with the audio, or could there be other possible reasons? Please help me with this issue. Thank you!
Were the two results obtained at the same location and in the same environment? For example, the first recording may have had lower environmental noise than the second. It would help if you could provide the audio files. At the moment, the experimental configuration and environment are not clearly described.
Both experiments were conducted in the same environment and at the same location. The only difference is the type of audio used. My first audio is relatively clean, but the second audio contains some noise. I suspect that the lack of purity in the second audio might be the cause of the issue. What do you think?
I can't send the source audio files, but below are images of the two audio recordings. The first image is from the first audio, and the second image is from the second audio.
I used DUET to separate each source and then passed the result to the SRP-PHAT algorithm for multi-source tracking. I expect the attenuation and delay estimates from DUET to be better for the first audio than for the second, because the first has a stable sound output. The second audio can also be separated, but its attenuation and delay estimates will be slightly degraded because DUET is not robust to noise. Errors in the attenuation and delay estimates reduce the precision of SRP-PHAT.
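For reference, here is a minimal sketch of GCC-PHAT, the pairwise delay estimate that SRP-PHAT accumulates over microphone pairs. It is not the code used in this project; the function name `gcc_phat_delay`, the `max_lag` parameter, and the synthetic test signal are assumptions made for illustration only.

```python
import numpy as np

def gcc_phat_delay(x1, x2, max_lag):
    """Estimate by how many samples x2 lags x1 using GCC-PHAT."""
    n = len(x1) + len(x2)             # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)          # cross-power spectrum
    cross /= np.abs(cross) + 1e-12    # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)     # generalized cross-correlation
    # Reorder so that the lags run from -max_lag to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag

# Hypothetical test signal: white noise, with the second channel delayed by 3 samples.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
x1 = s
x2 = np.concatenate((np.zeros(3), s[:-3]))

estimated = gcc_phat_delay(x1, x2, max_lag=10)
print(estimated)
```

Because the PHAT weighting discards magnitude and relies on phase alone, noise that corrupts the phase of the separated signals directly shifts or flattens the correlation peak.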
Would it improve the results if we process the audio collected by the microphone array, such as by applying noise reduction, before performing the positioning?
Absolutely. DUET estimates the attenuation and delay at each TF point and uses them to compute the mask for each sound source. The algorithm assumes that each TF point is dominated by a single sound source (in other words, it ignores environmental noise).
If the SNR is too low, the peak in the attenuation-delay histogram may be misestimated, so the attenuation and delay masks deviate, which in turn degrades the SRP-PHAT result.
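To illustrate the point, here is a minimal sketch of the attenuation and delay estimate for a single TF point, with and without additive noise. It uses the standard DUET relations (attenuation = |X2/X1|, delay = -angle(X2/X1)/omega). The function name `duet_point_estimate` and all numeric values are hypothetical and chosen only for the example.

```python
import cmath

def duet_point_estimate(x1, x2, omega):
    """Return (attenuation, delay) for one TF point.

    x1, x2: complex STFT coefficients of the two channels at the same TF point.
    omega:  angular frequency of the bin in radians per sample (nonzero).
    """
    ratio = x2 / x1
    attenuation = abs(ratio)
    delay = -cmath.phase(ratio) / omega
    return attenuation, delay

# Hypothetical single source: channel 2 is channel 1 scaled by 0.8
# and delayed by 2 samples.
omega = 0.5                      # small enough that omega * delay < pi (no phase wrap)
true_a, true_d = 0.8, 2.0
s = cmath.rect(1.0, 0.3)         # source coefficient at this TF point
x1 = s
x2 = true_a * cmath.exp(-1j * omega * true_d) * s

clean = duet_point_estimate(x1, x2, omega)

# Add independent noise to each channel (fixed values for reproducibility).
n1 = cmath.rect(0.5, 2.0)
n2 = cmath.rect(0.5, -1.0)
noisy = duet_point_estimate(x1 + n1, x2 + n2, omega)

print("clean:", clean)
print("noisy:", noisy)
```

In the clean case the estimate recovers the true attenuation and delay. With noise of comparable magnitude added, both values move away from the true pair, so the point lands in the wrong place in the histogram.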
Issue with current documentation
The results of my audio localization are not satisfactory.
Idea or request for content
The positioning was performed using two audio recordings, but the positioning result of one of them deviates significantly from the actual location. I'm not sure what the reason is and hope you can help me figure it out.