I am working on sound source localization. I have read your papers "Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection" and "A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays", and I see that you tested the approach on 60-second sound files. I have tested your implementation (https://github.com/thomeou/SALSA) on the TAU-NIGENS Spatial Sound Events 2021 dataset. Everything runs without errors, but the network output is correct only for the audio in the dataset. For data I recorded myself (using a 4-channel microphone array, the ReSpeaker USB Mic Array), the output is completely wrong. I am wondering what is wrong with my data. It is 4-channel data, fp16, with the same PCM coding.
Which device did you use for recording?
My data contains some echo. Is it possible that a small amount of echo significantly degrades the performance of your localization algorithm?
Thanks for your help.