xanguera / BeamformIt

BeamformIt acoustic beamforming software

question about the beamformer #13

Closed weilongHSpeech closed 7 years ago

weilongHSpeech commented 7 years ago

I have read your paper and taken a quick look at your code. I am wondering about the placement of the microphones in the room. Can the microphones be at arbitrary positions? If so, can I use any microphone array (ULA, non-uniform linear array, circular array, or even spherical array) to record the signals and then feed them into BeamformIt to get the result? That raises another question about delay-and-sum: how does the algorithm know the full steering vector needed to compute the beamformer?

weilongHSpeech commented 7 years ago

All right! I figured it out.

xanguera commented 7 years ago

As you might have figured out, there is no constraint on the positions of the microphones. It would probably be an advantage to know the locations, but BeamformIt already does quite well without them.

X.

weilongHSpeech commented 7 years ago

Thank you very much for your reply. Yes, I figured out that you do not need any position information to calculate the beamformer. By the way, I am very interested in the performance of the weighted delay-and-sum in your paper. As far as I know, the properties of a beamformer, especially the beampattern (spatial aliasing at high frequencies, poor spatial selectivity at low frequencies), depend on the array type (microphone spacing, array aperture, and so on). Yes, you can calculate a beamformer from the TDOA alone, but without any constraint on the array type, how can the beamformer guarantee its performance, or that of the ASR that follows? Is there any special secret in your weighted D&S compared to other D&S or GSC beamformers?
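To make the dependence on microphone spacing concrete, here is a minimal sketch of the standard half-wavelength rule for a uniform linear array: spatial aliasing is avoided only while the spacing stays below half the wavelength. This is textbook array theory, not something taken from BeamformIt; the function name and the default speed of sound of 343 m/s are assumptions made for illustration.

```python
def spatial_aliasing_limit_hz(spacing_m, speed_of_sound=343.0):
    """Highest frequency (Hz) a uniform linear array can handle without
    spatial aliasing, using the half-wavelength rule d <= lambda / 2.

    spacing_m: distance between adjacent microphones, in metres.
    speed_of_sound: assumed value in m/s (roughly air at 20 degrees C).
    """
    return speed_of_sound / (2.0 * spacing_m)


# Hypothetical example: microphones 5 cm apart alias above roughly 3.4 kHz.
limit = spatial_aliasing_limit_hz(0.05)
```

For an ad hoc set of microphones with unknown geometry there is no single spacing to plug in, which is why this kind of beampattern analysis does not carry over directly.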

xanguera commented 7 years ago

Hi, I believe there is no secret, but probably a good engineering implementation (other people might debate this point after looking at the code...). As you probably know already if you read the paper(s), BeamformIt was written during my PhD thesis for the particular case of meeting room microphone processing in the EU-funded AMI project. The goal was to perform speaker diarization on a single channel instead of doing it on every input channel of several microphones and then trying to figure out how to merge the results obtained from poor-quality signals. In the AMI project there were multiple types of room setups, with some arrays, some tabletops, some lapel microphones, etc., so it was important to avoid any constraint. The plain old delay-and-sum algorithm is at the core of BeamformIt, as it does not require anything other than defining a single microphone to be the reference. BeamformIt also includes some pre- and postprocessing to prepare the signal and to postprocess the TDOAs to make them better for the task at hand, but nothing else. I am very happy that so many people have decided to use it over the years, and I wish that more people would contribute to the code to make the algorithm better.
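As a rough illustration of why no geometry is needed, here is a minimal delay-and-sum sketch in Python with NumPy. It is not BeamformIt's code: it assumes the per-channel delays (in whole samples, relative to the reference microphone) have already been estimated, and the function name, the integer-delay simplification, and the uniform default weights are all assumptions made for this example.

```python
import numpy as np


def delay_and_sum(channels, delays, weights=None):
    """Align each channel to the reference and sum.

    channels: array of shape (n_channels, n_samples).
    delays:   integer delay of each channel relative to the reference,
              in samples (the reference channel has delay 0). Assumes
              abs(delay) < n_samples.
    weights:  optional per-channel weights; uniform if omitted
              (an assumption, not BeamformIt's weighting scheme).
    """
    channels = np.asarray(channels, dtype=float)
    n_ch, n_samples = channels.shape
    if weights is None:
        weights = np.full(n_ch, 1.0 / n_ch)
    out = np.zeros(n_samples)
    for x, d, w in zip(channels, delays, weights):
        aligned = np.zeros(n_samples)
        if d > 0:        # channel lags the reference: shift it earlier
            aligned[:n_samples - d] = x[d:]
        elif d < 0:      # channel leads the reference: shift it later
            aligned[-d:] = x[:n_samples + d]
        else:
            aligned[:] = x
        out += w * aligned
    return out
```

The only inputs are the signals and one delay per channel; microphone positions never appear, which is consistent with the point that any mix of arrays, tabletop, and lapel microphones can be combined.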

weilongHSpeech commented 7 years ago

Hi Xavier,

Thank you for your informative reply. As I understand it, the trick in your beamformer is not the beamformer itself, which is a plain D&S beamformer. The trick is getting a better final TDOA to feed the D&S beamformer, with Wiener filtering as preprocessing and Viterbi decoding as postprocessing.

The TDOA estimation itself is also classical GCC-PHAT. I have seen some papers, such as "Subsample Time Delay Estimation via Improved GCC PHAT Algorithm", that propose improvements to GCC-PHAT. Do you expect that an improved GCC-PHAT, or some other novel TDOA estimation method, would bring much accuracy improvement to the TDOA estimation or to BeamformIt as a whole?

Best, Weilong
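For readers unfamiliar with the classical GCC-PHAT mentioned above, a minimal sketch follows. It is a textbook version written for illustration, not BeamformIt's implementation: it returns a single integer-sample delay, omits the Wiener filtering and Viterbi postprocessing discussed in this thread, and the function name and the `max_lag` parameter are assumptions.

```python
import numpy as np


def gcc_phat_tdoa(sig, ref, max_lag):
    """Estimate the delay of `sig` relative to `ref`, in samples.

    A positive result means `sig` lags `ref`. `max_lag` (>= 1) bounds
    the search range and is a hypothetical parameter of this sketch.
    """
    n = len(sig) + len(ref)           # zero-pad to avoid circular wrap-around
    sig_f = np.fft.rfft(sig, n=n)
    ref_f = np.fft.rfft(ref, n=n)
    cross = sig_f * np.conj(ref_f)    # cross-power spectrum
    cross /= np.abs(cross) + 1e-12    # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)     # generalized cross-correlation
    # Reorder so that the lags run from -max_lag to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag
```

The returned value has the same sign convention as the delay a delay-and-sum stage would remove: each channel's delay is measured against the chosen reference channel. Subsample refinements like those in the paper cited above would replace the final `argmax` with an interpolation around the peak.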

xanguera commented 7 years ago

Sorry for the late reply; I did not see that you had a follow-up question. It would be interesting to test the newly proposed flavours of GCC-PHAT. I do not have much time now, but I will keep it in mind. Please feel free to try it yourself and report any news here.