Open nateanl opened 6 years ago
Comparisons among different combinations of the masks.
Here are all channels for the LSTM masks.
Channels 3-6 are very similar to each other, while channels 1 and 2 are quite different. Any idea why?
The speech in channel 2 is hidden; I can barely hear the speaker. I checked the failure array, and none of the microphones has failed. You can listen to these files:
/home/data/CHiME3/data/audio/16kHz/isolated/dt05_str_real/F01_050C0108_STR.CH2.wav
/home/data/CHiME3/data/audio/16kHz/isolated/dt05_str_real/F01_050C0107_STR.CH2.wav
/home/data/CHiME3/data/audio/16kHz/isolated/dt05_str_real/F01_050C0106_STR.CH2.wav
/home/data/CHiME3/data/audio/16kHz/isolated/dt05_str_real/F01_050C0103_STR.CH2.wav
/home/data/CHiME3/data/audio/16kHz/isolated/dt05_str_real/F01_050C010B_STR.CH2.wav
/home/data/CHiME3/data/audio/16kHz/isolated/dt05_str_real/F01_050C010A_STR.CH2.wav
/home/data/CHiME3/data/audio/16kHz/isolated/dt05_str_real/F01_050C0102_STR.CH2.wav
Because the energy of the noise is very high, we can't exclude the channel in mask combination and beamforming. Any suggestions for this case?
I don't understand what you mean. We can use all of the channels, we want to use all of the channels.
I was thinking that if the speech is hidden in that microphone, we should find some way to detect that and mark the channel as failed. Maybe I was wrong. I can check the IAF mask to see the difference.
In that case the model should predict a mask of all 0s. We shouldn't have to do anything differently to it.
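As a rough illustration of why an all-zero mask on one channel may need no special handling, here is a minimal sketch. The thread does not say which combination rules were compared, so `combine_masks` and its `max`, `median`, and `mean` options are assumptions for illustration only, not the project's code.

```python
import numpy as np

def combine_masks(masks, how="median"):
    """Combine per-channel masks of shape (channels, freq, frames) into one mask.

    The available rules are hypothetical examples, not taken from the thread.
    """
    masks = np.asarray(masks, dtype=float)
    if how == "max":
        return masks.max(axis=0)
    if how == "median":
        return np.median(masks, axis=0)
    if how == "mean":
        return masks.mean(axis=0)
    raise ValueError(f"unknown combination rule: {how}")

# Toy data: 6 channels, 3 frequency bins, 4 frames.
# Every channel predicts 0.8 except channel 2 (index 1), which predicts all 0s.
masks = np.full((6, 3, 4), 0.8)
masks[1] = 0.0

combined = {how: combine_masks(masks, how) for how in ("max", "median", "mean")}
for how, mask in combined.items():
    print(how, float(mask[0, 0]))
```

In this toy case the max and median rules ignore the silent channel entirely, while the mean is pulled down to 5/6 of the other channels' value, so how much a zero mask matters depends on the rule chosen.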
These are the speech mask, noise mask, and the post-filter generated from 6-channel LSTM masks for one utterance.
Speech mask and noise mask: 119016 out of 119529 points are different.
Speech mask and post-filter: 119249 out of 119529 points are different.
Noise mask and post-filter: 119249 out of 119529 points are different.
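A count like the ones above can be produced with a short comparison such as the sketch below. The function name `count_different` and the `tol` parameter are hypothetical; the thread does not show how the numbers were actually computed. A tolerance is included because an exact comparison flags even tiny floating-point differences.

```python
import numpy as np

def count_different(mask_a, mask_b, tol=0.0):
    """Return (number of points whose absolute difference exceeds tol, total points)."""
    a = np.asarray(mask_a, dtype=float)
    b = np.asarray(mask_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    n_diff = int(np.sum(np.abs(a - b) > tol))
    return n_diff, a.size

# Toy masks (4 frequency bins x 5 frames) standing in for real ones.
speech = np.full((4, 5), 0.5)
other = speech.copy()
other[0, 0] += 0.25   # a clearly different point
other[1, 2] += 1e-6   # a point that differs only slightly

exact = count_different(speech, other)            # counts every nonzero difference
loose = count_different(speech, other, tol=1e-3)  # ignores differences below 1e-3
print(f"exact: {exact[0]} out of {exact[1]} points are different")
print(f"tol=1e-3: {loose[0]} out of {loose[1]} points are different")
```

Reporting the count at more than one tolerance helps separate masks that differ substantially from masks that differ only by small numerical amounts.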
Sorry, I guess you want to know whether the masks of channels 3-6 are the same. I compared them with each other, and they are different too.
Ok, I guess they are very slightly different. Let's see what we get running them through the system.
I tested the stubI_MesslKeras code on 7 dt05_str_real files. Here are the results.