Closed hagenw closed 8 years ago
One could also try to include the interaural coherence (IC) to look only at reliable cues, as the Dietz model does; see https://dev.qu.tu-berlin.de/issues/1819 for a discussion.
I prepared an example that tries to predict all of our sound field synthesis localisation results.
In order to get it, pull the localise_synthesized_sources branch in TWOEARS/examples.
Then go to the folder qoe_localisation and run:
>> sfsLocalisationPrediction(2)
PROCESS HUMAN LABEL FILE: experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------------------------------
condition experiment DnnLocationKS GmmLocationKS ItdLocationKS
--------------------------------------------------------------------------------------------------------------
wfs_nls14_X0.00_Y0.00_src_ps_xs0.00_ys2.50 -0 deg -3 deg 2 deg 1 deg
wfs_nls14_X0.00_Y0.75_src_ps_xs0.00_ys2.50 -2 deg 141 deg -4 deg -1 deg
wfs_nls14_X0.00_Y-0.75_src_ps_xs0.00_ys2.50 0 deg -2 deg -2 deg 1 deg
wfs_nls14_X-0.25_Y0.00_src_ps_xs0.00_ys2.50 -2 deg 2 deg -8 deg -4 deg
wfs_nls14_X-0.25_Y0.75_src_ps_xs0.00_ys2.50 1 deg 6 deg -9 deg 18 deg
wfs_nls14_X-0.25_Y-0.75_src_ps_xs0.00_ys2.50 -4 deg 16 deg 66 deg -4 deg
wfs_nls14_X-0.50_Y0.00_src_ps_xs0.00_ys2.50 -10 deg 176 deg -19 deg -15 deg
wfs_nls14_X-0.50_Y0.75_src_ps_xs0.00_ys2.50 -14 deg -17 deg -17 deg -15 deg
wfs_nls14_X-0.50_Y-0.75_src_ps_xs0.00_ys2.50 -8 deg 164 deg 164 deg -9 deg
[...]
You can choose between six different experiments; see the help for more details. You can have a look at the figure showing the results. Every row corresponds to the number you can pass to the sfsLocalisationPrediction function. You can also have a look at the modeling results from the Dietz model for this data, which are included in this figure.
The results from the ItdLocationKS should be similar to the ones from the Dietz model. So feel free to play around with the DnnLocationKS and GmmLocationKS and see if you can improve the predictions.
As I said, the results for ItdLocationKS are quite good, as I only use ITD values below 1.4 kHz, which seem to dominate localisation if present (and maybe reliable).
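To make the 1.4 kHz restriction concrete, here is a minimal sketch of how ITD channels could be selected by centre frequency. The variable names, the centre frequencies and the ITD values are hypothetical and not taken from ItdLocationKS; the actual filterbank and selection logic may differ.

```matlab
% Hypothetical centre frequencies (Hz) of the auditory filterbank channels
fc = [100 250 500 900 1300 2200 4000];

% Made-up ITD values (seconds), arranged as frames x channels
itd = 1e-4 * ones(10, numel(fc));

% Keep only the channels whose centre frequency lies below 1.4 kHz
upperFreqLimit = 1400;              % Hz
lowChannels = fc < upperFreqLimit;  % logical mask over channels
itdLow = itd(:, lowChannels);       % 10 x 5 for the values above
```

Only the columns of the remaining low-frequency channels would then be passed on to the ITD-to-azimuth mapping.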
Note that head rotation is disabled in all cases at the moment. I tried it once, and the results were even worse, but maybe I made a mistake:
>> sfsLocalisationPrediction(2) % with head rotation
PROCESS HUMAN LABEL FILE: experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------------------------------
condition experiment DnnLocationKS GmmLocationKS ItdLocationKS
--------------------------------------------------------------------------------------------------------------
wfs_nls14_X0.00_Y0.00_src_ps_xs0.00_ys2.50 -0 deg -3 deg 140 deg 1 deg
wfs_nls14_X0.00_Y0.75_src_ps_xs0.00_ys2.50 -2 deg 64 deg 138 deg -1 deg
wfs_nls14_X0.00_Y-0.75_src_ps_xs0.00_ys2.50 0 deg -2 deg 168 deg 1 deg
wfs_nls14_X-0.25_Y0.00_src_ps_xs0.00_ys2.50 -2 deg 2 deg -156 deg -2 deg
wfs_nls14_X-0.25_Y0.75_src_ps_xs0.00_ys2.50 1 deg 118 deg 148 deg 18 deg
[...]
Hagen, I've started to look at this. Could you tell me what azimuth range the LookupTable used by ItdLocationKS considers?
I've updated the example to use the new models, and the results don't look too bad if you use MCT-DIFFUSE-FRONT. This preset forces the models to localise in the front hemifield only. There often seems to be a 3-deg shift. Were the signals created for a KEMAR head size?
>> sfsLocalisationPrediction(2)
PROCESS HUMAN LABEL FILE: experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------------------------------
condition experiment DnnLocationKS GmmLocationKS ItdLocationKS
--------------------------------------------------------------------------------------------------------------
wfs_nls14_X0.00_Y0.00_src_ps_xs0.00_ys2.50 -0 deg -3 deg -3 deg 1 deg
wfs_nls14_X0.00_Y0.75_src_ps_xs0.00_ys2.50 -2 deg 1 deg 1 deg -1 deg
wfs_nls14_X0.00_Y-0.75_src_ps_xs0.00_ys2.50 0 deg -2 deg -2 deg 1 deg
wfs_nls14_X-0.25_Y0.00_src_ps_xs0.00_ys2.50 -2 deg -18 deg -13 deg -3 deg
wfs_nls14_X-0.25_Y0.75_src_ps_xs0.00_ys2.50 1 deg -14 deg -9 deg 15 deg
wfs_nls14_X-0.25_Y-0.75_src_ps_xs0.00_ys2.50 -4 deg -4 deg 56 deg -4 deg
wfs_nls14_X-0.50_Y0.00_src_ps_xs0.00_ys2.50 -10 deg -4 deg -24 deg -15 deg
wfs_nls14_X-0.50_Y0.75_src_ps_xs0.00_ys2.50 -14 deg -17 deg -17 deg -15 deg
wfs_nls14_X-0.50_Y-0.75_src_ps_xs0.00_ys2.50 -8 deg -11 deg -11 deg -9 deg
wfs_nls14_X-0.75_Y0.00_src_ps_xs0.00_ys2.50 -18 deg -15 deg -25 deg -17 deg
wfs_nls14_X-0.75_Y0.75_src_ps_xs0.00_ys2.50 -30 deg -28 deg -53 deg -40 deg
wfs_nls14_X-0.75_Y-0.75_src_ps_xs0.00_ys2.50 -13 deg -17 deg -17 deg -14 deg
wfs_nls14_X-1.00_Y0.00_src_ps_xs0.00_ys2.50 -25 deg -26 deg -41 deg -24 deg
wfs_nls14_X-1.00_Y0.75_src_ps_xs0.00_ys2.50 -31 deg -43 deg -63 deg -31 deg
wfs_nls14_X-1.00_Y-0.75_src_ps_xs0.00_ys2.50 -18 deg -17 deg -27 deg -17 deg
wfs_nls14_X-1.25_Y0.00_src_ps_xs0.00_ys2.50 -26 deg -30 deg -35 deg -29 deg
wfs_nls28_X0.00_Y0.00_src_ps_xs0.00_ys2.50 1 deg -1 deg -1 deg 1 deg
wfs_nls28_X0.00_Y0.75_src_ps_xs0.00_ys2.50 -1 deg 0 deg 0 deg 1 deg
wfs_nls28_X0.00_Y-0.75_src_ps_xs0.00_ys2.50 -0 deg 0 deg 0 deg 1 deg
wfs_nls28_X-0.25_Y0.00_src_ps_xs0.00_ys2.50 -3 deg -11 deg -6 deg -5 deg
wfs_nls28_X-0.25_Y0.75_src_ps_xs0.00_ys2.50 -6 deg -12 deg -7 deg -6 deg
wfs_nls28_X-0.25_Y-0.75_src_ps_xs0.00_ys2.50 -2 deg -3 deg -11 deg -3 deg
wfs_nls28_X-0.50_Y0.00_src_ps_xs0.00_ys2.50 -15 deg -10 deg -10 deg -10 deg
wfs_nls28_X-0.50_Y0.75_src_ps_xs0.00_ys2.50 -18 deg -20 deg -20 deg -15 deg
Hi Ning,
cool that you took a look at this.
1.) The lookup table used by ItdLocationKS is created in the range -90° to 90°. During the fitting and mapping process it can happen that larger values are returned. At the moment this is handled by applying phi(abs(phi)>95) = NaN; afterwards.
2.) All the signals were created using our QU_KEMAR_3m_anechoic.sofa dataset to model the loudspeakers of the WFS/NFC-HOA setups.
3.) For all WFS examples the restriction to -90° to 90° should be fine, but for NFC-HOA and new stuff that Fiete is testing it would be good to have the full 360° circle as it can happen that some stimuli are perceived from the back.
4.) I agree that the results from GmmLocationKS don't look too bad. Maybe as a next step I could run the models on all my data and calculate the mean prediction error. Could you push your changes to TWOEARS/examples, if you haven't done so yet?
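The out-of-range handling from point 1 can be illustrated with a short sketch. The azimuth values are made up, and taking the median of the remaining values is only an assumption about what a later step might do.

```matlab
% Hypothetical azimuth estimates in degrees, some outside the lookup range
phi = [-100 -90 -30 0 45 95 96 120];

% Discard estimates more than 5 degrees beyond the -90..90 degree lookup range
phi(abs(phi) > 95) = NaN;

% Later statistics have to skip the NaN entries explicitly
validPhi = phi(~isnan(phi));
medianPhi = median(validPhi);
```

Note that a value of exactly 95° survives the mask, since the comparison is strict.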
Just had a look at ItdLocationKS.execute() lines 67-73 and I noticed there is a bug. Here the size of phi is 50x16, while the size of ic is 16x50, due to the fact that transpose(itd) is passed to obj.itdToAngle().
% Convert ITDs to azimuth angles
phi = obj.itdToAngle(itd',lookupTable);
% Calculate the median over time for every frequency channel of the azimuth
for n = 1:size(phi,2)
% Applay IC threshold, compare eq. 9 in Dietz (2011)
idx = ic(:,n)>obj.icThreshold & [diff(ic(:,n))>0; 0];
angle = phi(idx,n);
So inside the for-loop the indexing doesn't look right to me.
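Here is a minimal sketch of the mismatch and one possible way around it, using stand-in data with the reported shapes. The threshold value and the variable names icT and medianPhi are assumptions for illustration; the actual fix in ItdLocationKS may look different (e.g. not transposing itd in the first place).

```matlab
% Stand-in data with the shapes reported above (values are made up)
nFrames = 50;
nChannels = 16;
phi = repmat(linspace(-90, 90, nFrames)', 1, nChannels);  % 50x16 (frames x channels)
ic = 0.5 + 0.5 * rand(nChannels, nFrames);                % 16x50 (channels x frames)
icThreshold = 0.98;                                       % assumed value

% Bring ic to the same orientation as phi before indexing
icT = ic';                                                % 50x16

medianPhi = nan(1, nChannels);
for n = 1:size(phi, 2)
    % Keep frames with high and rising IC, as in the original snippet
    rising = [diff(icT(:, n)) > 0; false];
    idx = icT(:, n) > icThreshold & rising;
    if any(idx)
        medianPhi(n) = median(phi(idx, n));
    end
end
```

As far as I can tell, the original loop does not throw an error because MATLAB accepts a logical index that is shorter than the indexed dimension, so the mismatch would silently select the wrong entries rather than fail.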
Yes, I've pushed the change into the localise_synthesized_sources branch.
I forgot to say I am using the master branch of TWOEARS/blackboard-system, since my new changes to DnnLocationKS and GmmLocationKS are applied there. Am I using the wrong branch?
I switched the Config back to use the localise_synthesized_sources branch as this pull request is for that branch. I have applied all the changes from master to this one, so it should now work as well. And if we decide to make further modifications to the code, we could do it directly in the localise_synthesized_sources branch.
I ran the script on all results; here is the summarized output. As you can see, the DnnLocationKS is not too far away from ItdLocationKS: the mean error is in most cases 1° higher, but it can have large deviations (see max error), which seem to be smaller with ItdLocationKS. So I guess we should do some more work in order to improve it.
--------------------------------------------------------------------------------------
condition DnnLocationKS GmmLocationKS ItdLocationKS
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_linear.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 4 deg 5 deg 2 deg
Std absolute prediction error: 9 deg 7 deg 2 deg
Max absolute prediction error: 64 deg 31 deg 7 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 3 deg 8 deg 2 deg
Std absolute prediction error: 3 deg 11 deg 2 deg
Max absolute prediction error: 15 deg 60 deg 11 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 8 deg 10 deg 7 deg
Std absolute prediction error: 13 deg 15 deg 13 deg
Max absolute prediction error: 98 deg 98 deg 100 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 4 deg 12 deg 3 deg
Std absolute prediction error: 4 deg 18 deg 3 deg
Max absolute prediction error: 20 deg 67 deg 18 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 11 deg 14 deg 9 deg
Std absolute prediction error: 12 deg 16 deg 9 deg
Max absolute prediction error: 52 deg 67 deg 49 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_fs_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 15 deg 27 deg 14 deg
Std absolute prediction error: 17 deg 23 deg 12 deg
Max absolute prediction error: 75 deg 85 deg 53 deg
--------------------------------------------------------------------------------------
Hagen, I fixed a couple of small issues in DnnLocationKS that I know produced inferior performance in the past. Could you try to run your evaluation again?
Also I added input arguments to DnnLocationKS that can be used to specify a frequency range, e.g. you could add <Param Type="int">1400</Param> to consider only frequencies below 1400 Hz. Not sure if this would help though.
It also looks like the 5-deg azimuth resolution of DnnLocationKS and GmmLocationKS caused errors in many cases. I will see how to improve this.
I'm running the evaluation again at the moment, both with and without the 1400 Hz limit.
BTW, the evaluation is simply running the sfsLocalisationPrediction function; I added the calculation of mean etc. to this function.
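For reference, the summary statistics could be computed along these lines. This is only a sketch with made-up angles; wrapping the difference to [-180°, 180°) is my assumption and may not match what sfsLocalisationPrediction actually does.

```matlab
% Hypothetical perceived and predicted azimuths in degrees
perceived = [-2; -10; 170; -14];
predicted = [-4; -19; -175; -17];

% Wrap the signed difference to [-180, 180) so that 170 vs. -175 counts as 15 deg
delta = mod(predicted - perceived + 180, 360) - 180;
absErr = abs(delta);

meanErr = mean(absErr);  % mean absolute prediction error
stdErr = std(absErr);    % standard deviation of the absolute error
maxErr = max(absErr);    % maximum absolute prediction error

fprintf('Mean: %.1f deg, Std: %.1f deg, Max: %.1f deg\n', meanErr, stdErr, maxErr);
```

Without the wrapping step, front-back or rear estimates close to ±180° would produce errors of up to 360°, which would inflate the maximum error in particular.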
In order to check the influence of the 5-deg resolution, it would be cool if you could train both models (or one of them) with a resolution of 1 deg, and we can then just run it and compare the results.
Here are the results for using the whole frequency range. The biggest change is the reduction in maximum error for the very first condition; the rest looks similar to before:
--------------------------------------------------------------------------------------
condition DnnLocationKS GmmLocationKS ItdLocationKS
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_linear.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 5 deg 5 deg 2 deg
Std absolute prediction error: 5 deg 9 deg 2 deg
Max absolute prediction error: 20 deg 41 deg 9 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 4 deg 8 deg 2 deg
Std absolute prediction error: 4 deg 11 deg 2 deg
Max absolute prediction error: 20 deg 60 deg 12 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 9 deg 10 deg 7 deg
Std absolute prediction error: 13 deg 15 deg 13 deg
Max absolute prediction error: 83 deg 98 deg 101 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 5 deg 12 deg 3 deg
Std absolute prediction error: 5 deg 18 deg 3 deg
Max absolute prediction error: 18 deg 67 deg 18 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 10 deg 14 deg 9 deg
Std absolute prediction error: 11 deg 16 deg 9 deg
Max absolute prediction error: 52 deg 67 deg 48 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_fs_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 15 deg 27 deg 15 deg
Std absolute prediction error: 17 deg 24 deg 12 deg
Max absolute prediction error: 62 deg 85 deg 56 deg
--------------------------------------------------------------------------------------
Here are the results for DnnLocationKS when also using only frequencies up to 1400 Hz. Now the results are more or less identical to the ones from ItdLocationKS (at least for the mean overall values; I haven't looked into the details of single results yet):
--------------------------------------------------------------------------------------
condition DnnLocationKS GmmLocationKS ItdLocationKS
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_linear.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 3 deg 5 deg 2 deg
Std absolute prediction error: 2 deg 8 deg 2 deg
Max absolute prediction error: 9 deg 31 deg 7 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 3 deg 8 deg 2 deg
Std absolute prediction error: 3 deg 11 deg 2 deg
Max absolute prediction error: 20 deg 60 deg 11 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 8 deg 10 deg 7 deg
Std absolute prediction error: 12 deg 15 deg 13 deg
Max absolute prediction error: 93 deg 98 deg 99 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 3 deg 12 deg 3 deg
Std absolute prediction error: 4 deg 18 deg 3 deg
Max absolute prediction error: 19 deg 67 deg 17 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 9 deg 14 deg 9 deg
Std absolute prediction error: 8 deg 16 deg 9 deg
Max absolute prediction error: 44 deg 67 deg 49 deg
--------------------------------------------------------------------------------------
experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_fs_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error: 16 deg 27 deg 15 deg
Std absolute prediction error: 16 deg 23 deg 12 deg
Max absolute prediction error: 64 deg 85 deg 46 deg
--------------------------------------------------------------------------------------
GOAL: the idea is to have one common localisation knowledge source that works in most cases.
PROBLEM: for synthesized sources (e.g. WFS) the predictions of DnnLocationKS and GmtkLocationKS are a lot worse than with ItdLocationKS.
We should test what is the best way to improve the performance of DnnLocationKS. The general problem with synthesized sources is that they have a lot of wrong ITD and ILD cues, especially at higher frequencies, which are considered by DnnLocationKS but not by ItdLocationKS, which uses only ITDs for frequency channels below 1.4 kHz.
To test the model performance, pull the localise_synthesized_sources branch of the TWOEARS/examples repo, go to the qoe_localisation folder and run localisationWfsCircularPointSource.
/cc @ningma97 @Hardcorehobel