TWOEARS / blackboard-system

Two!Ears Auditory Model - Blackboard system module
http://docs.twoears.eu/en/latest/blackboard/
GNU General Public License v2.0

Localisation of synthesized sources #2

Closed hagenw closed 8 years ago

hagenw commented 8 years ago

GOAL: the idea is to have one common localisation knowledge source that works in most cases.

PROBLEM: for synthesized sources (e.g. WFS) the predictions of DnnLocationKS and GmtkLocationKS are a lot worse than with ItdLocationKS.

We should test how best to improve the performance of DnnLocationKS. The general problem with synthesized sources is that they produce many wrong ITD and ILD cues, especially at higher frequencies. DnnLocationKS considers those frequencies, whereas ItdLocationKS uses only ITDs from frequency channels below 1.4 kHz.

To test the model performance, pull the localise_synthesized_sources branch of the TWOEARS/examples repo, go to the qoe_localisation folder, and run localisationWfsCircularPointSource.

/cc @ningma97 @Hardcorehobel

hagenw commented 8 years ago

One could also try to include the interaural coherence (IC) in order to look only at reliable cues, as the Dietz model does; see https://dev.qu.tu-berlin.de/issues/1819 for a discussion.

hagenw commented 8 years ago

I prepared an example that tries to predict all of our sound field synthesis localisation results. To get it, pull the localise_synthesized_sources branch in TWOEARS/examples. Then go to the folder qoe_localisation and run:

>> sfsLocalisationPrediction(2)

PROCESS HUMAN LABEL FILE: experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt

--------------------------------------------------------------------------------------------------------------
condition                    experiment      DnnLocationKS   GmmLocationKS   ItdLocationKS
--------------------------------------------------------------------------------------------------------------
wfs_nls14_X0.00_Y0.00_src_ps_xs0.00_ys2.50     -0 deg      -3 deg       2 deg       1 deg
wfs_nls14_X0.00_Y0.75_src_ps_xs0.00_ys2.50     -2 deg     141 deg      -4 deg      -1 deg
wfs_nls14_X0.00_Y-0.75_src_ps_xs0.00_ys2.50     0 deg      -2 deg      -2 deg       1 deg
wfs_nls14_X-0.25_Y0.00_src_ps_xs0.00_ys2.50    -2 deg       2 deg      -8 deg      -4 deg
wfs_nls14_X-0.25_Y0.75_src_ps_xs0.00_ys2.50     1 deg       6 deg      -9 deg      18 deg
wfs_nls14_X-0.25_Y-0.75_src_ps_xs0.00_ys2.50   -4 deg      16 deg      66 deg      -4 deg
wfs_nls14_X-0.50_Y0.00_src_ps_xs0.00_ys2.50   -10 deg     176 deg     -19 deg     -15 deg
wfs_nls14_X-0.50_Y0.75_src_ps_xs0.00_ys2.50   -14 deg     -17 deg     -17 deg     -15 deg
wfs_nls14_X-0.50_Y-0.75_src_ps_xs0.00_ys2.50   -8 deg     164 deg     164 deg      -9 deg
[...]

You can choose between six different experiments; see the function's help for more details. You can have a look at the figure showing the results: every row corresponds to the number you can pass to the sfsLocalisationPrediction function. The figure also includes the modelling results of the Dietz model for this data.

The results from ItdLocationKS should be similar to those from the Dietz model. So feel free to play around with DnnLocationKS and GmmLocationKS and see if you can improve the predictions.

As I said, the results for ItdLocationKS are quite good, as I only use ITD values below 1.4 kHz which seem to dominate localisation if present (and maybe reliable).

Note that head rotation is disabled in all cases at the moment. I tried it once and the results were even worse, but maybe I made a mistake:

>> sfsLocalisationPrediction(2)  % with head rotation

PROCESS HUMAN LABEL FILE: experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt

--------------------------------------------------------------------------------------------------------------
condition                    experiment      DnnLocationKS   GmmLocationKS   ItdLocationKS
--------------------------------------------------------------------------------------------------------------
wfs_nls14_X0.00_Y0.00_src_ps_xs0.00_ys2.50     -0 deg      -3 deg     140 deg       1 deg
wfs_nls14_X0.00_Y0.75_src_ps_xs0.00_ys2.50     -2 deg      64 deg     138 deg      -1 deg
wfs_nls14_X0.00_Y-0.75_src_ps_xs0.00_ys2.50     0 deg      -2 deg     168 deg       1 deg
wfs_nls14_X-0.25_Y0.00_src_ps_xs0.00_ys2.50    -2 deg       2 deg    -156 deg      -2 deg
wfs_nls14_X-0.25_Y0.75_src_ps_xs0.00_ys2.50     1 deg     118 deg     148 deg      18 deg
[...]

ningma97 commented 8 years ago

Hagen, I've started to look at this. Could you tell me what azimuth range the LookupTable used by ItdLocationKS considers?

ningma97 commented 8 years ago

I've updated the example to use the new models, and the results don't look too bad if you use MCT-DIFFUSE-FRONT. This preset forces the models to localise in the front hemifield only. There often seems to be a 3-deg shift. Were the signals created for a KEMAR head size?

>> sfsLocalisationPrediction(2)

PROCESS HUMAN LABEL FILE: experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt

--------------------------------------------------------------------------------------------------------------
condition                    experiment      DnnLocationKS   GmmLocationKS   ItdLocationKS
--------------------------------------------------------------------------------------------------------------
wfs_nls14_X0.00_Y0.00_src_ps_xs0.00_ys2.50     -0 deg      -3 deg      -3 deg       1 deg
wfs_nls14_X0.00_Y0.75_src_ps_xs0.00_ys2.50     -2 deg       1 deg       1 deg      -1 deg
wfs_nls14_X0.00_Y-0.75_src_ps_xs0.00_ys2.50     0 deg      -2 deg      -2 deg       1 deg
wfs_nls14_X-0.25_Y0.00_src_ps_xs0.00_ys2.50    -2 deg     -18 deg     -13 deg      -3 deg
wfs_nls14_X-0.25_Y0.75_src_ps_xs0.00_ys2.50     1 deg     -14 deg      -9 deg      15 deg
wfs_nls14_X-0.25_Y-0.75_src_ps_xs0.00_ys2.50   -4 deg      -4 deg      56 deg      -4 deg
wfs_nls14_X-0.50_Y0.00_src_ps_xs0.00_ys2.50   -10 deg      -4 deg     -24 deg     -15 deg
wfs_nls14_X-0.50_Y0.75_src_ps_xs0.00_ys2.50   -14 deg     -17 deg     -17 deg     -15 deg
wfs_nls14_X-0.50_Y-0.75_src_ps_xs0.00_ys2.50   -8 deg     -11 deg     -11 deg      -9 deg
wfs_nls14_X-0.75_Y0.00_src_ps_xs0.00_ys2.50   -18 deg     -15 deg     -25 deg     -17 deg
wfs_nls14_X-0.75_Y0.75_src_ps_xs0.00_ys2.50   -30 deg     -28 deg     -53 deg     -40 deg
wfs_nls14_X-0.75_Y-0.75_src_ps_xs0.00_ys2.50  -13 deg     -17 deg     -17 deg     -14 deg
wfs_nls14_X-1.00_Y0.00_src_ps_xs0.00_ys2.50   -25 deg     -26 deg     -41 deg     -24 deg
wfs_nls14_X-1.00_Y0.75_src_ps_xs0.00_ys2.50   -31 deg     -43 deg     -63 deg     -31 deg
wfs_nls14_X-1.00_Y-0.75_src_ps_xs0.00_ys2.50  -18 deg     -17 deg     -27 deg     -17 deg
wfs_nls14_X-1.25_Y0.00_src_ps_xs0.00_ys2.50   -26 deg     -30 deg     -35 deg     -29 deg
wfs_nls28_X0.00_Y0.00_src_ps_xs0.00_ys2.50      1 deg      -1 deg      -1 deg       1 deg
wfs_nls28_X0.00_Y0.75_src_ps_xs0.00_ys2.50     -1 deg       0 deg       0 deg       1 deg
wfs_nls28_X0.00_Y-0.75_src_ps_xs0.00_ys2.50    -0 deg       0 deg       0 deg       1 deg
wfs_nls28_X-0.25_Y0.00_src_ps_xs0.00_ys2.50    -3 deg     -11 deg      -6 deg      -5 deg
wfs_nls28_X-0.25_Y0.75_src_ps_xs0.00_ys2.50    -6 deg     -12 deg      -7 deg      -6 deg
wfs_nls28_X-0.25_Y-0.75_src_ps_xs0.00_ys2.50   -2 deg      -3 deg     -11 deg      -3 deg
wfs_nls28_X-0.50_Y0.00_src_ps_xs0.00_ys2.50   -15 deg     -10 deg     -10 deg     -10 deg
wfs_nls28_X-0.50_Y0.75_src_ps_xs0.00_ys2.50   -18 deg     -20 deg     -20 deg     -15 deg

hagenw commented 8 years ago

Hi Ning,

cool that you took a look at this.

1.) The lookup table used by ItdLocationKS is created in the range -90° to 90°. During the fitting and mapping process it could happen that larger values are returned. This is at the moment handled by applying phi(abs(phi)>95) = NaN; afterwards.

2.) All the signals were created using our QU_KEMAR_3m_anechoic.sofa dataset to model the loudspeakers of the WFS/NFC-HOA setups.

3.) For all WFS examples the restriction to -90° to 90° should be fine, but for NFC-HOA and new stuff that Fiete is testing it would be good to have the full 360° circle as it can happen that some stimuli are perceived from the back.

4.) I agree that the results from GmmLocationKS don't look too bad. Maybe as a next step I could run the models on all my data and calculate the mean prediction error. Could you push your changes to TWOEARS/examples, if you haven't done so yet?
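The masking step from point 1 can be illustrated in isolation. This is only a minimal sketch: the azimuth values are made up, and the median at the end is just one plausible way to summarise the remaining estimates, not necessarily what ItdLocationKS does afterwards.

```matlab
% Example azimuth estimates in degrees (made-up values for illustration)
phi = [-120 -95 -30 0 45 96 170];

% Discard estimates that fall outside the lookup table range of
% -90..90 deg plus a 5 deg margin, as in phi(abs(phi)>95) = NaN;
phi(abs(phi) > 95) = NaN;

% Keep only the valid estimates before summarising them
validPhi = phi(~isnan(phi));
medianPhi = median(validPhi);   % median of [-95 -30 0 45]
```

Note that estimates mapped to NaN are simply lost, so stimuli perceived from the back (point 3) cannot be represented this way.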

ningma97 commented 8 years ago

Just had a look at ItdLocationKS.execute() lines 67-73 and I noticed there is a bug. Here the size of phi is 50x16, while the size of ic is 16x50, due to the fact that transpose(itd) is passed to obj.itdToAngle().

            % Convert ITDs to azimuth angles
            phi = obj.itdToAngle(itd',lookupTable);
            % Calculate the median over time for every frequency channel of the azimuth
            for n = 1:size(phi,2)
                % Apply IC threshold, compare eq. 9 in Dietz (2011)
                idx = ic(:,n)>obj.icThreshold & [diff(ic(:,n))>0; 0];
                angle = phi(idx,n);

So inside the for-loop the indexing doesn't look right to me.
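A small stand-alone sketch of the mismatch and one possible fix, assuming the sizes reported above (phi is 50x16, ic is 16x50). The random data and the threshold value are hypothetical placeholders, and the actual fix in ItdLocationKS may look different.

```matlab
% Hypothetical stand-in data with the sizes reported above
nFrames = 50;
nChannels = 16;
rng(0);
phi = 90 * (2*rand(nFrames, nChannels) - 1);  % azimuth, frames x channels
ic = rand(nChannels, nFrames);                % coherence, channels x frames
icThreshold = 0.5;                            % placeholder, not the KS default

% Bring ic into the same frames x channels layout as phi
ic = ic';
assert(isequal(size(ic), size(phi)));

% Median azimuth over time for every frequency channel
medianAzimuth = nan(1, nChannels);
for n = 1:nChannels
    % Keep frames with high and rising coherence in channel n
    idx = ic(:,n) > icThreshold & [diff(ic(:,n)) > 0; false];
    medianAzimuth(n) = median(phi(idx,n));
end
```

Without the transpose, ic(:,n) would select all 16 channels of frame n instead of all 50 frames of channel n, and the resulting 16-element mask would silently index only the first 16 frames of phi(:,n).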

ningma97 commented 8 years ago

Yes, I've pushed the change into the localise_synthesized_sources branch.

ningma97 commented 8 years ago

I forgot to say I am using the master branch of TWOEARS/blackboard-system, since my new changes to DnnLocationKS and GmmLocationKS are applied there. Am I using the wrong branch?

hagenw commented 8 years ago

I switched the Config back to use the localise_synthesized_sources branch as this pull request is for that branch. I have applied all the changes from master to this one, so it should now work as well. And if we decide to make further modifications on the code, we could do it directly in the localise_synthesized_sources branch.

hagenw commented 8 years ago

I ran the script on all results; here is the summarized output. As you can see, DnnLocationKS is not too far away from ItdLocationKS: the mean error is in most cases 1° higher. However, it can show large deviations (see the max error), where ItdLocationKS does better. So I guess we should do some more work to improve it.

--------------------------------------------------------------------------------------
condition                                DnnLocationKS   GmmLocationKS   ItdLocationKS
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_linear.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                           4 deg     5 deg       2 deg
Std absolute prediction error:                            9 deg     7 deg       2 deg
Max absolute prediction error:                           64 deg    31 deg       7 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                           3 deg     8 deg       2 deg
Std absolute prediction error:                            3 deg    11 deg       2 deg
Max absolute prediction error:                           15 deg    60 deg      11 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                           8 deg    10 deg       7 deg
Std absolute prediction error:                           13 deg    15 deg      13 deg
Max absolute prediction error:                           98 deg    98 deg     100 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                           4 deg    12 deg       3 deg
Std absolute prediction error:                            4 deg    18 deg       3 deg
Max absolute prediction error:                           20 deg    67 deg      18 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                          11 deg    14 deg       9 deg
Std absolute prediction error:                           12 deg    16 deg       9 deg
Max absolute prediction error:                           52 deg    67 deg      49 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_fs_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                          15 deg    27 deg      14 deg
Std absolute prediction error:                           17 deg    23 deg      12 deg
Max absolute prediction error:                           75 deg    85 deg      53 deg
--------------------------------------------------------------------------------------

ningma97 commented 8 years ago

Hagen, I fixed a couple of small issues in DnnLocationKS that I know produced inferior performance in the past. Could you try to run your evaluation again?

I also added input arguments to DnnLocationKS that can be used to specify a frequency range, e.g. you could add <Param Type="int">1400</Param> to consider only frequencies below 1400 Hz. Not sure if this would help though.
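To make the idea of such a frequency limit concrete, here is a rough sketch of restricting a cue matrix to channels at or below 1400 Hz. The log-spaced centre frequencies, the variable names and the random cue matrix are all assumptions for illustration; DnnLocationKS gets its channels from the auditory front-end and may select them differently.

```matlab
% Hypothetical centre frequencies of 16 filterbank channels in Hz
% (log-spaced here; the real filterbank spacing may differ)
cfHz = logspace(log10(80), log10(8000), 16);

% Placeholder cue matrix, frames x channels
rng(0);
features = rand(50, numel(cfHz));

% Keep only the channels at or below the frequency limit
maxFreqHz = 1400;
lowIdx = cfHz <= maxFreqHz;
lowFeatures = features(:, lowIdx);
```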

It also looks like the 5-deg azimuth resolution of DnnLocationKS and GmmLocationKS caused errors in many cases. I will see how to improve this.

hagenw commented 8 years ago

I'm running the evaluation again at the moment, both with and without the 1400 Hz limit. BTW, the evaluation simply runs the sfsLocalisationPrediction function; I added the calculation of mean etc. to this function.
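For reference, the summary statistics can be sketched as follows. The values are the experiment and ItdLocationKS columns of the nine wfs_nls14 conditions from the first table in this thread; the wrap-around of the error to -180..180 deg is an assumption and may not match what sfsLocalisationPrediction does.

```matlab
% Perceived directions and ItdLocationKS predictions in degrees,
% taken from the first nine rows of the first table above
human = [0 -2 0 -2 1 -4 -10 -14 -8];
itdKs = [1 -1 1 -4 18 -4 -15 -15 -9];

% Signed error wrapped to the interval [-180, 180) (assumed convention)
err = mod(itdKs - human + 180, 360) - 180;
absErr = abs(err);

fprintf('Mean absolute prediction error: %5.1f deg\n', mean(absErr));
fprintf('Std absolute prediction error:  %5.1f deg\n', std(absErr));
fprintf('Max absolute prediction error:  %5.1f deg\n', max(absErr));
```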

To check the influence of the 5-deg resolution, it would be cool if you could train both models (or one of them) with a resolution of 1 deg, so we can just run them and compare the results.

hagenw commented 8 years ago

Here are the results when using the whole frequency range. The biggest change is the reduction in maximum error for the very first condition; the rest looks similar to before:

--------------------------------------------------------------------------------------
condition                                DnnLocationKS   GmmLocationKS   ItdLocationKS
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_linear.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    5 deg        5 deg       2 deg
Std absolute prediction error:                     5 deg        9 deg       2 deg
Max absolute prediction error:                    20 deg       41 deg       9 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    4 deg        8 deg       2 deg
Std absolute prediction error:                     4 deg       11 deg       2 deg
Max absolute prediction error:                    20 deg       60 deg      12 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    9 deg       10 deg       7 deg
Std absolute prediction error:                    13 deg       15 deg      13 deg
Max absolute prediction error:                    83 deg       98 deg     101 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    5 deg       12 deg       3 deg
Std absolute prediction error:                     5 deg       18 deg       3 deg
Max absolute prediction error:                    18 deg       67 deg      18 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                   10 deg       14 deg       9 deg
Std absolute prediction error:                    11 deg       16 deg       9 deg
Max absolute prediction error:                    52 deg       67 deg      48 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_fs_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                   15 deg       27 deg      15 deg
Std absolute prediction error:                    17 deg       24 deg      12 deg
Max absolute prediction error:                    62 deg       85 deg      56 deg
--------------------------------------------------------------------------------------

hagenw commented 8 years ago

Here are the results for DnnLocationKS when it also uses only frequencies up to 1400 Hz. Now the results are more or less identical to those from ItdLocationKS (at least for the overall mean values; I haven't looked into the details of single results yet):

--------------------------------------------------------------------------------------
condition                                DnnLocationKS   GmmLocationKS   ItdLocationKS
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_linear.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    3 deg        5 deg       2 deg
Std absolute prediction error:                     2 deg        8 deg       2 deg
Max absolute prediction error:                     9 deg       31 deg       7 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    3 deg        8 deg       2 deg
Std absolute prediction error:                     3 deg       11 deg       2 deg
Max absolute prediction error:                    20 deg       60 deg      11 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_ps_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    8 deg       10 deg       7 deg
Std absolute prediction error:                    12 deg       15 deg      13 deg
Max absolute prediction error:                    93 deg       98 deg      99 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    3 deg       12 deg       3 deg
Std absolute prediction error:                     4 deg       18 deg       3 deg
Max absolute prediction error:                    19 deg       67 deg      17 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_nfchoa_pw_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                    9 deg       14 deg       9 deg
Std absolute prediction error:                     8 deg       16 deg       9 deg
Max absolute prediction error:                    44 deg       67 deg      49 deg
--------------------------------------------------------------------------------------

experiments/2013-11-01_sfs_localisation/human_label_localization_wfs_fs_circular.txt
--------------------------------------------------------------------------------------
Mean absolute prediction error:                   16 deg       27 deg      15 deg
Std absolute prediction error:                    16 deg       23 deg      12 deg
Max absolute prediction error:                    64 deg       85 deg      46 deg
--------------------------------------------------------------------------------------