TWOEARS / blackboard-system

Two!Ears Auditory Model - Blackboard system module
http://docs.twoears.eu/en/latest/blackboard/
GNU General Public License v2.0

GmmLocationKS returns NaN #14

Closed by hagenw 8 years ago

hagenw commented 8 years ago

In the master branch of TWOEARS/examples go to the folder localisation_GMMs and run the following:

>> localise

-------------------------------------------------------------------------
Source direction   GmmLocationKS w head rot.   GmmLocationKS wo head rot.
-------------------------------------------------------------------------
        0               NaN                       -180
      -52               NaN                       -135
     -131               NaN                       -135
        0               NaN                       -180
       30               NaN                         30
      -30               NaN                        -30
------------------------------------------------------------------------

There should be no NaN values.

hagenw commented 8 years ago

If I run the same command in the remove_gmtk branch of TWOEARS/examples I get:

>> localise

-------------------------------------------------------------------------
Source direction   GmmLocationKS w head rot.   GmmLocationKS wo head rot.
-------------------------------------------------------------------------
        0                 0                       -180
      -52               -55                        -93
     -131              -140                       -140
        0                 0                       -180
       30                30                         30
      -30               -30                        -30
------------------------------------------------------------------------

Does this mean GmmLocationKS is trained for 16 kHz? This is the only setting that changed between the two branches. If 16 kHz is required, could you please add an error or warning message to GmmLocationKS for the case where this requirement is not fulfilled?
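The kind of check requested here could look roughly like the sketch below. It is written in Python purely for illustration (the project itself is MATLAB) and assumes that 16 kHz does turn out to be the required rate. `check_sample_rate`, `EXPECTED_FS_HZ`, and the message text are hypothetical names, not part of GmmLocationKS.

```python
import warnings

# Assumption: the rate the model expects; not confirmed at this point in the thread.
EXPECTED_FS_HZ = 16000


def check_sample_rate(fs_hz, expected_fs_hz=EXPECTED_FS_HZ, strict=False):
    """Return True if fs_hz matches the expected rate.

    On a mismatch, raise ValueError when strict is True; otherwise emit a
    warning and return False.
    """
    if fs_hz == expected_fs_hz:
        return True
    message = (
        f"input sampling rate is {fs_hz} Hz, "
        f"but the model expects {expected_fs_hz} Hz"
    )
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return False
```

Whether a mismatch should be a hard error or only a warning depends on how sensitive the results are to the sampling rate, so the sketch leaves that choice to the caller via `strict`.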

ningma97 commented 8 years ago

I noticed this as well. It happens because the second 0.5-s block is still not valid. If you slightly increase the stimulus length (from 1 s to 1.1 s), as is done in the remove_gmtk branch, it works OK.

It looks like Ivo's fix for signal retrieval is still not 100% right, but I couldn't work out what's wrong.

I also noticed that if you perform localisation only once, say you run just the 4th case (0-deg source angle) and quit, the signal looks slightly different from the one you get for the same source angle when localisation is performed for all angles one after another. Perhaps some clean-up is not being done properly.

hagenw commented 8 years ago

Ah, OK. I just had the same problem when testing things for the Kopco scenario (1 s vs. 1.1 s). I created a new issue for this: #15.

Your last point, the difference between running it once and running it in a loop, has popped up before, and it was somehow considered a feature, not a bug. But I'm not so sure we shouldn't change this. Maybe you could create a new issue for it and provide a minimal example for testing.

Coming back to the original question: does GmmLocationKS expect 16 kHz or 44.1 kHz?

ningma97 commented 8 years ago

It's trained on 16 kHz signals, but I don't think it matters hugely if it is tested on 44.1 kHz, since it only uses ITDs and ILDs.

(Reply by email to Hagen Wierstorf's comment above, dated 3 March 2016, 15:33.)

Hardcorehobel commented 8 years ago

I agree. We should ensure that the DNN-based system always gets the cross-correlation function with the correct number of lags (independent of the input sampling frequency), either by resampling the input or by down-sampling the cross-correlation function.

hagenw commented 8 years ago

I tested it, and the results for 44.1 kHz and 16 kHz input signals are indeed identical. This means GmmLocationKS uses only cues from low-frequency channels that are not affected by the resampling, correct?

ivo--t commented 8 years ago

@ningma97, can you please explain in more detail the statement "This is due to the fact that the second 0.5-sec block is still not valid"? Like, is it too short? Too long? Does it include "silence" from after the end of the sound you're putting through the binaural simulator?

ivo--t commented 8 years ago

@ningma97 For your suspicion with the clean-up code, can you create a new issue and describe the steps to recreate this? Thanks :).

ivo--t commented 8 years ago

I cannot reproduce your first bug, @hagenw and @ningma97. I ran localise in localisation_GMMs on the master branch with LengthOfSimulation set to 1 (not 1.1) and got no NaNs at all. Furthermore, I debugged the run in GmmLocationKS and checked which blocks you get from my getNextSignalBlock: on the first call you get 1:50, on the next call 51:100, just as you should.

The result is:

-------------------------------------------------------------------------
Source direction   GmmLocationKS w head rot.   GmmLocationKS wo head rot.
-------------------------------------------------------------------------
        0                 0                       -180
      -52               -50                       -140
     -131              -140                       -138
        0                 0                       -180
       30                30                         30
      -30               -30                        -30
------------------------------------------------------------------------

I use the most up-to-date master of binSim, AFE, and BlackboardSystem (didn't want to abbreviate BS ^^).
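The block ranges quoted above (1:50, then 51:100) are consecutive, non-overlapping blocks of 50 frames. A small Python sketch of that indexing, for illustration only: `block_range` is a hypothetical helper, not the actual getNextSignalBlock, and the block size of 50 frames is simply read off the two ranges.

```python
def block_range(call_index, frames_per_block=50):
    """Return the 1-based, inclusive (first, last) frame of a block.

    call_index counts calls starting at 1, so the result matches
    MATLAB-style ranges such as 1:50 and 51:100.
    """
    if call_index < 1:
        raise ValueError("call_index must be >= 1")
    first = (call_index - 1) * frames_per_block + 1
    last = call_index * frames_per_block
    return first, last
```

With these assumptions, the first call maps to frames 1 to 50 and the second to frames 51 to 100, so a signal has to provide at least 100 frames before a second full block is available.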

hagenw commented 8 years ago

The things discussed here regarding the sampling rate should be continued at #10.

@ivo--t you are absolutely correct, once again I forgot to pull blackboard-system. Now I'm able to run the demo with a time of 1.0 as well. NaN pops up when you lower the time to 0.92 or below. Is this OK, or should it in theory work for values down to around 0.51?

ivo--t commented 8 years ago

No, it shouldn't work with such low values, because WITH head rotation the KS needs two prediction iterations, and thus two blocks. @ningma97, correct me if I'm wrong.

hagenw commented 8 years ago

That sounds reasonable. GmmLocationKS without head rotation fails only for signal lengths < 0.46.