QutEcoacoustics / audio-analysis

The audio analysis code (AnalysisPrograms.exe) for the QUT Ecoacoustics Research Group
https://ap.qut.ecoacoustics.info/
Apache License 2.0

The `MaxFormantGap` parameter is broken or needs clarification #471

Closed atruskie closed 3 years ago

atruskie commented 3 years ago

Actual behaviour:

The documentation for the HarmonicAlgorithm states:

https://github.com/QutEcoacoustics/audio-analysis/blob/ce858e61fea2ae669f4e805d37e8966fdb5602a5/docs/technical/apidoc/HarmonicParameters.md#L42-L47

But I've found that the MaxFormantGap parameter must be set to values much larger than the actual gap between formants for the algorithm to detect anything.

Expected behavior:

Clarify the intended use of MaxFormantGap in the documentation, or treat this as a bug and ensure the code works as advertised.

How to reproduce this bug:

  1. Create a generic recognizer that uses a harmonic algorithm detector
  2. Apply it to a recording that contains a harmonic event
  3. Examine detection results when MaxFormantGap is set appropriately versus when it is set to approximately the total bandwidth of the target event.

Additional Details

Config excerpt:


    Bark: !HarmonicParameters
        <<: *common_parameters

        # min and max of the freq band to search
        MinHertz: 350
        MaxHertz: 2700
        MinDuration: 0.1
        MaxDuration: 0.3
        MinFormantGap: 150
        MaxFormantGap: 2400
        DctThreshold: 0.15
        # Scan the frequency band at these thresholds
        DecibelThresholds:
          - 3.0
          - 6.0
          - 9.0

Note that the harmonic detector works, but MaxFormantGap is set to a value that is much larger than the true max formant gap (which would be about 300 Hz).
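
For scale, the numbers in this note can be checked directly against the config excerpt. The snippet below is only a sanity check of the arithmetic; the 300 Hz figure is the approximate true gap quoted above, not a measured value, and the variable names are illustrative.

```python
# Values copied from the config excerpt above.
min_hertz = 350
max_hertz = 2700
max_formant_gap = 2400  # the setting at which detection works

# Approximate true gap between formants, as estimated in the note above.
approx_true_gap = 300

bandwidth = max_hertz - min_hertz          # width of the searched band in Hz
ratio = max_formant_gap / approx_true_gap  # how far the setting overshoots

print(f"search bandwidth: {bandwidth} Hz")
print(f"MaxFormantGap is {ratio:.0f}x the approximate true gap")
print(f"MaxFormantGap exceeds the search bandwidth: {max_formant_gap > bandwidth}")
```

So the working value is not just loose: it is larger than the whole band being searched.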

atruskie commented 3 years ago

Full config file: https://github.com/QutEcoacoustics/audio-analysis/blob/2ffce6fe834b2833afb7807c13929d1bf2eb7b45/src/AnalysisConfigFiles/RecognizerConfigFiles/Truskinger.PetaurusBreviceps.yml#L69

ksh23_1766_510096_20171102_170621_30_0_3054_3084_0-0.5min.wav
kmu16_1817_507136_20171029_103134_30_0_161_191_0-0.5min.wav

towsey commented 3 years ago

I believe I have fixed the problem identified in this issue. The problem lay in determining the maximum DCT coefficient.
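
For readers unfamiliar with how a DCT peak relates to a formant gap, here is a minimal sketch. It is not the AnalysisPrograms implementation and does not reproduce the fix. It assumes a plain DCT-II over the frequency bins of a single frame, and uses the general property that coefficient k corresponds to k/2 cycles across the band, so the implied gap is roughly 2 × bandwidth / k. The function names `dct2` and `estimate_formant_gap`, the 128-bin frame, and the 2400 Hz band are all hypothetical.

```python
import math


def dct2(x):
    """Unnormalised DCT-II of a sequence (pure Python, O(N^2))."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]


def estimate_formant_gap(spectrum, bandwidth_hz, min_gap_hz, max_gap_hz):
    """Return (gap_hz, k) for the strongest DCT coefficient whose implied
    gap lies within [min_gap_hz, max_gap_hz], or None if no index qualifies."""
    coeffs = dct2(spectrum)
    # Coefficient k spans k/2 cycles across the band: gap = 2 * bandwidth / k.
    candidates = [
        k for k in range(1, len(coeffs))
        if min_gap_hz <= 2 * bandwidth_hz / k <= max_gap_hz
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda k: abs(coeffs[k]))
    return 2 * bandwidth_hz / best, best


# Synthetic frame (assumption): 128 bins across a 2400 Hz band,
# with a spectral ripple that repeats every 300 Hz (DCT index 16).
n_bins, bandwidth = 128, 2400.0
ripple_index = 16  # 2 * 2400 / 300
spectrum = [
    math.cos(math.pi / n_bins * (i + 0.5) * ripple_index) for i in range(n_bins)
]

result = estimate_formant_gap(spectrum, bandwidth, min_gap_hz=150, max_gap_hz=300)
gap, k = result
print(k, gap)
```

In this sketch, gap limits of 150 Hz and 300 Hz restrict the search to DCT indices 16 through 32, whereas a maximum gap of 2400 Hz would admit indices from 2 upward. Whichever coefficient is selected as the maximum therefore determines whether the gap limits accept or reject the frame.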