benbiles commented 5 years ago

I successfully built the app from source, but the WAV files are all 44 bytes. Is the record function working in the latest source?

Also, what scale / unit does the mu setting use? The config I started from has 65mm distance between mics:

mic 1 mu = ( -0.0405, +0.0000, +0.0000 );

mic 2 mu = ( +0.0000, +0.0405, +0.0000 );

I'd like to set the mics 20mm apart. Or does mu only describe the position relative to the center?

    # Microphone 1
    { 
        mu = ( -0.0100, +0.0000, +0.0000 ); 
        sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
        direction = ( +0.000, +0.000, +1.000 );
        angle = ( 80.0, 90.0 );
    },

    # Microphone 2
    { 
        mu = ( +0.0000, +0.0100, +0.0000 ); 
        sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
        direction = ( +0.000, +0.000, +1.000 );
        angle = ( 80.0, 90.0 );
    },

    # Microphone 3
    { 
        mu = ( +0.0100, +0.0000, +0.0000 ); 
        sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
        direction = ( +0.000, +0.000, +1.000 );
        angle = ( 80.0, 90.0 );
    },

    # Microphone 4
    { 
        mu = ( +0.0000, -0.0100, +0.0000 ); 
        sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
        direction = ( +0.000, +0.000, +1.000 );
        angle = ( 80.0, 90.0 );
    }

GodCed commented 5 years ago

By latest source, do you mean master or the latest release? The latest release had a bug in certain locales, which is fixed in master. The symptom was the "no such file or directory" error you experienced.

Regardless, the recording function does work. Are you sure you configured your ODAS sinks according to the instructions in the README? It seems recording starts but no audio is received, hence the 44 bytes of WAV file header.
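To see why a file that received no audio ends up at exactly 44 bytes, here is a minimal sketch using Python's standard `wave` module. It is not the ODAS Studio recorder code; the helper name `empty_wav_size` and the default parameters are only illustrative assumptions. It writes a PCM WAV with zero frames and measures the result.

```python
import io
import wave

def empty_wav_size(channels=1, sample_width=2, rate=16000):
    """Return the size in bytes of a PCM WAV file that holds a header but no frames."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"")            # no samples: only the header gets written
    return len(buf.getvalue())

print(empty_wav_size())  # prints 44
```

The size does not depend on the channel count or sample rate, so a 44-byte file always means "header only, no samples".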

The mu setting for the microphones is in meters. It is the microphone position relative to an origin you choose. The origin can be anywhere, but it must be consistent among microphones. The origins I've seen used most often are Mic 1 and the geometric center of the microphone array.
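As a quick check of what a set of mu values implies, the sketch below computes the distance between every pair of microphones. It assumes the layout used in the configs in this thread (four mics on the x and y axes, origin at the array center); the helper names `pairwise_distances_mm` and `cross_layout` are made up for illustration.

```python
import math
from itertools import combinations

def pairwise_distances_mm(mics):
    """Return {(i, j): distance in mm} for every pair of 1-based mic indices."""
    return {
        (i + 1, j + 1): 1000.0 * math.dist(mics[i], mics[j])
        for i, j in combinations(range(len(mics)), 2)
    }

def cross_layout(opposite_spacing_m):
    """mu values (meters) for mics 1..4 at -x, +y, +x, -y, origin at the center."""
    r = opposite_spacing_m / 2.0
    return [(-r, 0.0, 0.0), (0.0, r, 0.0), (r, 0.0, 0.0), (0.0, -r, 0.0)]

template = cross_layout(0.081)        # gives mu = +/-0.0405, as in the template
d = pairwise_distances_mm(template)
print(round(d[(1, 3)], 1))            # opposite mics (1 and 3), in mm
print(round(d[(1, 2)], 1))            # adjacent mics (1 and 2), in mm
```

With mu = ±0.0405 the opposite mics come out 81 mm apart and adjacent mics about 57.3 mm apart, so "distance between mics" depends on which pair is meant. Under the same assumption, `cross_layout(0.020)` yields mu = ±0.0100 for mics that sit 20 mm from the opposite mic.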

benbiles commented 5 years ago

I just did git clone https://github.com/introlab/odas_web.git

my config....


# Configuration file for bbbox 4 mic array

version = "2.1";

# Raw

raw: 
{

    fS = 16000;
    hopSize = 64;
    nBits = 16;
    nChannels = 4; 

    # Input with raw signal from microphones
    interface: {
        type = "soundcard";
        card = 3;
        device = 0;
    }

}

# Mapping

mapping:
{

    map: (1, 2, 3, 4);

}

# General

general:
{

    epsilon = 1E-20;

    size: 
    {
        hopSize = 64;
        frameSize = 128;
    };

    samplerate:
    {
        mu = 16000;
        sigma2 = 0.01;
    };

    speedofsound:
    {
        mu = 343.0;
        sigma2 = 25.0;
    };

    mics = (

        # Microphone 1  // mics 60mm apart conf
        { 
            mu = ( -0.0300, +0.0000, +0.0000 ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 80.0, 90.0 );
        },

        # Microphone 2
        { 
            mu = ( +0.0000, +0.0300, +0.0000 ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 80.0, 90.0 );
        },

        # Microphone 3
        { 
            mu = ( +0.0300, +0.0000, +0.0000 ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 80.0, 90.0 );
        },

        # Microphone 4
        { 
            mu = ( +0.0000, -0.0300, +0.0000 ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 80.0, 90.0 );
        }

    );

    # Spatial filters to include only a range of direction if required
    # (may be useful to remove false detections from the floor, or
    # limit the space search to a restricted region)
    spatialfilters = (

        {

            direction = ( +0.000, +0.000, +1.000 );
            angle = (80.0, 90.0);  

        }

    );  

    nThetas = 181;
    gainMin = 0.25;

};

# Stationary noise estimation

sne:
{

    b = 3;
    alphaS = 0.1;
    L = 150;
    delta = 3.0;
    alphaD = 0.1;

}

# Sound Source Localization

ssl:
{

    nPots = 4;
    nMatches = 10;
    probMin = 0.5;
    nRefinedLevels = 1;
    interpRate = 4;

    # Number of scans: level is the resolution of the sphere
    # and delta is the size of the maximum sliding window
    # (delta = -1 means the size is automatically computed)
    scans = (
        { level = 2; delta = -1; },
        { level = 4; delta = -1; }
    );

    # Output to export potential sources
    potential: {

        # format = "undefined";
        format = "json";

        interface: {
            # type = "blackhole";
            type = "socket"; ip = "127.0.0.1"; port = 9001;
        };
    };

};

# Sound Source Tracking

sst:
{  

    # Mode is either "kalman" or "particle"

    mode = "particle";

    # Add is either "static" or "dynamic"

    add = "dynamic";

    # Parameters used by both the Kalman and particle filter

    active = (
        { weight = 1.0; mu = 0.3; sigma2 = 0.0025 }
    );

    inactive = (
        { weight = 1.0; mu = 0.15; sigma2 = 0.0025 }
    );

    sigmaR2_prob = 0.0025;
    sigmaR2_active = 0.0225;
    sigmaR2_target = 0.0025;
    Pfalse = 0.1;
    Pnew = 0.1;
    Ptrack = 0.8;

    theta_new = 0.9;
    N_prob = 5;
    theta_prob = 0.8;
    N_inactive = ( 150, 200, 250, 250 );
    theta_inactive = 0.9;

    # Parameters used by the Kalman filter only

    kalman: {

        sigmaQ = 0.001;

    };

    # Parameters used by the particle filter only

    particle: {

        nParticles = 1000;
        st_alpha = 2.0;
        st_beta = 0.04;
        st_ratio = 0.5;
        ve_alpha = 0.05;
        ve_beta = 0.2;
        ve_ratio = 0.3;
        ac_alpha = 0.5;
        ac_beta = 0.2;
        ac_ratio = 0.2;
        Nmin = 0.7;

    };

    target: ();

    # Output to export tracked sources
    tracked: {

        # format = "undefined";
        format = "json";

        interface: {
            # type = "blackhole";
            type = "socket"; ip = "127.0.0.1"; port = 9000;
        };

    };

}

sss:
{

    # Mode is either "dds", "dgss" or "dmvdr"

    mode_sep = "dds";
    mode_pf = "ms";

    gain_sep = 1.0;
    gain_pf = 10.0;

    dds: {

    };

    dgss: {

        mu = 0.01;
        lambda = 0.5;

    };

    dmvdr: {

    };

    ms: {

        alphaPmin = 0.07;
        eta = 0.5;
        alphaZ = 0.8;        
        thetaWin = 0.3;
        alphaWin = 0.3;
        maxAbsenceProb = 0.9;
        Gmin = 0.01;
        winSizeLocal = 3;
        winSizeGlobal = 23;
        winSizeFrame = 256;

    };

    ss: {

        Gmin = 0.01;
        Gmid = 0.9;
        Gslope = 10.0;

    }

    separated: {

        fS = 16000;
        hopSize = 64;
        nBits = 16;        

        interface: {
            type = "file";
            path = "separated.raw";
        }        

    };

    postfiltered: {

        fS = 16000;
        hopSize = 64;
        nBits = 16;        

        interface: {
            type = "file";
            path = "postfiltered.raw";
        }        

    };

}

classify:
{

    frameSize = 1024;
    winSize = 3;
    tauMin = 32;
    tauMax = 200;
    deltaTauMax = 7;
    alpha = 0.3;
    gamma = 0.05;
    phiMin = 0.15;
    r0 = 0.2;    

    category: {

        format = "undefined";

        interface: {
            type = "blackhole";
        }

    }

}

The direction finding works really well, but like you say, there is probably something wrong in the record settings?

The config template I was using described mics 81mm apart where the actual hardware is 65mm so that confused me a bit :)

benbiles commented 5 years ago

I just noticed I get these messages in the terminal;

Recorder 1 started
Recorder 1 started
Recorder 1 was false active
Recorder 1 was false active
Recorder 2 started
Recorder 2 started
Recorder 2 was false active
Recorder 2 was false active
Recorder 2 ended
Recorder 2 ended
Registering header on recorder 2
Registering header on recorder 2
Registered header on recorder 2
Registered header on recorder 2
Recorder 2 undefined
Recorder 2 undefined
Recorder 1 ended
Recorder 1 ended
Registering header on recorder 1
Registering header on recorder 1
Registered header on recorder 1
Recorder 1 undefined
Registered header on recorder 1
Recorder 1 undefined
Recorder 1 started
Recorder 1 started
Recorder 1 was false active
Recorder 1 was false active
Recorder 1 ended
Recorder 1 ended
Registering header on recorder 1
Registering header on recorder 1
Registered header on recorder 1
Registered header on recorder 1
Recorder 1 undefined
Recorder 1 undefined
Recorder 1 started
Recorder 1 started
Recorder 1 was false active
Recorder 1 was false active
Recorder 2 started
Recorder 2 was false active
Recorder 2 started
Recorder 2 was false active
Recorder 2 ended
Recorder 2 ended
Registering header on recorder 2
Registering header on recorder 2
Registered header on recorder 2
Registered header on recorder 2
Recorder 2 undefined
Recorder 2 undefined
Recorder 1 ended
Recorder 1 ended
Registering header on recorder 1
Registering header on recorder 1
Registered header on recorder 1
Registered header on recorder 1
Recorder 1 undefined
Recorder 1 undefined
Recorder 1 started
Recorder 1 was false active
Recorder 1 started
Recorder 1 was false active
Recorder 1 ended
Recorder 1 ended
Registering header on recorder 1
Registering header on recorder 1
Registered header on recorder 1
Recorder 1 undefined
Registered header on recorder 1
Recorder 1 undefined

GodCed commented 5 years ago

Yes, your sinks for the separated audio are misconfigured. The interface type is defined as file instead of socket. You should see various WAV files in your ODAS folder, close to the compiled executable.

Please see the ODAS Studio documentation here to configure your sinks properly. You'll want to take a look at the SSS section.
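With the `file` interface, the config posted above sends the separated and post-filtered streams to `separated.raw` and `postfiltered.raw` as headerless PCM. If you want to listen to such a file, a rough sketch like the following wraps raw PCM bytes in a WAV container with Python's standard `wave` module. The sample rate (16000) and sample width (16 bits) come from the posted config; the channel count is an assumption you need to match to your setup, and `raw_to_wav` is a hypothetical helper name.

```python
import io
import wave

def raw_to_wav(raw_pcm, channels, rate=16000, sample_width=2):
    """Wrap headerless little-endian PCM bytes in a WAV container; return WAV bytes."""
    frame_size = channels * sample_width
    usable = len(raw_pcm) - (len(raw_pcm) % frame_size)  # drop a trailing partial frame
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(raw_pcm[:usable])
    return buf.getvalue()

# Usage sketch (channel count of 4 is an assumption, not taken from the thread):
# with open("separated.raw", "rb") as f:
#     wav_bytes = raw_to_wav(f.read(), channels=4)
# with open("separated.wav", "wb") as f:
#     f.write(wav_bytes)
```

This only helps for inspecting the file output; it does not replace the socket sinks that ODAS Studio needs for its own recordings.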

The console output you see is debugging information regarding the state of the WAV files recorder. Each recorder is attached to a tracked source. So each time a new source appears, you'll see the following cycle:

Recorder N started (tracking info received);
Recorder N was false active (recorder was not already recording. Should never happen but this was a problem some time ago);
Recorder N ended (source disappeared from tracking);
Registered header on recorder N (WAV file header was written);
Recorder N undefined (recorder is cleared because the record task is finished).
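To check whether a terminal dump follows this cycle, the messages can be grouped per recorder. The sketch below is only a log-reading aid based on the message wording quoted in this thread (including the "Registering header" line that appears in the terminal output); the pattern and the helper name `events_by_recorder` are assumptions, not part of ODAS Studio.

```python
import re
from collections import defaultdict

# Matches the recorder messages as they appear in the pasted terminal output.
MESSAGE = re.compile(
    r"Recorder (\d+) (started|was false active|ended|undefined)"
    r"|(Registering|Registered) header on recorder (\d+)"
)

def events_by_recorder(log_text):
    """Group recorder messages by recorder number, preserving their order."""
    events = defaultdict(list)
    for m in MESSAGE.finditer(log_text):
        if m.group(1) is not None:
            events[int(m.group(1))].append(m.group(2))
        else:
            events[int(m.group(4))].append(m.group(3).lower() + " header")
    return dict(events)

sample = (
    "Recorder 1 started Recorder 1 was false active Recorder 1 ended "
    "Registering header on recorder 1 Registered header on recorder 1 "
    "Recorder 1 undefined"
)
print(events_by_recorder(sample))
```

Because the pattern does not rely on line breaks, it also works on output where the messages have been run together on one line.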

benbiles commented 5 years ago

Thanks so much for your help! It's all working great with my own 60x60mm 4 channel array :) I'm interested in adding some kind of voice print recognition so that individual speakers can be identified in a room. I guess it would involve some kind of voice training. Any ideas? I suppose I should learn JSON so I can point the voice identification software to the ODAS server. Anyway, thanks for the help. This was not a bug, so I closed the issue.
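As a starting point for reading the tracked-source output, here is a small sketch that parses one JSON message and converts each source direction to azimuth and elevation. The message shape used here (a `timeStamp` plus a `src` list with `id`, `tag`, `x`, `y`, `z`, `activity`, where `id` 0 marks an unused slot) is an assumption about the ODAS JSON format, so check it against what your setup actually sends on port 9000. `active_sources` is a hypothetical helper name.

```python
import json
import math

def active_sources(message):
    """Return (id, azimuth_deg, elevation_deg) for each tracked source in one message."""
    frame = json.loads(message)
    result = []
    for src in frame.get("src", []):
        if src.get("id", 0) == 0:  # assumed convention: id 0 means an empty slot
            continue
        x, y, z = src["x"], src["y"], src["z"]
        azimuth = math.degrees(math.atan2(y, x))
        elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
        result.append((src["id"], azimuth, elevation))
    return result

# Hand-written sample message, not captured from a real ODAS run.
sample = (
    '{"timeStamp": 1, "src": ['
    '{"id": 7, "tag": "dynamic", "x": 0.0, "y": 1.0, "z": 0.0, "activity": 0.9},'
    '{"id": 0, "tag": "", "x": 0.0, "y": 0.0, "z": 0.0, "activity": 0.0}]}'
)
print(active_sources(sample))
```

A speaker-identification program could use the source `id` to match each separated audio channel with the direction it came from.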

GodCed commented 5 years ago

Hi, I'm glad to hear everything is working properly now.

For speaker recognition, the ODAS creator from IntRoLab actually developed a system that does exactly that during his master's degree. It takes separated and post-filtered audio as inputs, then outputs the corresponding ids and confidence levels. However, it was developed before ODAS, so I'm not sure it will work right out of the box.

The project is open source and available for Matlab and Octave on the IntRoLab website. You'll also find an article describing the system's inner workings on the website.

benbiles commented 5 years ago

I found https://github.com/introlab/WISS and yes, it's in .m Matlab code. I managed to write a voice detection program based on WISS, but I still have to run the initial voice scan in Octave.

introlab / odas_web

record error #24
