evancohen / sonus

:speech_balloon: /so.nus/ STT (speech to text) for Node with offline hotword detection
MIT License

handle arecord empty sound buffer loop after 2gig data #93

Closed sdetweil closed 5 years ago

sdetweil commented 5 years ago

fixes issue #68

evancohen commented 5 years ago

This all makes sense but it's going to take me a day to dig into it and test. Is there an easy way to trigger the empty sound buffer loop other than waiting for 2GB of data to be processed?

sdetweil commented 5 years ago

booo.. I HAD a debug version... use a flag to bypass waiting... use a timer to set the flag after 4-5 minutes..

I can rebuild and send to you
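A minimal sketch of what such a debug flag could look like, not the actual debug build mentioned above. The name `createDebugTrigger` and the injectable `schedule` parameter are hypothetical, introduced only for illustration; the idea is that a timer flips a flag, and the capture code treats the flag as "the empty-buffer condition has started".

```javascript
// Hypothetical helper: returns a function that reports whether the
// simulated failure should be active. `schedule` defaults to setTimeout
// but can be replaced in tests so nothing has to wait in real time.
function createDebugTrigger(delayMs, schedule = setTimeout) {
  let triggered = false;
  schedule(() => {
    triggered = true; // from now on, pretend arecord has hit its limit
  }, delayMs);
  return () => triggered;
}

// Example wiring (commented out so the snippet does not start a timer):
// const simulateFailure = createDebugTrigger(5 * 60 * 1000);
// stream.on('data', (chunk) => {
//   const data = simulateFailure() ? Buffer.alloc(0) : chunk;
//   /* hand `data` to the code under test */
// });
```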

sdetweil commented 5 years ago

i am working on the actual time test now.. 14 hours in

evancohen commented 5 years ago

I never knew that the hard limit was 2GB, which got me thinking. Technically that's the max size supported for .wav. Two things could be happening here (guessing): The file is getting split in the background and for some reason the "new" file isn't streaming audio correctly OR we're hitting the 2GB limit and the file isn't getting split, so we just keep getting the end of the file in the stream.

That means that there might be an elegant solution which doesn't require us to restart Sonus!
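For a rough sense of how long it takes to reach that limit, here is a back-of-the-envelope calculation. The capture settings (16 kHz, 16-bit, mono) are an assumption for illustration; they are not stated in this thread, so adjust them to whatever arecord is actually invoked with.

```javascript
// Assumed capture settings (not confirmed in this thread).
const SAMPLE_RATE_HZ = 16000;
const BYTES_PER_SAMPLE = 2; // 16-bit PCM
const CHANNELS = 1;
const LIMIT_BYTES = 2 ** 31; // 2 GiB

// Hours of continuous recording needed to produce `limitBytes` of PCM data.
function hoursUntilLimit(limitBytes, sampleRate, bytesPerSample, channels) {
  const bytesPerSecond = sampleRate * bytesPerSample * channels;
  return limitBytes / bytesPerSecond / 3600;
}

console.log(
  hoursUntilLimit(LIMIT_BYTES, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS).toFixed(1)
); // 18.6
```

Under those assumptions the limit is reached after roughly 18.6 hours, which is why a real-time test takes most of a day.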

There are two ways that I can think of to simulate this arecord behavior for testing:

1) Use the --max-file-time flag (by setting it to 30 seconds, for example).
2) Use -i for interactive and send "SIGUSR1", which will close the output file, open a new one, and continue recording.

I'm hoping that one of these will either make it super easy to replicate the behavior or solve the problem completely.

sdetweil commented 5 years ago

Ok, so the problem is in the capture routine here... https://github.com/bear24rw/alsa-utils/blob/master/aplay/aplay.c

do {
        /* open a file to write */
        if(!tostdout) {
            /* upon the second file we start the numbering scheme */
            if (filecount) {
                filecount = new_capture_file(orig_name, namebuf,
                                 sizeof(namebuf),
                                 filecount);
                name = namebuf;
            }

            /* open a new file */
            remove(name);
            if ((fd = open64(name, O_WRONLY | O_CREAT, 0644)) == -1) {
                perror(name);
                exit(EXIT_FAILURE);
            }
            filecount++;
        }

        rest = count;
        if (rest > fmt_rec_table[file_type].max_filesize)
            rest = fmt_rec_table[file_type].max_filesize;

        /* setup sample header */
        if (fmt_rec_table[file_type].start)
            fmt_rec_table[file_type].start(fd, rest);

        /* capture */
        fdcount = 0;
        while (rest > 0 && capture_stop == 0) {
            size_t c = (rest <= (off64_t)chunk_bytes) ?
                (size_t)rest : chunk_bytes;
            size_t f = c * 8 / bits_per_frame;
            if (pcm_read(audiobuf, f) != f)
                break;
            if (write(fd, audiobuf, c) != c) {
                perror(name);
                exit(EXIT_FAILURE);
            }
            count -= c;
            rest -= c;
            fdcount += c;
        }

        /* finish sample container */
        if (fmt_rec_table[file_type].end && !tostdout) {
            fmt_rec_table[file_type].end(fd);
            fd = -1;
        }

        /* repeat the loop when format is raw without timelimit or
         * requested counts of data are recorded
         */
    } while ( ((file_type == FORMAT_RAW && !timelimit) || count > 0) &&
        capture_stop == 0);

Basically, while streaming to stdout (filename = "-"), it doesn't open or close files; it just sends data. It creates the header for a block, adds data (if there is room in the file), sets the trailer for the block (4000 bytes), and sends the block. But at the size limit, whatever it is, there is no room for data, so it sends an empty data block. NOT SILENCE... sound with no data, and it's a loop...
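The counting behavior described here can be modeled in a few lines. This is a simplified sketch, not a port of aplay.c: `simulateCapture` and its parameters are hypothetical, the sizes are tiny so the effect is visible, and `maxOuterIterations` stands in for the outer `do ... while` condition staying true (an assumption based on the description above). It only tracks how many data bytes each pass would write.

```javascript
// Simplified model of the outer/inner capture loops.
// Returns the number of data bytes written on each outer pass.
function simulateCapture({ maxFileSize, chunkBytes, maxOuterIterations }) {
  let count = maxFileSize; // for stdout, count is clamped to max_filesize
  const bytesPerPass = [];
  let iterations = 0;
  do {
    let rest = Math.min(count, maxFileSize);
    let written = 0;
    while (rest > 0) {
      const c = Math.min(rest, chunkBytes);
      count -= c; // once count reaches 0, it is never replenished
      rest -= c;
      written += c;
    }
    bytesPerPass.push(written);
    iterations += 1;
  } while (iterations < maxOuterIterations); // assumed: loop keeps repeating
  return bytesPerPass;
}

console.log(simulateCapture({ maxFileSize: 10, chunkBytes: 4, maxOuterIterations: 4 }));
// [ 10, 0, 0, 0 ]
```

The first pass writes all of the data; every later pass starts with `rest` equal to 0, so it produces a block with no data in it.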

If you set the size smaller, as I read the code, it will end recording and arecord will die.

I don't think sig will work either, as I read the code. The fix I proposed was to NOT decrement if streaming to stdout:

/* only count down when writing to a file, not when streaming to stdout */
if (!tostdout) {
            count -= c;
            rest -= c;
}

signal handler is here
static void signal_handler(int sig)

looks like it only uses filesize for NOT streaming
/* write to stdout? */
if (!name || !strcmp(name, "-")) {
    fd = fileno(stdout);
    name = "stdout";
    tostdout=1;
    if (count > fmt_rec_table[file_type].max_filesize)
        count = fmt_rec_table[file_type].max_filesize;
}


It looks like the -d (interrupt after n seconds) parm will break the loop (-d sets the timelimit variable), but..
arecord will die, stdout will close, and then we have the same recovery to do...

But I suppose you could set a 12 hour time limit, let it die, capture that with record.on('end'....),
restart it as I do, and just not have the funky consecutive empty sound buffer detector.
You would still have to watch out for sigterm ending arecord, etc.

I thought mine was the safest... only recovers if we have trouble
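
For readers following along, here is a minimal sketch of the general shape of a consecutive-empty-buffer detector. It is not the code from this PR: `createEmptyBufferDetector` is a hypothetical name, and treating "empty" as a zero-length chunk is an assumption. Recovery only fires after `threshold` empty chunks in a row, so normal audio never triggers it.

```javascript
// Hypothetical detector: call the returned function with every chunk.
// `onStall` runs once when `threshold` consecutive empty chunks are seen;
// any non-empty chunk resets the counter.
function createEmptyBufferDetector(threshold, onStall) {
  let consecutiveEmpty = 0;
  return function inspect(chunk) {
    if (chunk.length === 0) {
      consecutiveEmpty += 1;
      if (consecutiveEmpty === threshold) {
        onStall(); // e.g. stop and restart the recorder
      }
    } else {
      consecutiveEmpty = 0;
    }
  };
}

// Example: three empty chunks in a row trigger the callback.
const inspectChunk = createEmptyBufferDetector(3, () => console.log('stalled'));
[Buffer.from([1, 2]), Buffer.alloc(0), Buffer.alloc(0), Buffer.alloc(0)]
  .forEach(inspectChunk);
```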

sdetweil commented 5 years ago

arecordhelper does it better