AddictedCS / soundfingerprinting

Open source audio fingerprinting in .NET. An efficient algorithm for acoustic fingerprinting written purely in C#.
https://emysound.com
MIT License

Using Fingerprints as start and end triggers for recording #222

Open. SchroeerM opened this issue 2 months ago

SchroeerM commented 2 months ago

Hey, I am trying to use the library to create multiple fingerprints from prerecorded files and find matches in a stream. This works well, but I couldn't figure out how to access the plain stream from the AVQueryResult in order to start or stop recording. Additionally, I would be very interested to know whether you see a way to keep a FIFO buffer of approx. 10 seconds, so that the recording can start exactly where the fingerprint matches. Is there a way to use your solution for my needs, or do I have to switch to an alternative? Thanks, Michael

AddictedCS commented 1 month ago

The following steps will help you with the problem you are trying to solve:

  1. You can instantiate RealtimeQueryCommand with an InterceptAVTrack callback that captures each AVTrack sent to the query command.
  2. AVTrack contains AudioSamples (float[]), which are the downsampled audio samples that were used to generate the fingerprints. By saving all the samples in a FIFO buffer, you capture the audio that is used to query the data storage.
  3. Inside config.SuccessCallback, when the match finishes, you receive the matches, which you can use to locate the corresponding media stored in the FIFO buffer.
  4. Resolve the media and save it as a 5512 Hz mono WAV file for playback.
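The FIFO buffer from steps 2 and 3 could be sketched roughly as below. This is a minimal illustration, not part of SoundFingerprinting: the class name SampleFifoBuffer and its members are hypothetical, and it assumes every intercepted chunk contains only new samples (if consecutive chunks overlap, the overlap would have to be trimmed before writing).

```csharp
using System;

// Hypothetical helper (not part of SoundFingerprinting): a fixed-capacity
// FIFO buffer of audio samples that overwrites the oldest samples first.
public sealed class SampleFifoBuffer
{
    private readonly float[] buffer;
    private int writeIndex;     // next position to write to
    private int count;          // number of valid samples currently stored
    private long totalWritten;  // total number of samples ever written

    public SampleFifoBuffer(int sampleRate, double seconds)
    {
        if (sampleRate <= 0) throw new ArgumentOutOfRangeException(nameof(sampleRate));
        if (seconds <= 0) throw new ArgumentOutOfRangeException(nameof(seconds));
        buffer = new float[(int)Math.Ceiling(sampleRate * seconds)];
    }

    public int Capacity => buffer.Length;
    public int Count => count;
    public long TotalWritten => totalWritten;

    // Appends a chunk of samples, overwriting the oldest ones when full.
    public void Write(float[] samples)
    {
        foreach (float sample in samples)
        {
            buffer[writeIndex] = sample;
            writeIndex = (writeIndex + 1) % buffer.Length;
        }
        count = (int)Math.Min((long)buffer.Length, (long)count + samples.Length);
        totalWritten += samples.Length;
    }

    // Returns the most recent `length` samples (fewer if not available), oldest first.
    public float[] ReadLast(int length)
    {
        length = Math.Max(0, Math.Min(length, count));
        var result = new float[length];
        int start = ((writeIndex - length) % buffer.Length + buffer.Length) % buffer.Length;
        for (int i = 0; i < length; i++)
        {
            result[i] = buffer[(start + i) % buffer.Length];
        }
        return result;
    }
}
```

With `new SampleFifoBuffer(5512, 10)` the buffer holds 55,120 samples, i.e. roughly the last 10 seconds of 5512 Hz mono audio. How a match position maps to an offset in this buffer depends on the timing information in the query result, which is not covered here.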

A high level code sample:

    QueryCommandBuilder
        .Instance
        .BuildRealtimeQueryCommand()
        .From(source)
        .WithRealtimeQueryConfig(config => config.SuccessCallback = result => /* resolve FIFO buffer AVSamples */)
        .InterceptAVTrack(avTrack => avTrack.Audio.Samples /* save AV track into FIFO buffer */)
        .UsingServices(modelService)
        .Query()
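For step 4, writing the resolved samples to a WAV file could look roughly like the sketch below. It uses only the .NET base class library; the WavWriter class and WriteMono16Bit method are hypothetical names, and the code assumes the float samples are normalized to the range [-1, 1] and that 16-bit PCM output is acceptable.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not part of SoundFingerprinting): writes float samples
// as a 16-bit PCM mono WAV file. Assumes samples are normalized to [-1, 1].
public static class WavWriter
{
    public static void WriteMono16Bit(Stream output, float[] samples, int sampleRate = 5512)
    {
        const short channels = 1;
        const short bitsPerSample = 16;
        int blockAlign = channels * bitsPerSample / 8;  // bytes per sample frame
        int dataSize = samples.Length * blockAlign;

        // BinaryWriter writes integers in little-endian order, as RIFF requires.
        using var writer = new BinaryWriter(output, Encoding.ASCII, leaveOpen: true);

        writer.Write(Encoding.ASCII.GetBytes("RIFF"));
        writer.Write(36 + dataSize);                    // remaining file size
        writer.Write(Encoding.ASCII.GetBytes("WAVE"));

        writer.Write(Encoding.ASCII.GetBytes("fmt "));
        writer.Write(16);                               // size of the fmt chunk
        writer.Write((short)1);                         // audio format: PCM
        writer.Write(channels);
        writer.Write(sampleRate);
        writer.Write(sampleRate * blockAlign);          // byte rate
        writer.Write((short)blockAlign);
        writer.Write(bitsPerSample);

        writer.Write(Encoding.ASCII.GetBytes("data"));
        writer.Write(dataSize);
        foreach (float sample in samples)
        {
            // Clamp to [-1, 1] before scaling to the 16-bit range.
            float clamped = Math.Max(-1f, Math.Min(1f, sample));
            writer.Write((short)Math.Round(clamped * short.MaxValue));
        }
    }
}
```

A call such as `using var file = File.Create("match.wav"); WavWriter.WriteMono16Bit(file, samples);` would then produce a file that standard audio players can open, at the same 5512 Hz mono quality as the fingerprinted audio.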

Hope it makes sense.

SchroeerM commented 1 month ago

Thanks for your answer. Does that mean I can only access the stream in 5512 Hz mono quality? I have reworked my software to go the other way around. Currently I detect silence in the stream and use it as the start/stop trigger for recording. I then analyse every saved file against my fingerprints, and if there is a match I export the file to a different folder. The stream I work on has long stretches of silence between the interesting parts, so that seemed to be the better approach for me.
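The silence-based trigger described above might be sketched as follows. This is only an illustration of the general idea, not the code actually used: the SilenceGate class is hypothetical, and both the RMS threshold and the number of silent chunks required to stop are assumed values that would need tuning for the real stream.

```csharp
using System;

// Hypothetical sketch of a silence-based start/stop trigger.
// Threshold and chunk count are assumptions, not values from the thread.
public sealed class SilenceGate
{
    private readonly double rmsThreshold;
    private readonly int silentChunksToStop;
    private int silentChunks;

    public bool IsRecording { get; private set; }

    public SilenceGate(double rmsThreshold = 0.01, int silentChunksToStop = 5)
    {
        if (silentChunksToStop < 1) throw new ArgumentOutOfRangeException(nameof(silentChunksToStop));
        this.rmsThreshold = rmsThreshold;
        this.silentChunksToStop = silentChunksToStop;
    }

    // Root mean square level of a chunk; 0 for an empty chunk.
    public static double Rms(float[] chunk)
    {
        if (chunk.Length == 0) return 0;
        double sum = 0;
        foreach (float s in chunk) sum += (double)s * s;
        return Math.Sqrt(sum / chunk.Length);
    }

    // Feeds one chunk and returns true when the recording state changed:
    // a loud chunk starts recording, enough consecutive silent chunks stop it.
    public bool Process(float[] chunk)
    {
        bool loud = Rms(chunk) >= rmsThreshold;
        if (loud)
        {
            silentChunks = 0;
            if (!IsRecording)
            {
                IsRecording = true;
                return true;
            }
            return false;
        }

        if (IsRecording && ++silentChunks >= silentChunksToStop)
        {
            IsRecording = false;
            silentChunks = 0;
            return true;
        }
        return false;
    }
}
```

Requiring several consecutive silent chunks before stopping avoids cutting a recording at short pauses inside an interesting part. Because the gate works on the original stream rather than on the downsampled fingerprinting samples, the saved files keep the stream's full quality.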