[Question] Hashing Audio Data?

H0tCh0colat3 commented 6 months ago

Is it possible to use this library to extract audio data and use it to compute a hash?

What I'm trying to accomplish

I'm building a personal app which tracks my music library. I need to be able to locate a music file if it's moved to a new location on disk. To that end, I was thinking I could calculate "hashes" for the songs and use them for comparison purposes. I think the best approach for this would be to extract the audio data from a file into a memory stream and use it to compute a hash. Is there some way I can use FFmpegSourceStream to get a byte array of only the audio data?

What I tried

var stream1 = new FFmpegSourceStream(fileInfo1);
var arrayLength = ??? //how should I get/calculate this value? Seems to be determined internally using private properties.
var holder = new byte[arrayLength];
var res = stream1.Read(holder, 0, arrayLength);

I'm not actually sure the Read method even returns what I'm looking for. Is this an appropriate usage?

Can I use a fingerprint?

It's my understanding that a fingerprint will match similar songs, however I'd like this to be an exact match, i.e., two different recordings of the same source, or two different releases of a song across two different albums should not be considered the same. Is it possible to set up the fingerprinting in such a way that only the exact same recording or version of a song produces a match?

Why can't I just hash the whole file?

A song's filename and metadata (artist, album, etc) may be modified, thus making a hash of the full file useless.

protyposis commented 6 months ago

Yes, this is possible.

Is there some way I can use FFmpegSourceStream to get a byte array of only the audio data?

You need to call Read multiple times in a loop, until it no longer returns data (a single read call will likely not return the whole audio, unless the file is very short). Example:

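// SHA256 lives in System.Security.Cryptography.
using (var stream = new FFmpegSourceStream(fileInfo))
using (var sha256 = SHA256.Create())
{
    var buffer = new byte[4096];
    int bytesRead;

    // Read returns the number of bytes actually delivered; 0 signals the end of the stream.
    while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Feed only the bytes that were read into the running hash.
        sha256.TransformBlock(buffer, 0, bytesRead, null, 0);
    }

    // Finalize with an empty block; the digest is available in Hash afterwards.
    sha256.TransformFinalBlock(buffer, 0, 0);

    var sha256HashBytes = sha256.Hash;
}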
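The loop above can be wrapped in a small helper that returns the hash as a hex string, which is convenient to store and compare. This is only a sketch: `AudioHasher` and `HashToHex` are hypothetical names, not part of Aurio. Since this thread does not state whether FFmpegSourceStream derives from System.IO.Stream, the helper takes a read delegate, so any object with a `Read(byte[], int, int)` method can be passed as `stream.Read`.

```csharp
using System;
using System.Security.Cryptography;

public static class AudioHasher // hypothetical helper, not part of Aurio
{
    // Pulls data through the given read function until it returns 0,
    // feeds each chunk into SHA-256, and returns the digest as uppercase hex.
    public static string HashToHex(Func<byte[], int, int, int> read, int bufferSize = 4096)
    {
        using (var sha256 = SHA256.Create())
        {
            var buffer = new byte[bufferSize];
            int bytesRead;

            while ((bytesRead = read(buffer, 0, buffer.Length)) > 0)
            {
                sha256.TransformBlock(buffer, 0, bytesRead, null, 0);
            }

            sha256.TransformFinalBlock(buffer, 0, 0);

            // BitConverter yields "AB-CD-..."; strip the dashes for a compact key.
            return BitConverter.ToString(sha256.Hash).Replace("-", "");
        }
    }
}
```

Usage would look like `var key = AudioHasher.HashToHex(stream.Read);`. The buffer size only affects how many Read calls are made, not the resulting hash.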

The full length of the stream is simply stream.Length. You still need to read the data in a loop, but there's a helper method which does it for you:

var rawAudioData = new byte[stream.Length];
StreamUtil.ForceRead(stream, rawAudioData, 0, rawAudioData.Length);

This approach of obtaining the length up front is not recommended when reading compressed audio formats, because the length in such cases might only be an estimate, and you may miss a few samples at the end.
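If the raw bytes are needed in memory and the length cannot be trusted, one option is to collect the chunks in a growing buffer until Read returns 0. The following is a minimal sketch under that assumption; `StreamReading` and `ReadToEnd` are hypothetical names, not part of Aurio, and the read delegate is used only so the example does not depend on the stream's base type.

```csharp
using System;
using System.IO;

public static class StreamReading // hypothetical helper, not part of Aurio
{
    // Collects everything the read function delivers, without consulting
    // a length up front, and returns it as a single byte array.
    public static byte[] ReadToEnd(Func<byte[], int, int, int> read, int bufferSize = 4096)
    {
        using (var collected = new MemoryStream())
        {
            var buffer = new byte[bufferSize];
            int bytesRead;

            while ((bytesRead = read(buffer, 0, buffer.Length)) > 0)
            {
                // Append only the bytes that were actually read.
                collected.Write(buffer, 0, bytesRead);
            }

            return collected.ToArray();
        }
    }
}
```

Usage would look like `var rawAudioData = StreamReading.ReadToEnd(stream.Read);`. Decoded audio can be large, so for hashing alone the incremental loop is the lighter choice because it never holds the whole song in memory.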

Is it possible to set up the fingerprinting in such a way that only the exact same recording or version of a song produces a match?

That's how the implemented fingerprints behave. You'll get similar hashes if you compare the lossless version of a song with its lossy compressed counterpart (e.g., FLAC vs. MP3), or different lossy compressions (e.g., OGG vs. AAC), or if you compare an original file with a re-released "remastered version" (when it's just EQ'd or dynamically compressed). You'll likely get different hashes if it's an actual re-recording. Likewise, you'll get different hashes for different live recordings of the same song by the same band.

A song's filename and metadata (artist, album, etc) may be modified, thus making a hash of the full file useless.

Another advantage is that audio files in different lossless formats will result in the same hash, e.g., a PCM wave and a FLAC file (in fact, a FLAC fingerprint is exactly that: an MD5 hash of the uncompressed audio data).

H0tCh0colat3 commented 6 months ago

I see. Needing to call Read multiple times was the insight I was missing. Thank you for the incredibly quick response. This solves my issue.

protyposis / Aurio