xiph / flac

Free Lossless Audio Codec
https://xiph.org/flac/
GNU Free Documentation License v1.3

Seeking Multichannel Ogg FLAC Files Does Not Handle Run-Length Encoding Correctly. #262

Closed smith3390s closed 2 years ago

smith3390s commented 2 years ago

I am working on a recording/playback system that is designed to encode and decode Vorbis, SPEEX, and FLAC. This is a multichannel system: typically 8 channels, but it could be up to 24 or 32. The system relies on the Xiph.org libraries for the Ogg container services and the encode/decode libraries. I am using the latest release of libFLAC, 1.3.2. At this time, recording and playback are working great for SPEEX.

The use-case for this system is to record speech from one or more microphones and to log an event when a microphone becomes active. Usually, only one microphone is active at a time but there are rare occasions when speakers interrupt each other. The event log is then available during playback to instantly seek to a specific speaker event. This works as expected when the encoding is SPEEX.

When playing a FLAC encoded file, seeking results in what sounds like unexpected 'dead' time, 'lost' or 'dropped' audio sections, or 'overlapping' audio. My guess is that since there is typically only one channel with active audio at a time, FLAC encodes using run-length encoding for the 'empty' channels and this results in some irregular frames in the data stream. And I'm guessing that the seek methods are not designed to handle this. As an experiment, I created a test recording with all channels always active – sections of speech interspersed with ‘silence’ in the form of a very low level noise signal. This created a multi-channel, multi-event recording that I could then play back and seeking worked correctly.

I recently pulled the latest FLAC source and rebuilt the libraries. I still encounter the same seeking issues when sections of channel audio are encoded using run-length encoding.

ktmf01 commented 2 years ago

Without any examples this will be rather hard to debug.

> When playing a FLAC encoded file, seeking results in what sounds like unexpected 'dead' time, 'lost' or 'dropped' audio sections, or 'overlapping' audio.

What do you mean by this? The audio playback is distorted? What player are you using?

> I recently pulled the latest FLAC source and rebuilt the libraries. I still encounter the same seeking issues when sections of channel audio are encoded using run-length encoding.

With what player are you using these FLAC sources?

In other words: how can this be reproduced?

wader commented 2 years ago

"When playing a FLAC encoded file": is that FLAC in Ogg or "raw" FLAC? I've experienced issues with ffmpeg in some very rare cases where quiet samples (lots of unary-encoded zeroes result in lots of bits set) end up looking like FLAC sync headers.

And I agree with @ktmf01: some way of reproducing would be nice.
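
To illustrate the kind of false positive described above, here is a minimal Python sketch. It relies on one fact from the FLAC format: a frame header begins with the byte 0xFF followed by 0xF8 (fixed block size) or 0xF9 (variable block size). The function name and the sample payload are hypothetical, and this is not how ffmpeg or libFLAC locate frames; a real decoder also validates the rest of the header, including its CRC-8, before trusting a match.

```python
def find_sync_candidates(data: bytes) -> list[int]:
    """Return byte offsets where two bytes match the FLAC frame sync pattern.

    A match is only a candidate: the same byte pair can also occur inside
    compressed audio data, which is the false positive discussed above.
    """
    offsets = []
    for i in range(len(data) - 1):
        if data[i] == 0xFF and data[i + 1] in (0xF8, 0xF9):
            offsets.append(i)
    return offsets


# Hypothetical payload: a run of set bits inside audio data, not a real header.
payload = bytes([0x12, 0xFF, 0xFF, 0xF8, 0x00, 0x34])
candidates = find_sync_candidates(payload)
```

A parser that accepts every candidate without further checks would treat offset 2 in this payload as the start of a frame.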

smith3390s commented 2 years ago

@ktmf01 @wader re Reproducibility: I agree. This will be very important unless I'm to solve this myself. I am developing my own multichannel Ogg FLAC recorder and player. I use the libFishsound API because I also support SPEEX and VORBIS codecs. Unfortunately, I don't know of any open source multichannel players. Please let me know otherwise. I spent some time yesterday recording a 2 channel example that can be used to demonstrate the problem with VLC. I have this working and can share if you think that would help.

Here I will describe the process I used to create the 2 channel example. Test File 1 - Sections of audio interspersed with sections of digital 'silence'.

  1. In Reaper or some other DAW create a project with 2 audio tracks.
  2. Import 2-second duration wav files of varying audio. I used 4 different 2-second duration sine-waves each of a different frequency (A-440, 880, etc.). You could also use unique snippets of speech or music. You just need something that will allow you to anticipate what SHOULD be playing back and WHEN during seek operations.
  3. Place these 2-second snippets so that each channel alternates as being 'active'. In other words: @0 seconds leave 2 seconds of silence in channel 1 and channel 2; @2 seconds insert A-440 snippet in channel 1 and leave 2 seconds of silence in channel 2; @4 seconds leave 2 seconds of silence in channel 1 and insert A-440 snippet in channel 2; @6 seconds insert A-880 snippet in channel 1 and leave 2 seconds of silence in channel 2 ... and so-on ... to produce 30 seconds of playable 2-channel audio.
  4. Encode this audio to an Ogg FLAC file. Each channel will be encoded as a separate mono stream.
  5. Open the resultant file in VLC. Only one channel will play back because the streams are mono encoded. Notice that the play head is non-responsive during periods of silence and that seeking into these periods of silence is unpredictable.
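
For readers without a DAW, steps 1 to 3 above can be approximated with a short script. This is a sketch under stated assumptions, not the procedure used in the post: the 44.1 kHz sample rate, 16-bit depth, the two higher tone frequencies, and the output filename are all assumed, and the Ogg FLAC encoding in step 4 is left to whatever encoder you use.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100      # assumed; the post does not state a sample rate
SEGMENT_SECONDS = 2
TOTAL_SECONDS = 30
# The post names A-440 and 880; the remaining two frequencies are assumptions.
TONES_HZ = [440.0, 880.0, 1320.0, 1760.0]


def build_schedule():
    """Return one (ch1_freq, ch2_freq) pair per segment; None means silence."""
    schedule = [(None, None)]  # segment at 0 s: both channels silent
    for k in range(1, TOTAL_SECONDS // SEGMENT_SECONDS):
        tone = TONES_HZ[((k - 1) // 2) % len(TONES_HZ)]
        # Odd segments put the tone in channel 1, even segments in channel 2.
        schedule.append((tone, None) if k % 2 == 1 else (None, tone))
    return schedule


def sample(freq, n):
    """Return one 16-bit sample: exact zero for silence, else a half-scale sine."""
    if freq is None:
        return 0
    return int(0.5 * 32767 * math.sin(2 * math.pi * freq * n / SAMPLE_RATE))


def write_test_wav(path):
    """Write the 2-channel, 16-bit test signal to a WAV file."""
    seg_len = SAMPLE_RATE * SEGMENT_SECONDS
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        for f1, f2 in build_schedule():
            frames = bytearray()
            for n in range(seg_len):
                frames += struct.pack("<hh", sample(f1, n), sample(f2, n))
            wav.writeframes(bytes(frames))


if __name__ == "__main__":
    write_test_wav("test_file_1.wav")  # hypothetical filename
```

The silent segments here are exact zeros, which is what makes this file different from Test File 2.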

Test File 2 - Sections of audio interspersed with sections of very low-level, high-frequency signal (noise will do).

  1. In Reaper or some other DAW create a project with 2 audio tracks.
  2. Import 2-second duration wav files of varying audio. I used 4 different 2-second duration sine-waves each of a different frequency (A-440, 880, etc.). You could also use unique snippets of speech or music. You just need something that will allow you to anticipate what SHOULD be playing back and WHEN during seek operations. For this example I am importing 2 seconds of white noise @-100dB for the 'silence' sections.
  3. Place these 2-second snippets so that each channel alternates as being 'active'. In other words: @0 seconds insert 2 seconds of 'silent noise' in channel 1 and channel 2; @2 seconds insert A-440 snippet in channel 1 and insert 2 seconds of 'silent noise' in channel 2; @4 seconds insert 2 seconds of 'silent noise' in channel 1 and insert A-440 snippet in channel 2; @6 seconds insert A-880 snippet in channel 1 and insert 2 seconds of 'silent noise' in channel 2 ... and so-on ... to produce 30 seconds of playable 2-channel audio.
  4. Encode this audio to an Ogg FLAC file. Each channel will be encoded as a separate mono stream.
  5. Open the resultant file in VLC. Only one channel will play back because the streams are mono encoded. Notice that the play head behaves normally and accurate seeking to any point in the duration is possible.
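
The low-level filler used in Test File 2 can be sketched as follows. The post does not state a bit depth; this sketch assumes 24-bit samples, because at 16 bits a peak of -100 dB relative to full scale is smaller than half a quantization step and would round to exact zeros. The function names, the uniform noise distribution, and the seed are hypothetical choices for illustration.

```python
import random


def dbfs_to_amplitude(dbfs: float, bits: int) -> float:
    """Peak amplitude, in integer sample units, for a level in dB re full scale."""
    full_scale = 2 ** (bits - 1) - 1
    return full_scale * 10 ** (dbfs / 20)


def noise_segment(n_samples: int, dbfs: float = -100.0,
                  bits: int = 24, seed: int = 0) -> list[int]:
    """Uniform white noise at the given peak level, rounded to integer samples."""
    rng = random.Random(seed)
    peak = dbfs_to_amplitude(dbfs, bits)
    return [round(rng.uniform(-peak, peak)) for _ in range(n_samples)]


# At 24 bits the filler varies from sample to sample; at 16 bits it rounds to zeros.
filler_24 = noise_segment(1000, bits=24)
filler_16 = noise_segment(1000, bits=16)
```

If your own filler comes out as all zeros, the file will behave like Test File 1 rather than Test File 2.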

ktmf01 commented 2 years ago

Here are two copyright-free files: one with the problem, one without it.

I'm pretty sure this is a player problem that is not linked to libFLAC. For example, FLACwithsilence.flac has seeking problems in VLC, Windows Media Player and Firefox, but not in Chrome, foobar2000 or my Android phone. Android uses libFLAC directly, so it doesn't seem to be a problem with libFLAC. So, this should be a bug report with VLC I think, unless you have the problem through libfishsound as well, but then I'd need more details.

Please be aware that libfishsound hasn't been updated for 12 years now. It might not be the best basis for writing new software.

Finally, I'd like to note that this issue is closely linked to #90. Solutions mentioned there are applicable here too: the problems disappear when a FLAC file has no CONSTANT subframes.
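
As background on CONSTANT subframes, the following Python sketch shows when a block qualifies and how small the result is. It uses two facts from the FLAC format: a CONSTANT subframe stores a single sample value for the whole block, and a subframe header occupies 8 bits when no wasted bits are signalled. The function names and the 4096-sample block size are assumptions for illustration. The sketch does not model what #90 or #264 change; it only shows why frames covering digital silence can be far smaller than their neighbours.

```python
def is_constant_block(samples: list[int]) -> bool:
    """True when every sample in the block has the same value."""
    return len(samples) > 0 and all(s == samples[0] for s in samples)


def constant_subframe_bits(bits_per_sample: int) -> int:
    """8-bit subframe header plus one stored sample (no wasted bits assumed)."""
    return 8 + bits_per_sample


def verbatim_subframe_bits(bits_per_sample: int, block_size: int) -> int:
    """8-bit subframe header plus every sample stored uncompressed."""
    return 8 + bits_per_sample * block_size


BLOCK_SIZE = 4096  # assumed block size for this illustration

silent_block = [0] * BLOCK_SIZE
noisy_block = [(-1) ** n for n in range(BLOCK_SIZE)]  # alternating +1 / -1

silent_bits = constant_subframe_bits(16)
upper_bound_bits = verbatim_subframe_bits(16, BLOCK_SIZE)
```

Even a one-step alternation such as `noisy_block` is enough to rule out a CONSTANT subframe, which matches the observation that the low-level noise file seeks normally.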

ktmf01 commented 2 years ago

@smith3390s Would you mind compiling the code under #264 and trying it for your use case?

smith3390s commented 2 years ago

@ktmf01 I have had time to do some quick tests and it looks promising. I'll do some more rigorous testing tomorrow and let you know the results.

smith3390s commented 2 years ago

@ktmf01 I have reviewed your code changes under #264, pulled the code, built it, and have been testing it against my use cases. The reasoning behind the changes makes sense and it does appear to solve the issue.