xiph / opus

Modern audio compression for the internet.
https://opus-codec.org/

Pop/click at end of stream, last samples get changed from 0 to audible #240

Open GregSlazinski opened 2 years ago

GregSlazinski commented 2 years ago

When I compress this sound (sound.zip), which is quiet at the end, it picks up noise after being converted to Opus.

The WAV ends exactly at 0, but the generated Opus file ends below zero. I used 96 kbps encoding.

How can I fix the problem?

Here's the original WAV: [image: wav]

After compressing to Opus: [image: opus]

silverbacknet commented 2 years ago

Opus doesn't guarantee starting or ending on exactly zero, nor does any major codec, though testing shows Opus is closer than AAC in this case. What it does guarantee, unlike most codecs, is that if you concatenate and continuously play two Opus streams, there will never be a click, garbage, or other discontinuity.

If you're setting the audio stream back to zero immediately after playing (which is the usual effect of closing the audio interface, even if you don't explicitly do it), or closing and reopening the Opus context for the next segment, then you have to create a short cross-fade filter yourself. It would be a good idea to put it at the beginning of each stream, too. You only need to touch a handful of samples to prevent clicks or pops.
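As a rough illustration of the short cross-fade described above, here is a minimal sketch in Python. It assumes the decoded audio is already available as a list of float samples for one channel; the function name `fade_edges` and the default `fade_len` of 64 samples are hypothetical choices for this example, not part of any Opus API.

```python
def fade_edges(samples, fade_len=64):
    """Return a copy of `samples` with a linear fade-in and fade-out.

    `samples` is assumed to be one channel of decoded PCM as floats.
    `fade_len` is a hypothetical default; only a handful of samples
    need to be touched to avoid a click.
    """
    out = list(samples)
    # Never let the two fades overlap on very short buffers.
    n = min(fade_len, len(out) // 2)
    for i in range(n):
        gain = i / n  # 0.0 at the outermost sample, rising toward 1.0 inward
        out[i] *= gain        # fade-in at the start
        out[-1 - i] *= gain   # fade-out at the end
    return out


# A decoded stream that ends slightly below zero instead of exactly at zero.
decoded = [-0.02] * 200
faded = fade_edges(decoded)
```

After the call, the first and last samples of `faded` are zero and the middle of the buffer is untouched, so switching the output back to silence no longer produces a step.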

GregSlazinski commented 2 years ago

Hi Emily, thanks for your reply.

I believe this is a major problem.

I never had an issue like that when using Vorbis. I converted the sound from my first post, and others with a similar problem, to Vorbis instead, and they work fine.

> then you have to create a short cross-fade filter yourself

No software does this. And it would add performance overhead for applications; in my case I'm a game developer making a game that already uses full CPU and GPU power.

Another problem: people often make audio files that are supposed to be looped, either ambient sounds or music. If you want to play such a sound in looped mode, the first and last samples need to match exactly to avoid any pops or clicks.

Can we add an encoder setting that would force the start of the audio frame, the end of the stream, or both to match the source data exactly? This could come at the cost of a slight loss of quality, and I'm OK with that, as it would only affect a few ms at the start and end of the song. Users would enable this setting only when processing the first frame, disable it for frames in the middle, and turn it on again for the last frame. Example setting names: Force_frame_start_to_match_source and Force_frame_end_to_match_source.

That would be the best way to solve the problem. No extra processing would be wasted during audio playback, because the audio files would already be correct.

xnorpx commented 2 years ago

I can't really hear it in the samples, and your picture is not clear. Can you paste a screenshot from Audacity and mark the section?

GregSlazinski commented 2 years ago

What exactly in my picture is not clear? I don't understand. The problem happens at the end of the audio, which is on the right side. In the WAV picture, the audio ends on the right side exactly at the center. But the generated Opus ends lower, not at the center. If you can't hear it, turn up the volume on your system; after the sound ends, there's a pop.

GregSlazinski commented 2 years ago

The audio difference at the end is very big, so maybe it's even a bug in the encoder. I would appreciate it if an Opus developer could investigate this. Thanks. The problem occurs even when encoding at 512 kbps, which should be very high quality. That suggests even more strongly that it's some kind of bug in Opus.

silverbacknet commented 2 years ago

I confirmed the issue myself before my first comment; it's definitely there. The difference is small, but in audio even a small difference is audible once amplified enough. I also extended the audio by another 40 ms of pure silence and encoded it, and while the audio continued trending toward zero, it didn't reach it.

It's not explicitly a bug in the encoder, though. That's how the format's designed to work, since it's designed to chain to more Opus frames at the end rather than being cut off to zero... although it would certainly be nice if the exact DC was targeted for silent frames, especially zero DC. Most software that needs it does crossfade it to zero.

I recommended handling the final few samples as a workaround since the codec has reached maturity and is rarely being updated anymore.

jmvalin commented 2 years ago

The issue is likely due to the Opus DC rejection filter. If the signal has a positive DC component, the encoder subtracts a constant from the signal to make it zero-mean. It just so happens that your signal gets back to exactly zero, which after the subtraction becomes negative.
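To make that effect concrete, here is a minimal sketch in Python using a generic one-pole DC blocker. This is an assumption for illustration only and is not Opus's actual filter; the function name `dc_block` and the coefficient `r=0.995` are hypothetical. It shows that when an input with a positive DC offset returns to exactly zero, the filtered output ends up below zero.

```python
def dc_block(samples, r=0.995):
    """Generic one-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    Illustrative stand-in only, NOT the filter Opus uses.
    `r` (hypothetical value) controls how slowly the filter adapts.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# Input sits at a positive DC level long enough for the filter to
# settle, then returns to exactly zero for the last few samples.
signal = [0.5] * 5000 + [0.0] * 10
filtered = dc_block(signal)

# While the input is constant, the output has decayed to roughly zero.
# Once the input drops to 0, the output swings negative and stays there
# for a while, even though the input itself is exactly zero.
print(signal[-1], filtered[-1])
```

The filter has effectively removed the offset from the steady part of the signal, so an input value of exactly zero is now below the filter's idea of the mean.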

GregSlazinski commented 2 years ago

But it causes artifacts. How can I disable this? As it stands, I can't use Opus and have to go back to Vorbis for the affected files.