pdeljanov / Symphonia

Pure Rust multimedia format demuxing, tag reading, and audio decoding library
Mozilla Public License 2.0
2.39k stars 140 forks source link

mp3 playing twice #51

Closed probablykasper closed 3 years ago

probablykasper commented 3 years ago

I have a few mp3 files like this, where it plays the file and then plays it again. The file is 2:17, but it's played for about 4:34: Kill Paris feat. Big Gigantic & Jimi Tents - Fizzy Lifting Drink.mp3.zip

I'm guessing there's something about how it's encoded that causes Symphonia to trip up, but no idea what that could be.

Cargo.toml

[package]
name = "mp3test"
version = "0.1.0"
edition = "2018"

[dependencies]
rodio = { git = "https://github.com/RustAudio/rodio", rev = "d40551d", features = ["symphonia-mp3"] }

main.rs

use rodio::{Decoder, OutputStream, Sink};
use std::fs::File;
use std::io::BufReader;
use std::sync::mpsc::channel;
use std::sync::mpsc::TryRecvError;
use std::thread;
use std::time::{Duration, Instant};

fn main() {
    let path = "Kill Paris feat. Big Gigantic & Jimi Tents - Fizzy Lifting Drink.mp3";
    let file = File::open(path).unwrap();
    let buf = BufReader::new(file);

    let decoder = Decoder::new(buf).unwrap();

    let output_stream = OutputStream::try_default();
    let (_stream, handle) = output_stream.unwrap();
    let sink = Sink::try_new(&handle).unwrap();

    sink.append(decoder);
    let (send, recv) = channel();
    thread::spawn(move || {
        sink.set_volume(0.25);
        let start = Instant::now();
        sink.play();
        sink.sleep_until_end();
        let dur = start.elapsed().as_secs_f64();
        println!("Ended, took {:.3}s", dur);
        send.send("End").unwrap();
    });

    let start = Instant::now();
    loop {
        match recv.try_recv() {
            Ok("End") => break,
            Err(TryRecvError::Empty) => {}
            _ => panic!(),
        };
        let dur = start.elapsed().as_secs_f64();
        println!("\u{23f1}  {:.3}s", dur);
        thread::sleep(Duration::from_millis(1000));
    }
}
pdeljanov commented 3 years ago

Given the file size of 12.1 MB and the bitrate of 320kbps, the duration should be around 5:16 (give or take considering tag overhead). Indeed, this is what ffplay reports, and it also plays the audio twice over as well. Same for VLC and MPV.

So I don't think this is a bug with Symphonia, but maybe we can handle this case better since atleast ffmpeg warns about it.

probablykasper commented 3 years ago

@pdeljanov Hmm weird. the ffplay warning is useful:

[mp3 @ 0x7fd3a4822600] invalid concatenated file detected - using bitrate for duration
[mp3 @ 0x7fd3a4822600] Estimating duration from bitrate, this may be inaccurate

QuickTime Player just plays 2:18, which is why I assumed that's the correct behavior. I'm seeing that Chrome too shows 5:16. I'm pretty sure Symphonia played 4:34 though, which is inconsistent with both QuickTime Player and ffplay/Chrome.

Definitely curious how to go about detecting & fixing this issue for all my files tho 🤔 I wonder what's caused it

pdeljanov commented 3 years ago

Yeah, Symphonia plays 4:34 of audio, which is actually the total length of the audio. If I try to seek (or listen) past that point in other players it just closes the file immediately. It makes sense though, 5:16 is just a guess after all.

Now, I wonder how ffmpeg detects it. I'm guessing QuickTime also detects the issue but cuts the file off at 2:18.

probablykasper commented 3 years ago

@pdeljanov Fair enough. Any idea if handling it like QuickTime would be a good or bad idea?

pdeljanov commented 3 years ago

:thinking: Being able to concatenate two MP3 streams together to make a new one is kinda a "feature" of MP3 (basically how internet radio works), so I probably wouldn't want to default to the QuickTime behaviour. There's also probably some wisdom as to why VLC, MPV, and FFMpeg don't do it either. So, perhaps behind an option (a "strict" mode)?

probablykasper commented 3 years ago

@pdeljanov In this case it's an invalid concatenated file though (according to ffplay)

I think it's probably not worth having a strict mode option for this

pdeljanov commented 3 years ago

Okay, I think I figured it out. The file is a legitimate and legal MP3, however, extra (non-standard) headers (Xing, LAME headers) were added to it that specify the 2:17 duration. These headers are what ffmpeg is using to detect the mismatch.

Symphonia doesn't have support for these headers yet since they aren't part of the MPEG audio standard. They are more like defacto standards that were invented over time to try to workaround the deficiencies of the MP3 format.

It's still a question what the right thing to do though because a compliant MP3 decoder will decode the file fully and ignore the information in those headers. Symphonia could parse the information and make it available as metadata or a cue, but it'd be non-compliant behaviour to use it to end the MP3 prematurely. I can appreciate that for some applications it'd be preferable to actually use the data in these headers, so maybe an obey_extra_headers (not sure what to name it) option in FormatOptions would do the trick.

probablykasper commented 3 years ago

That makes sense. Do you think ffplay says it's an invalid concatenation because the Xing/LAME header doesn't match it?

Symphonia doesn't have support for these headers yet since they aren't part of the MPEG audio standard. They are more like defacto standards that were invented over time to try to workaround the deficiencies of the MP3 format.

Ah, didn't know that. Isn't Xing/LAME something important you need to pay attention to for certain mp3s?

I can appreciate that for some applications it'd be preferable to actually use the data in these headers, so maybe an obey_extra_headers (not sure what to name it) option in FormatOptions would do the trick.

I think it's more likely you'd have an incorrect header than incorrect audio data, so imo obey_extra_headers wouldn't be necessary unless other people request it for a good reason. I made a script to scan a file or directory for ffprobe warnings and figured out that out of 5663 mp3s, only 15 have this issue, so I'll just fix those manually

probablykasper commented 3 years ago

If I try to seek (or listen) past that point in other players it just closes the file immediately

I tried playing the file in Chrome and immediately seek to around 5:10 which results in the song playing from the start again, continuing way past the 5:16 duration it initially reports. If the actual duration of the audio is 4:34 and Chrome plays past that, I wonder how Symphonia handles that

pdeljanov commented 3 years ago

Just a little background, but I think it'll make things more clear.

MP3 is a really simple format. It basically chunks the audio up into 576 or 1152 sample chunks, compresses each chunk, and then writes out the compressed chunks to the file in order. There is no global header with metadata or length information, and likewise, each chunk contains no timing information (where it is in the file).

Therefore, players can only infer the length by dividing the size of the file by the bitrate, or it can scan through the entire file and count the number of chunks. The former method only works well when the file is encoded as a constant bitrate, otherwise, if it's a variable bitrate file, the player needs to determine the average bitrate and use that.

The Xing and LAME tags try to solve this problem (an others) by adding that missing information to the file. A more well known example of these kinds of supplementary tags are ID3v2 for metadata. These tags are added to the front of a file and encoded in such a way a MP3 decoder will ignore them.

So as you can see, there really is no "one true" way to determine the length of a MP3 file and different applications will do it differently. This is why you're seeing so many different behaviors across apps.

That makes sense. Do you think ffplay says it's an invalid concatenation because the Xing/LAME header doesn't match it?

Symphonia doesn't have support for these headers yet since they aren't part of the MPEG audio standard. They are more like defacto standards that were invented over time to try to workaround the deficiencies of the MP3 format.

Yep, I checked the source code and ffmpeg detects the presence of the LAME and Xing tags and compares the file size stated in the tag vs. the actual file size and prints that message if they differ.

Ah, didn't know that. Isn't Xing/LAME something important you need to pay attention to for certain mp3s?

Definitely useful for VBR files and supporting gapless playback. Therefore I think Symphonia will need to support these in the future.

I think it's more likely you'd have an incorrect header than incorrect audio data, so imo obey_extra_headers wouldn't be necessary unless other people request it for a good reason. I made a script to scan a file or directory for ffprobe warnings and figured out that out of 5663 mp3s, only 15 have this issue, so I'll just fix those manually

Sounds good.

probablykasper commented 3 years ago

I understand, thanks for explaining!

Definitely useful for VBR files and supporting gapless playback. Therefore I think Symphonia will need to support these in the future.

Gapless playback would definitely be sweet