zmwangx / rust-ffmpeg

Safe FFmpeg wrapper.
Do What The F*ck You Want To Public License
1.35k stars 207 forks source link

problem with dump-frames.rs for certain videos (obviously caused by the frame size) #64

Open goto40 opened 3 years ago

goto40 commented 3 years ago

Problem

When executing dump-frames.rs I observed a strange effect: the output of the program is scrambled in some way (possibly the stride of the output is not OK). It seems to depend on the frame size of the video:

I observed the same effect for other video files.

Version

dceddia commented 3 years ago

I think this is related to my audio problem in #72. I added a few prints to the dump-frames example and for those videos I see:

for 128x128:

ffmpeg-dump-frames/frames_128x128 % ../target/debug/ffmpeg-dump-frames /Users/dceddia/Downloads/128x128.mp4
frame 0: 128x128, stride 384, plane_height 128, plane_width 128
frame 1: 128x128, stride 384, plane_height 128, plane_width 128
frame 2: 128x128, stride 384, plane_height 128, plane_width 128
...

for 120x120:

ffmpeg-dump-frames/frames_120x120 % ../target/debug/ffmpeg-dump-frames /Users/dceddia/Downloads/120x120.mp4
frame 0: 120x120, stride 384, plane_height 120, plane_width 120
frame 1: 120x120, stride 384, plane_height 120, plane_width 120
frame 2: 120x120, stride 384, plane_height 120, plane_width 120
...

Looking at the code for frame.data(0), it multiplies stride by plane_height:

    pub fn data(&self, index: usize) -> &[u8] {
        if index >= self.planes() {
            panic!("out of bounds");
        }

        unsafe {
            slice::from_raw_parts(
                (*self.as_ptr()).data[index],
                self.stride(index) * self.plane_height(index) as usize,
            )
        }
    }

But that's coming out wrong of course, because both files show a stride of 384 (guessing we've got 3 bytes per pixel because of RGB24, that's 128 pixels, obvs only true for one of those files!). stride is implemented in terms of the frame's linesize, and ffmpeg docs have this important note about the linesize property...

The linesize may be larger than the size of usable data – there may be extra padding present for performance reasons.

So I think, similar to with my audio issue, the linesize shouldn't be trusted.

But I'm confused because I tried writing a function to read the data differently, and the images come out identical afaict. Hardcoded the 3 for science. Didn't change the output, which makes me think maybe I missed something...

fn frame_data(frame: &Video) -> &[u8] {
    unsafe {
        std::slice::from_raw_parts(
            (*frame.as_ptr()).data[0],
            3 * frame.plane_width(0) as usize * frame.plane_height(0) as usize,
        )
    }
}
jrpelkonen commented 3 years ago

@dceddia It seems to me that due to the padding you mentioned in your comment, you need to discard the excess data per each line. I modified the sample to test this out, and it fixed the issue in my test case:

I changed this line:

    file.write_all(frame.data(0))?;

to:

    let data = frame.data(0);
    let stride = frame.stride(0);
    let byte_width: usize = 3 * frame.width() as usize;
    let height: usize = frame.height() as usize;
    for line in 0..height {
        let begin = line * stride;
        let end = begin + byte_width;
        file.write_all(&data[begin..end])?;
    }
dceddia commented 3 years ago

Ahh, brilliant, thanks @jrpelkonen. I tested it out and it works for me too. So the width is contained within the stride, I guess.

bendoerry commented 2 years ago

@jrpelkonen, is there any reason this would break other videos, or should it work in all cases?

In that case, would it make sense opening a PR fixing this example?

jrpelkonen commented 2 years ago

@bendoerry It's been a while, and I am not in a position to validate this right now, but I think at least one issue in the modification above is that it assumes each pixel to be 3 bytes in size.

jonatino commented 11 months ago

I have this issue as well with some mp4s but not all. For me it cuts the frame diagonally across the screen from top left to bottom right, then flips the two sides so the bottom left triangle image is actually in the top right and vise versa. I'm using ffmpeg v6 (which extracts the frames as expected) and ffmpeg-next which does not.

jonatino commented 11 months ago

@jrpelkonen solution above will work well for most video formats in YUV colorspace, but will break with video formats such as webm which are RGBA color spaces (4 bytes per pixel).