abhiTronix / deffcode

A cross-platform High-performance FFmpeg based Real-time Video Frames Decoder in Pure Python 🎞️⚡
https://abhitronix.github.io/deffcode
Apache License 2.0

[Bug]: iPhone Portrait Oriented Videos are Squashed to Landscape #39

Closed philipqueen closed 2 months ago

philipqueen commented 1 year ago

Description

Trying to decode a portrait iPhone video results in the video being squashed into landscape. The shape is reversed (width and height values switched), and the video content is stretched horizontally.

This behavior occurs with the native MOV output and persists with conversion to mp4.

If it helps, the behavior matches this bug from MoviePy: https://github.com/Zulko/moviepy/issues/1911. I found this library while searching for a MoviePy alternative to circumvent this exact issue, so I would appreciate help in either squashing the bug or figuring out a good workaround.

Expected behaviour

The video is loaded in portrait mode, matching the shape in the metadata. Playing in QuickTime, my example looks like this:

Screen Shot 2023-04-17 at 6 37 42 PM

Actual behaviour

The video is squashed into landscape mode, and the shape is reversed from the metadata. Visually it looks like this:

Screen Shot 2023-04-17 at 6 37 24 PM

And in the terminal output, the difference in shape is apparent here:

Screen Shot 2023-04-17 at 6 28 13 PM

Steps to reproduce

  1. Record a portrait video on an iPhone
  2. Use the FFdecoder class to decode the video and observe the shape of the decoded frames

Terminal log output

18:34:24 ::   Utilities   ::   INFO   :: Running DeFFcode Version: 0.2.5
18:34:24 ::   FFhelper    ::  DEBUG   :: Final FFmpeg Path: ffmpeg
18:34:24 ::   FFhelper    ::  DEBUG   :: FFmpeg validity Test Passed!
18:34:24 ::   FFhelper    ::  DEBUG   :: Found valid FFmpeg Version: `b'5.1.2'` installed on this system
18:34:24 ::    Sourcer    ::  DEBUG   :: Found valid FFmpeg executable: `ffmpeg`.
18:34:24 ::    Sourcer    ::  DEBUG   :: Extracting Metadata...
18:34:24 ::    Sourcer    ::  DEBUG   :: Metadata Extraction completed successfully!
18:34:24 ::   FFdecoder   ::   INFO   :: Using default `rgb24` pixel-format for this pipeline.
18:34:24 ::   FFdecoder   ::   INFO   :: Default source resolution will be used for this pipeline for defining output resolution.
18:34:24 ::   FFdecoder   ::   INFO   :: Default source framerate will be used for this pipeline for defining output framerate.
18:34:24 ::   FFdecoder   :: CRITICAL :: Activating Video-Only Mode of Operation.
18:34:24 ::   FFdecoder   ::  DEBUG   :: Executing FFmpeg command: `ffmpeg -vcodec hevc -i /Users/username/Documents/iPhoneTesting/RawVideos/Cam0.mp4 -pix_fmt rgb24 -s 1920x1080 -framerate 29.97 -f rawvideo -`
ffmpeg version 5.1.2 Copyright (c) 2000-2022 the FFmpeg developers
  built with Apple clang version 14.0.0 (clang-1400.0.29.102)
  configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/5.1.2 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-neon
  libavutil      57. 28.100 / 57. 28.100
  libavcodec     59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter     8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample   4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/username/Documents/iPhoneTesting/RawVideos/Cam0.mp4':
  Metadata:
    major_brand     : qt  
    minor_version   : 0
    compatible_brands: qt  
    creation_time   : 2023-02-07T22:32:21.000000Z
    com.apple.quicktime.make: Apple
    com.apple.quicktime.model: iPhone 7
    com.apple.quicktime.software: 14.4.2
    com.apple.quicktime.creationdate: 2023-02-07T15:32:21-0700
  Duration: 00:00:31.83, start: 0.000000, bitrate: 7822 kb/s
  Stream #0:0[0x1](und): Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709), 1920x1080, 7680 kb/s, 29.97 fps, 29.97 tbr, 600 tbn (default)
    Metadata:
      creation_time   : 2023-02-07T22:32:21.000000Z
      handler_name    : Core Media Video
      vendor_id       : [0][0][0][0]
      encoder         : HEVC
    Side data:
      displaymatrix: rotation of -90.00 degrees
  Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 89 kb/s (default)
    Metadata:
      creation_time   : 2023-02-07T22:32:21.000000Z
      handler_name    : Core Media Audio
      vendor_id       : [0][0][0][0]
  Stream #0:2[0x3](und): Data: none (mebx / 0x7862656D), 2 kb/s (default)
    Metadata:
      creation_time   : 2023-02-07T22:32:21.000000Z
      handler_name    : Core Media Metadata
  Stream #0:3[0x4](und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
    Metadata:
      creation_time   : 2023-02-07T22:32:21.000000Z
      handler_name    : Core Media Metadata
  Stream #0:4[0x5](und): Data: none (mebx / 0x7862656D), 34 kb/s (default)
    Metadata:
      creation_time   : 2023-02-07T22:32:21.000000Z
      handler_name    : Core Media Metadata
Stream mapping:
  Stream #0:0 -> #0:0 (hevc (native) -> rawvideo (native))
Press [q] to stop, [?] for help
Output #0, rawvideo, to 'pipe:':
  Metadata:
    major_brand     : qt  
    minor_version   : 0
    compatible_brands: qt  
    com.apple.quicktime.creationdate: 2023-02-07T15:32:21-0700
    com.apple.quicktime.make: Apple
    com.apple.quicktime.model: iPhone 7
    com.apple.quicktime.software: 14.4.2
    encoder         : Lavf59.27.100
  Stream #0:0(und): Video: rawvideo (RGB[24] / 0x18424752), rgb24(pc, gbr/bt709/bt709, progressive), 1920x1080, q=2-31, 1491500 kb/s, 29.97 fps, 29.97 tbn (default)
    Metadata:
      creation_time   : 2023-02-07T22:32:21.000000Z
      handler_name    : Core Media Video
      vendor_id       : [0][0][0][0]
      encoder         : Lavc59.37.100 rawvideo
    Side data:
      displaymatrix: rotation of -0.00 degrees

Python Code (Optional)

from deffcode import FFdecoder, Sourcer
import cv2

def read_video_deffcode(file_pathstring: str) -> None:
    decoder = FFdecoder(str(file_pathstring)).formulate()

    frame_count = 0

    for frame in decoder.generateFrame():
        print(f"reading frame {frame_count}")
        if frame is None:
            print(f"could not read frame {frame_count}")
            break

        print(frame.shape)

        frame_bgr = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        cv2.imshow("Output", frame_bgr)

        # check for 'q' key if pressed
        key = cv2.waitKey(1) & 0xFF
        if key == ord("q"):
            break

        frame_count += 1

    cv2.destroyAllWindows()

    decoder.terminate()

def get_metadata_from_video_deffcode(file_pathstring: str) -> dict: 
    sourcer = Sourcer(file_pathstring).probe_stream()

    metadata_dict = sourcer.retrieve_metadata()

    print(metadata_dict)
    return metadata_dict

if __name__ == "__main__":
    input_video_pathstring = "VIDEO/PATH"

    read_video_deffcode(file_pathstring=input_video_pathstring)
    get_metadata_from_video_deffcode(file_pathstring=input_video_pathstring)

DeFFcode Version

0.2.5

Python version

Python 3.9.16

Operating System version

macOS Monterey 12.6.1

Any other Relevant Information?

No response

abhiTronix commented 1 year ago

@philipqueen Thanks for reporting this bug. Could you share the terminal output from running the following two snippets? (Make sure to use your portrait video as the source.)

Code-1:

# import the necessary packages
from deffcode import FFdecoder

# initialize and formulate the decoder using your portrait video
decoder = FFdecoder("portrait_video.mp4", verbose=True).formulate()

# print metadata as `json.dump`
print(decoder.metadata)

# terminate the decoder
decoder.terminate()

Code-2:

# import the necessary packages
from deffcode import FFdecoder

# disable resolution
ffparams = {"-custom_resolution":"null"}

# initialize and formulate the decoder using your portrait video
decoder = FFdecoder("portrait_video.mp4", verbose=True, **ffparams).formulate()

# print metadata as `json.dump`
print(decoder.metadata)

# terminate the decoder
decoder.terminate()
philipqueen commented 1 year ago

Code-1 Terminal Output:

10:08:56 ::   Utilities   ::   INFO   :: Running DeFFcode Version: 0.2.5
10:08:56 ::   FFhelper    ::  DEBUG   :: Final FFmpeg Path: ffmpeg
10:08:56 ::   FFhelper    ::  DEBUG   :: FFmpeg validity Test Passed!
10:08:56 ::   FFhelper    ::  DEBUG   :: Found valid FFmpeg Version: `b'5.1.2'` installed on this system
10:08:56 ::    Sourcer    ::  DEBUG   :: Found valid FFmpeg executable: `ffmpeg`.
10:08:56 ::    Sourcer    ::  DEBUG   :: Extracting Metadata...
10:08:56 ::    Sourcer    ::  DEBUG   :: Metadata Extraction completed successfully!
10:08:57 ::   FFdecoder   ::   INFO   :: Using default `rgb24` pixel-format for this pipeline.
10:08:57 ::   FFdecoder   ::   INFO   :: Default source resolution will be used for this pipeline for defining output resolution.
10:08:57 ::   FFdecoder   ::   INFO   :: Default source framerate will be used for this pipeline for defining output framerate.
10:08:57 ::   FFdecoder   :: CRITICAL :: Activating Video-Only Mode of Operation.
10:08:57 ::   FFdecoder   ::  DEBUG   :: Executing FFmpeg command: `ffmpeg -vcodec hevc -i /Users/username/Downloads/portrait.MOV -pix_fmt rgb24 -s 1920x1080 -framerate 30.0 -f rawvideo -`
{
  "ffmpeg_binary_path": "ffmpeg",
  "source": "/Users/username/Downloads/portrait.MOV",
  "source_extension": ".MOV",
  "source_video_resolution": [
    1920,
    1080
  ],
  "source_video_pixfmt": "yuv420p",
  "source_video_framerate": 30.0,
  "source_video_decoder": "hevc",
  "source_duration_sec": 4.37,
  "approx_video_nframes": 131,
  "source_video_bitrate": "8140k",
  "source_audio_bitrate": "160k",
  "source_audio_samplerate": "44100 Hz",
  "source_has_video": true,
  "source_has_audio": true,
  "source_has_image_sequence": false,
  "source_demuxer": "",
  "output_frames_resolution": [
    1920,
    1080
  ],
  "output_frames_pixfmt": "yuv420p",
  "output_framerate": 30.0,
  "ffdecoder_operational_mode": "Video-Only"
}
10:08:57 ::   FFdecoder   ::  DEBUG   :: Terminating FFdecoder Pipeline...
10:08:57 ::   FFdecoder   ::   INFO   :: Pipeline terminated successfully.

Code-2 Terminal Output:

10:11:11 ::   Utilities   ::   INFO   :: Running DeFFcode Version: 0.2.5
10:11:11 ::   FFhelper    ::  DEBUG   :: Final FFmpeg Path: ffmpeg
10:11:12 ::   FFhelper    ::  DEBUG   :: FFmpeg validity Test Passed!
10:11:12 ::   FFhelper    ::  DEBUG   :: Found valid FFmpeg Version: `b'5.1.2'` installed on this system
10:11:12 ::    Sourcer    ::  DEBUG   :: Found valid FFmpeg executable: `ffmpeg`.
10:11:12 ::    Sourcer    ::  DEBUG   :: Extracting Metadata...
10:11:12 ::    Sourcer    ::  DEBUG   :: Metadata Extraction completed successfully!
10:11:12 ::   FFdecoder   ::   INFO   :: Using default `rgb24` pixel-format for this pipeline.
10:11:12 ::   FFdecoder   :: CRITICAL :: Manually discarding `-size/-s` FFmpeg parameter from this pipeline.
10:11:12 ::   FFdecoder   ::   INFO   :: Default source resolution will be used for this pipeline for defining output resolution.
10:11:12 ::   FFdecoder   ::   INFO   :: Default source framerate will be used for this pipeline for defining output framerate.
10:11:12 ::   FFdecoder   :: CRITICAL :: Activating Video-Only Mode of Operation.
10:11:12 ::   FFdecoder   ::  DEBUG   :: Executing FFmpeg command: `ffmpeg -vcodec hevc -i /Users/username/Downloads/portrait.MOV -pix_fmt rgb24 -framerate 30.0 -f rawvideo -`
{
  "ffmpeg_binary_path": "ffmpeg",
  "source": "/Users/username/Downloads/portrait.MOV",
  "source_extension": ".MOV",
  "source_video_resolution": [
    1920,
    1080
  ],
  "source_video_pixfmt": "yuv420p",
  "source_video_framerate": 30.0,
  "source_video_decoder": "hevc",
  "source_duration_sec": 4.37,
  "approx_video_nframes": 131,
  "source_video_bitrate": "8140k",
  "source_audio_bitrate": "160k",
  "source_audio_samplerate": "44100 Hz",
  "source_has_video": true,
  "source_has_audio": true,
  "source_has_image_sequence": false,
  "source_demuxer": "",
  "output_frames_resolution": [
    1920,
    1080
  ],
  "output_frames_pixfmt": "yuv420p",
  "output_framerate": 30.0,
  "ffdecoder_operational_mode": "Video-Only"
}
10:11:12 ::   FFdecoder   ::  DEBUG   :: Terminating FFdecoder Pipeline...
10:11:12 ::   FFdecoder   ::   INFO   :: Pipeline terminated successfully.
abhiTronix commented 1 year ago

@philipqueen Thanks. Could you run the following code and check whether the output is correct or shows the same landscape problem?

# import the necessary packages
from deffcode import FFdecoder
import cv2

# disable resolution
ffparams = {"-custom_resolution":"null"}

# initialize and formulate the decoder using your portrait video
decoder = FFdecoder("portrait_video.mp4", frame_format="bgr24", verbose=True, **ffparams).formulate()

# grab the BGR24 frames from decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# terminate the decoder
decoder.terminate()
philipqueen commented 1 year ago

Here is a screenshot of the cv2 imshow output from the code from your last message:

Screen Shot 2023-04-18 at 10 43 35 AM
philipqueen commented 1 year ago

I'm currently looking through the FFMPEG bug alerts to see if this is a known FFMPEG bug, I'll let you know if I find anything.

abhiTronix commented 1 year ago

@philipqueen This bug occurs because FFmpeg reports a 1920x1080 resolution for your video. See: Stream #0:0[0x1](und): Video: hevc (Main) (hvc1 / 0x31637668), yuv420p(tv, bt709), 1920x1080, 7680 kb/s, 29.97 fps, 29.97 tbr, 600 tbn (default)

philipqueen commented 1 year ago

That's a good point. I notice below that it says Side data: displaymatrix: rotation of -90.00 degrees - I'm wondering if there is some way that metadata can be used to ensure the video is rotated properly down the line. When I add ffparams = {"-vf": "transpose=1"} to the decoder on an iPhone video, the video is displayed sideways but not stretched/distorted, which is not ideal but is much better for my needs than being stretched/distorted.
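As a rough illustration of how that side data relates to the reported shape, here is a minimal sketch that parses the rotation and coded resolution out of FFmpeg's stream description and derives the resolution a player would display. The helper names are hypothetical and not part of deffcode; the regular expressions assume the log format shown in this issue.

```python
import re
from typing import Optional, Tuple

# Matches the side-data line FFmpeg prints for rotated streams, e.g.
# "displaymatrix: rotation of -90.00 degrees" (format assumed from the log above).
_ROTATION_RE = re.compile(r"displaymatrix: rotation of (-?\d+(?:\.\d+)?) degrees")
# Matches a "<width>x<height>" token such as "1920x1080".
_RESOLUTION_RE = re.compile(r"\b(\d{2,5})x(\d{2,5})\b")


def parse_rotation(ffmpeg_log: str) -> float:
    """Return the displaymatrix rotation in degrees, or 0.0 if none is reported."""
    match = _ROTATION_RE.search(ffmpeg_log)
    return float(match.group(1)) if match else 0.0


def parse_coded_resolution(ffmpeg_log: str) -> Optional[Tuple[int, int]]:
    """Return (width, height) from the first video stream line, if present."""
    for line in ffmpeg_log.splitlines():
        if "Video:" in line:
            match = _RESOLUTION_RE.search(line)
            if match:
                return int(match.group(1)), int(match.group(2))
    return None


def display_resolution(width: int, height: int, rotation: float) -> Tuple[int, int]:
    """Swap width and height when the rotation is an odd multiple of 90 degrees."""
    if round(rotation / 90) % 2 == 1:
        return height, width
    return width, height
```

For the stream in this issue (coded as 1920x1080 with a rotation of -90 degrees), `display_resolution` yields a 1080x1920 portrait frame, which is the shape a player such as QuickTime shows.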

There is an older bug related to rotation and iPhone videos. It is supposed to be fixed in ffmpeg 2.7, but this stack overflow post has some discussion about how the rotation is supposed to be handled: stack overflow discussion of iphone rotation bug in FFMPEG

abhiTronix commented 1 year ago

That's a good point. I notice below that it says Side data: displaymatrix: rotation of -90.00 degrees - I'm wondering if there is some way that metadata can be used to ensure the video is rotated properly down the line.

@philipqueen Excellent, this can be used to handle the bug correctly. Could you share a sample video (with this bug) for testing? That would be helpful.

When I add ffparams = {"-vf": "transpose=1"} to the decoder on an iPhone video, the video is displayed sideways but not stretched/distorted, which is not ideal but is much better for my needs than being stretched/distorted. There is an older bug related to rotation and iPhone videos. It is supposed to be fixed in ffmpeg 2.7, but this stack overflow post has some discussion about how the rotation is supposed to be handled: stack overflow discussion of iphone rotation bug in FFMPEG

@philipqueen Thanks for bringing this to my attention. I'll look into it.

philipqueen commented 1 year ago

Here is a short video that has the bug:

https://user-images.githubusercontent.com/24758117/232957321-788e2aac-f905-44df-ad52-bf9c95c5b5d0.MOV

I really appreciate the quick support! Continue to let me know how I can help.

abhiTronix commented 1 year ago

@philipqueen Thanks, I'll run some tests on my system and let you know.

abhiTronix commented 1 year ago

@philipqueen FFmpeg changed its default behavior in v2.7 (2015) to auto-rotate video sources that carry rotation metadata. To resolve this, turn off auto-rotation with the -noautorotate flag and rotate manually with a filter (rotate or transpose), as follows:

# import the necessary packages
from deffcode import FFdecoder
import cv2

# disable autorotate and transpose clockwise(1)
# 0 = 90° counter-clockwise and vertical flip (default)
# 1 = 90° clockwise.
# 2 = 90° counter-clockwise.
# 3 = 90° clockwise and vertical flip.
ffparams = {"-ffprefixes": ["-noautorotate"], "-vf": "transpose=1"}

# initialize and formulate the decoder using your portrait video
decoder = FFdecoder(
    "test.mov", frame_format="bgr24", verbose=True, **ffparams
).formulate()

# grab the BGR24 frames from decoder
for frame in decoder.generateFrame():

    # check if frame is None
    if frame is None:
        break

    # {do something with the frame here}

    # Show output window
    cv2.imshow("Output", frame)

    # check for 'q' key if pressed
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        break

# print metadata as `json.dump`
print(decoder.metadata)

# terminate the decoder
decoder.terminate()
abhiTronix commented 1 year ago

I'm adding video-orientation metadata to make users aware of the video's orientation so they can make changes accordingly.

philipqueen commented 1 year ago

Thanks @abhiTronix, that rotate flag works well on iPhone videos, although one video came out upside down. That's easy to fix manually, but the ideal would be to do it programmatically, and have it work regardless of the video passed in.

Would your proposed video-orientation metadata just check if it's vertical, or would it look for the displaymatrix rotation parameter?

abhiTronix commented 1 year ago

That's easy to fix manually, but the ideal would be to do it programmatically, and have it work regardless of the video passed in.

@philipqueen Actually, I looked into it and found that not everyone wants their video rotated automatically. For example, if you're manually flipping the video, you don't want it auto-rotated back to the original. Also, filters used with -noautorotate fail to work correctly if we add these flags automatically, and doing so would break things for devs already using deffcode in their applications. Therefore, it is better handled by the user. A doc mentioning this issue and its solution is enough for an end-user developer.

Would your proposed video-orientation metadata just check if it's vertical, or would it look for the displaymatrix rotation parameter?

Checks for displaymatrix rotation metadata

philipqueen commented 1 year ago

I agree, as long as there is an easy automatic/programmatic fix to the issue. Hopefully adding the displaymatrix rotation to the metadata will be enough for that to work properly, with this workflow: check the metadata for displaymatrix rotation -> conditionally set the ffparams depending on the metadata -> feed those conditionally set params into the FFdecoder.
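That workflow could be sketched roughly as below. This is only an illustration under stated assumptions: `ffparams_for_rotation` is a hypothetical helper, not a deffcode function, and the sign convention (a displaymatrix rotation of -90 degrees needs a 90 degree clockwise correction) is assumed from the behavior reported in this thread. The `-ffprefixes` and `-vf` keys are the ones used in the maintainer's example above.

```python
from typing import Dict, List, Union

FFParams = Dict[str, Union[str, List[str]]]


def ffparams_for_rotation(rotation: float) -> FFParams:
    """Build FFdecoder parameters that undo a displaymatrix rotation by hand.

    `rotation` is the angle FFmpeg prints as "rotation of X degrees".
    Assumption: negating it gives the clockwise correction to apply.
    """
    # Normalize the required clockwise correction into {0, 90, 180, 270}.
    correction = round(-rotation) % 360

    filters = {
        90: "transpose=clock",    # rotate 90 degrees clockwise
        180: "hflip,vflip",       # rotate 180 degrees
        270: "transpose=cclock",  # rotate 90 degrees counter-clockwise
    }

    # No rotation (or an angle that is not a multiple of 90): change nothing.
    if correction not in filters:
        return {}

    # Disable FFmpeg's own auto-rotation so the video is not rotated twice.
    return {"-ffprefixes": ["-noautorotate"], "-vf": filters[correction]}
```

The returned dictionary would then be unpacked into the decoder, for example `FFdecoder(path, **ffparams_for_rotation(rotation)).formulate()`. An unrotated source gets an empty dictionary, so existing pipelines are left untouched.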

philipqueen commented 1 year ago

I found a good temporary solution for my needs. Because deffcode reports the metadata for the problematic videos reversed, but OpenCV does not, comparing the width reported by deffcode with the width reported by OpenCV identifies the problematic videos. So if the two libraries' reported metadata agrees, don't transpose the video, and if it disagrees, transpose the video.
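The comparison described above can be sketched as a small helper that flags a video only when the two libraries report swapped dimensions. The function name is hypothetical, and how the two resolutions are obtained is left to the caller; the comments show one plausible way, assuming deffcode's `source_video_resolution` metadata and OpenCV's `CAP_PROP_FRAME_WIDTH` / `CAP_PROP_FRAME_HEIGHT` properties.

```python
from typing import Tuple

Resolution = Tuple[int, int]  # (width, height)


def needs_transpose(deffcode_resolution: Resolution, opencv_resolution: Resolution) -> bool:
    """Return True when the two reported resolutions are swapped versions of each other.

    The inputs might come from, for example:
      deffcode: tuple(metadata["source_video_resolution"])
      OpenCV:   (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
                 int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    """
    # Both libraries agree: the video has no rotation problem.
    if deffcode_resolution == opencv_resolution:
        return False

    # They disagree: only flag the video if width and height are exactly swapped.
    opencv_width, opencv_height = opencv_resolution
    return deffcode_resolution == (opencv_height, opencv_width)
```

Requiring an exact swap, rather than any mismatch, avoids transposing videos where the two libraries disagree for some unrelated reason.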

Getting the displaymatrix rotation metadata will provide the next step of indicating which direction to transpose the videos

abhiTronix commented 1 year ago

I found a good temporary solution for my needs. Because deffcode reports the metadata for the problematic videos reversed, but OpenCV does not, comparing the width reported by deffcode with the width reported by OpenCV identifies the problematic videos. So if the two libraries' reported metadata agrees, don't transpose the video, and if it disagrees, transpose the video.

Sounds reasonable. Thanks for the insight.

Getting the displaymatrix rotation metadata will provide the next step of indicating which direction to transpose the videos

Yes. I've added source_video_orientation metadata, which reports the source video's display orientation by checking for displaymatrix rotation metadata.

philipqueen commented 1 year ago

@abhiTronix thanks, I got it working very smoothly with the dev version! Here's the code that is correctly rotating my videos for anyone that is curious:

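from deffcode import Sourcer

# path to the source video (placeholder)
input_video_pathstring = "VIDEO/PATH"

# probe the source and retrieve its metadata as a dictionary
sourcer = Sourcer(input_video_pathstring).probe_stream()
metadata_dictionary = sourcer.retrieve_metadata()

# map each reported rotation to the filter chain that corrects it
transposition_dictionary = {
    90.0: "transpose=cclock",
    -270.0: "transpose=cclock",
    -90.0: "transpose=clock",
    270.0: "transpose=clock",
    180.0: "transpose=cclock,transpose=cclock",
}

orientation = metadata_dictionary["source_video_orientation"]

# rotate manually only for known orientations; otherwise leave the defaults
if orientation in transposition_dictionary:
    ffparams = {
        "-ffprefixes": ["-noautorotate"],
        "-vf": transposition_dictionary[orientation],
    }
else:
    ffparams = {}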
abhiTronix commented 2 months ago

Successfully resolved in commit https://github.com/abhiTronix/deffcode/commit/b6ee94b2c0b9db95dbd600fbeb311701d8358b05