kkroening / ffmpeg-python

Python bindings for FFmpeg - with complex filtering support
Apache License 2.0

Converting numpy array to video #246

Open · Santhosh1509 opened this issue 5 years ago

Santhosh1509 commented 5 years ago

I'm using OpenCV to process a video and save the processed video.

Example:

import numpy as np
import cv2

cap = cv2.VideoCapture(0)

# Define the codec and create VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi',fourcc, 20.0, (640,480))

while cap.isOpened():
    ret, frame = cap.read()
    if ret:
        frame = cv2.flip(frame,0)

        # write the flipped frame
        out.write(frame)

        cv2.imshow('frame',frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

# Release everything if job is finished
cap.release()
out.release()
cv2.destroyAllWindows()

The source file is a Full HD, 2-minute clip in AVI format with a data rate of 7468 kbps. The saved file is a Full HD, 2-minute clip in AVI format with a data rate of 99532 kbps.

This is confusing. Also, if I save each frame as an image and give those images as input, I get an error from .output() saying there is no such file:

import ffmpeg
(
    ffmpeg
    .input('/path/to/jpegs/*.jpg', pattern_type='glob', framerate=25)
    .output('movie.mp4')
    .run()
)

How do I save the video at the same size as the source using ffmpeg-python?
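One way to keep the output size close to the source is to pin the output bitrate; in ffmpeg-python the video_bitrate keyword maps to ffmpeg's -b:v flag. A minimal sketch using the 7468 kbps figure above:

import ffmpeg
(
    ffmpeg
    .input('/path/to/jpegs/*.jpg', pattern_type='glob', framerate=25)
    # video_bitrate='7468k' becomes '-b:v 7468k'
    .output('movie.mp4', video_bitrate='7468k')
    .run()
)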

kylemcdonald commented 5 years ago

Not sure if this is what you were asking, but here is some code to save frames from memory straight to a video file. If you chop this up a little, you could hack it into your initial code and avoid writing the JPEGs to disk:

import ffmpeg
import numpy as np

def vidwrite(fn, images, framerate=60, vcodec='libx264'):
    if not isinstance(images, np.ndarray):
        images = np.asarray(images)
    n, height, width, channels = images.shape
    # pipe raw RGB frames into ffmpeg and let it handle the encoding
    process = (
        ffmpeg
            .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
            .output(fn, pix_fmt='yuv420p', vcodec=vcodec, r=framerate)
            .overwrite_output()
            .run_async(pipe_stdin=True)
    )
    for frame in images:
        process.stdin.write(
            frame
                .astype(np.uint8)
                .tobytes()
        )
    # close stdin so ffmpeg knows the stream is finished, then wait for it to exit
    process.stdin.close()
    process.wait()

Edit 2020-01-28: My working version of this function is backed by a small class, implemented in my python-utils/ffmpeg.py
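For reference, a minimal usage sketch (the random frames are just placeholder data; assumes numpy and ffmpeg-python are installed):

import numpy as np

# 120 synthetic 64x64 RGB frames as placeholder input
frames = np.random.randint(0, 256, (120, 64, 64, 3), dtype=np.uint8)
vidwrite('out.mp4', frames, framerate=30)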

Santhosh1509 commented 5 years ago

@kylemcdonald Thank you, it worked.

How can I alter the CRF?
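In ffmpeg-python, extra keyword arguments to .output() are passed through as ffmpeg output flags, so a crf keyword should do it. A one-line sketch against the output call in the function above:

    .output(fn, pix_fmt='yuv420p', vcodec=vcodec, r=framerate, crf=20)  # crf=20 becomes '-crf 20'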

jpreiss commented 4 years ago

Is @kylemcdonald's code example still the preferred way to stream frames from in-memory numpy arrays to an ffmpeg process?

jblugagne commented 4 years ago

When I try to run @kylemcdonald's function on a 228x2048x2048x3 np.uint8 image array, only 65 frames are saved, and it looks like a bunch of them are skipped:

ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 4.8.2 (GCC) 20140120 (Red Hat 4.8.2-15)
  configuration: --prefix=/home/jeanbaptiste/.conda/envs/ffmpeg_env --disable-doc --disable-openssl --enable-shared --enable-static --extra-cflags='-Wall -g -m64 -pipe -O3 -march=x86-64 -fPIC' --extra-cxxflags='-Wall -g -m64 -pipe -O3 -march=x86-64 -fPIC' --extra-libs='-lpthread -lm -lz' --enable-zlib --enable-pic --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --enable-libfreetype --enable-gnutls --enable-libx264 --enable-libopenh264
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, rawvideo, from 'pipe:':
  Duration: N/A, start: 0.000000, bitrate: 2516582 kb/s
    Stream #0:0: Video: rawvideo (RGB[24] / 0x18424752), rgb24, 2048x2048, 2516582 kb/s, 25 tbr, 25 tbn, 25 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> h264 (libx264))
[libx264 @ 0x1e1a400] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x1e1a400] profile High, level 5.0
[libx264 @ 0x1e1a400] 264 - core 152 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=7 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/run/media/jeanbaptiste/SAMSUNG/compression_test/test2.mp4':
  Metadata:
    encoder         : Lavf58.12.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 2048x2048, q=-1--1, 7 fps, 14336 tbn, 7 tbc
    Metadata:
      encoder         : Lavc58.18.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
frame=    9 fps=0.0 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=16 speed=   0x    
frame=   16 fps= 16 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=34 speed=   0x    
frame=   23 fps= 15 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=54 speed=   0x    
frame=   31 fps= 15 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=73 speed=   0x    
frame=   39 fps= 15 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=93 speed=   0x    
frame=   46 fps= 13 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=111 speed=   0x    
frame=   51 fps= 13 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=126 speed=   0x    
frame=   55 fps= 12 q=0.0 size=       0kB time=00:00:00.00 bitrate=N/A dup=0 drop=136 speed=   0x    
frame=   60 fps= 12 q=24.0 size=     512kB time=00:00:00.14 bitrate=29348.5kbits/s dup=0 drop=147 speed=0.0282x    
frame=   63 fps= 11 q=24.0 size=    1024kB time=00:00:00.57 bitrate=14679.0kbits/s dup=0 drop=157 speed=0.102x    
frame=   65 fps=6.3 q=-1.0 Lsize=    9993kB time=00:00:08.85 bitrate=9242.1kbits/s dup=0 drop=163 speed=0.86x    
video:9991kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.015101%
[libx264 @ 0x1e1a400] frame I:2     Avg QP:18.44  size:381836
[libx264 @ 0x1e1a400] frame P:36    Avg QP:19.80  size:169252
[libx264 @ 0x1e1a400] frame B:27    Avg QP:20.04  size:124944
[libx264 @ 0x1e1a400] consecutive B-frames: 43.1%  3.1%  4.6% 49.2%
[libx264 @ 0x1e1a400] mb I  I16..4:  4.5% 89.8%  5.7%
[libx264 @ 0x1e1a400] mb P  I16..4:  1.1% 30.9%  0.4%  P16..4: 34.6% 16.1% 10.5%  0.0%  0.0%    skip: 6.5%
[libx264 @ 0x1e1a400] mb B  I16..4:  0.4% 12.7%  0.0%  B16..8: 55.2% 10.8%  3.0%  direct: 3.6%  skip:14.2%  L0:53.9% L1:44.4% BI: 1.8%
[libx264 @ 0x1e1a400] 8x8 transform intra:95.0% inter:71.9%
[libx264 @ 0x1e1a400] coded y,uvDC,uvAC intra: 86.4% 0.0% 0.0% inter: 40.4% 0.0% 0.0%
[libx264 @ 0x1e1a400] i16 v,h,dc,p: 16%  9% 34% 42%
[libx264 @ 0x1e1a400] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 12%  8% 49%  5%  5%  5%  5%  5%  5%
[libx264 @ 0x1e1a400] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 16%  8% 28%  9%  9%  9%  7%  8%  6%
[libx264 @ 0x1e1a400] i8c dc,h,v,p: 100%  0%  0%  0%
[libx264 @ 0x1e1a400] Weighted P-Frames: Y:5.6% UV:0.0%
[libx264 @ 0x1e1a400] ref P L0: 41.4% 13.4% 27.6% 16.5%  1.0%
[libx264 @ 0x1e1a400] ref B L0: 68.8% 26.7%  4.5%
[libx264 @ 0x1e1a400] ref B L1: 88.8% 11.2%
[libx264 @ 0x1e1a400] kb/s:8813.73

Am I missing something here?

jpreiss commented 4 years ago

@jblugagne I encountered a related problem - I was getting duplicated frames in my stream. I had to pass the r=framerate argument to the input() method instead of the output() method.

The ffmpeg documentation says:

-r[:stream_specifier] fps (input/output,per-stream)

Set frame rate (Hz value, fraction or abbreviation).

As an input option, ignore any timestamps stored in the file and instead generate timestamps assuming constant frame rate fps.

As an output option, duplicate or drop input frames to achieve constant output frame rate fps.

Since our "input" is a stream of raw video frames over a pipe, it should not contain any timestamps at all, so it makes sense that we would need some mechanism of specifying timestamps like the "input" option.

I don't fully understand the behavior of the "output option". If our input stream has no timestamps, how did it decide to drop frames for you, but duplicate them for me? Are the timestamps generated implicitly by the real wall clock time when the frames arrive over the pipe? Regardless, dropping and duplicating frames are both bad for this application.
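Concretely, the fix is to move the rate option to the input side; a sketch of the process setup from the function above:

process = (
    ffmpeg
        # r on the input stamps the piped frames at a constant rate...
        .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height), r=framerate)
        # ...so the output side no longer needs to duplicate or drop frames
        .output(fn, pix_fmt='yuv420p', vcodec=vcodec)
        .overwrite_output()
        .run_async(pipe_stdin=True)
)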

jblugagne commented 4 years ago

@jpreiss thank you! That solved my problem. Not sure what is going on with the r output option thing either.

lminer commented 4 years ago

Is there a way to do this where you pass in a numpy array (audio in this case) and get a numpy array in return?
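One way to avoid managing the pipes by hand is ffmpeg-python's run() with input= and capture_stdout=True. A minimal sketch for mono float32 audio (the sample rates and the resample name are just assumptions for illustration):

import ffmpeg
import numpy as np

def resample(audio, sr_in=44100, sr_out=16000):
    # audio: 1-D float32 array; raw samples go in on stdin,
    # resampled raw samples come back on stdout
    out, _ = (
        ffmpeg
        .input('pipe:', format='f32le', ar=sr_in, ac=1)
        .output('pipe:', format='f32le', ar=sr_out, ac=1)
        .run(input=audio.astype(np.float32).tobytes(), capture_stdout=True)
    )
    return np.frombuffer(out, dtype=np.float32)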

jaehobang commented 4 years ago

When trying to run @kylemcdonald's function written above with the frame-rate modification given by @jpreiss, I am running into a BrokenPipeError. The input is a 15000x241x369x3 np.uint8 array. The error is as follows:

---------------------------------------------------------------------------
BrokenPipeError                           Traceback (most recent call last)
<ipython-input-30-41eb52741086> in <module>
----> 1 vidwrite(output_filename, images_cut)

<ipython-input-29-34624c1ce396> in vidwrite(fn, images, framerate, vcodec)
     16         process.stdin.write(
     17             frame
---> 18                 .astype(np.uint8)
     19                 .tobytes()
     20         )

BrokenPipeError: [Errno 32] Broken pipe

It seems that this error is raised while trying to write the 2nd frame.

Did anyone encounter a similar issue or know of a fix? Thank you in advance.

valin1 commented 3 years ago

@jaehobang Were you able to figure out this problem? Because I am having the same problem with a [Errno 32] Broken pipe error.

samrere commented 3 years ago

vidwrite('test', frames) will produce a broken pipe error, but vidwrite('test.mp4', frames) will be fine: without an extension, ffmpeg cannot infer the output container and exits immediately, which closes the pipe.
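If an extension-less name is really needed, the container can be named explicitly instead; in ffmpeg-python the format keyword maps to ffmpeg's -f flag. A one-line sketch against the vidwrite function above:

    .output(fn, format='mp4', pix_fmt='yuv420p', vcodec=vcodec, r=framerate)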

solankiharsh commented 3 years ago

@kylemcdonald Thank you so much. I was able to implement your provided lines of code in my use case.

However, I need help with one part: is there a way to get separate H.264-encoded frames instead of one .h264 file?

This is what I am doing exactly:

def start_streaming(self, channel_name):
    request = api.RenderRequest()
    request.deeptwin_uuid = self._deeptwin_uuid
    request.neutral_expr_coeffs.extend(self._deeptwin.neutral_expr_coeffs)
    response = self._client.Start(request)
    print('Started Streaming ...')

    # 10 frames, resolution 256x256, and 1 fps

    #width, height, n_frames, fps = 256, 256, 10, 1
    path = 'out.h264'
    #cv2.imwrite(path, img)
    #print(f'written {res.status} {res.id} to file at {datetime.datetime.now()}')
    process = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt='bgr24', s='{}x{}'.format(self._deeptwin.image_width, self._deeptwin.image_height))
        .output(path, pix_fmt='yuv420p', vcodec='libx264', r=25)
        .overwrite_output()
        .run_async(pipe_stdin=True)
    )
    for res in response:
        print(f'{res.status} image {res.id} rendered at {datetime.datetime.now()}')
        img = np.frombuffer(res.image, dtype=np.uint8)
        img = img.reshape((int(self._deeptwin.image_height * 1.5), self._deeptwin.image_width))
        img = cv2.cvtColor(img, cv2.COLOR_YUV2BGR_I420)
        print(f'before write {res.status} {res.id} :: {datetime.datetime.now()}')
        #path = f'{storage}/{channel_name}/{res.status}-{res.id}.h264'

        # img is a single frame; write all of its raw bytes at once
        process.stdin.write(img.astype(np.uint8).tobytes())
    process.stdin.close()
    process.wait()

ayushjn20 commented 3 years ago

@jpreiss Quoting you,

Since our "input" is a stream of raw video frames over a pipe, it should not contain any timestamps at all, so it makes sense that we would need some mechanism of specifying timestamps like the "input" option.

@jpreiss @kylemcdonald What is that mechanism where we can specify timestamps in the input options?

I am facing similar issues because the original input video (the one the frames need to be extracted from) has a variable FPS. The frames are extracted using OpenCV's VideoCapture, then each frame is processed independently, and the sequence of processed frames has to be written out to a new video again. I am using the same method to convert that sequence of processed images to video as mentioned by @kylemcdonald here.

jpreiss commented 3 years ago

@ayushjn20 sorry, I have no idea how to work with variable frame rates.

CharlesSS07 commented 3 years ago

@jaehobang Were you able to figure out this problem? Because I am having the same problem with a [Errno 32] Broken pipe error.

[Errno 32] Broken pipe means the ffmpeg process errored out and closed, so there was no longer a pipe to write input into. You can figure out what the error was by looking at process.stderr, like so:

import ffmpeg
import io
import numpy as np

def vidwrite(fn, images, framerate=60, vcodec='libx264'):
    if not isinstance(images, np.ndarray):
        images = np.asarray(images)
    _, height, width, channels = images.shape
    process = (
        ffmpeg
            .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height), r=framerate)
            .output(fn, pix_fmt='yuv420p', vcodec=vcodec, r=framerate)
            .overwrite_output()
            .run_async(pipe_stdin=True, pipe_stderr=True)
    )
    for frame in images:
        try:
            process.stdin.write(
                frame.astype(np.uint8).tobytes()
            )
        except BrokenPipeError:
            # the process has died; print everything it wrote to stderr
            for line in io.TextIOWrapper(process.stderr, encoding="utf-8"):
                print(line)
            process.stdin.close()
            process.wait()
            return  # can't write anymore, so end the loop and the function
    # normal completion: close stdin and wait for ffmpeg to finish encoding
    process.stdin.close()
    process.wait()

In my case, it was just

Unknown encoder 'libx264'

because I hadn't installed that library.

omrivolk commented 2 years ago

@kylemcdonald do you know how to achieve this with RGBA? My numpy array shape is (m,n,4), with the 4th channel being the opacity between 0 and 1. I tried switching rgb24 to rgba and yuv420p to yuva420p, but it does not work; the alpha values are ignored.

I want to overlay my video on a map so I need some parts to be transparent.
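For what it's worth, H.264 with yuv420p cannot carry an alpha channel, but VP9 in WebM can. A minimal sketch under that assumption (width, height and framerate as in the earlier examples; note the alpha values must be scaled to 0-255 uint8 like the other channels):

process = (
    ffmpeg
    .input('pipe:', format='rawvideo', pix_fmt='rgba', s='{}x{}'.format(width, height), r=framerate)
    # libvpx-vp9 with yuva420p keeps the alpha plane; WebM is the container
    .output('overlay.webm', pix_fmt='yuva420p', vcodec='libvpx-vp9')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)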

antortjim commented 2 years ago

Is it possible to implement CUDA support in this python wrapper? I have made a small repository where I write a constant random frame to video using ffmpeg with CUDA support, but I am not getting the performance I expected.

Maybe my ffmpeg flags are not correct?

Any help would be very appreciated :)

PS: I built ffmpeg with CUDA support enabled. PPS: I am trying to do this because cv2.cuda.VideoWriter is apparently not supported on Linux at the moment: https://github.com/opencv/opencv_contrib/issues/3044
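On the encoding side at least, ffmpeg-python just passes the codec name through, so an NVENC-enabled build can be driven like this (a sketch, assuming an ffmpeg built with NVENC support; width, height and framerate as in the earlier examples):

process = (
    ffmpeg
    .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height), r=framerate)
    # h264_nvenc offloads the encode to the GPU; the raw frames are still
    # piped in and converted to yuv420p on the CPU
    .output('out.mp4', pix_fmt='yuv420p', vcodec='h264_nvenc')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)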

LvJC commented 2 years ago

@jaehobang Have you figured it out? I have the same problem...

Yaxit commented 2 years ago

I'm following as well because I have a similar issue. My input buffer comes from a YouTube stream. In my case the conversion works out well, but I still get the BrokenPipe exception at the end. Any idea why this happens?

import ffmpeg
import pytube
from io import BytesIO

# download the audio stream into an in-memory buffer
buff = BytesIO()
streams = pytube.YouTube('https://www.youtube.com/watch?v=xxxxx').streams
streams.filter(only_audio=True).first().stream_to_buffer(buff)

buff.seek(0)
process = (
    ffmpeg
    .input('pipe:', ss=420, to=430, f='mp4')
    .output('out.wav', ac=1, ar=16000, acodec='pcm_s16le')
    .overwrite_output()
    .run_async(pipe_stdin=True)
)
process.stdin.write(buff.read())  # <-- BrokenPipe here
process.stdin.close()
process.wait()
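A likely cause is the to=430 option: once ffmpeg has read enough of the buffer to cover the requested range, it exits and closes its end of the pipe while Python is still writing the rest. If that is what is happening here, the exception is harmless and can be swallowed (a sketch):

try:
    process.stdin.write(buff.read())
    process.stdin.close()
except BrokenPipeError:
    pass  # ffmpeg already read all the bytes it needed for ss/to and exited
process.wait()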