FCLC / Multi-Plexer

Goal: Low power cluster capable of serving 24+ streams of 4KHDR60 source transcodes while consuming no more than 100W at peak and idling at less than 10W
MIT License

new version of vf scale cuda is a major bodge- but should be able to … #6

Closed FCLC closed 3 years ago

FCLC commented 3 years ago

…do reinhard tonemapping

copy of email to ffmpeg-devel mailing list follows:

Hey everyone!

Trying to wrap my mind around how to deal with cuda HW frames and how to implement them.

The goal of this filter, once completed, will be to take in a cuda frame and tonemap its values to a given specification using a user-requested algorithm (mobius, hable, reinhard, clip, etc.).
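As a rough CPU-side illustration of what "user-requested algorithm" could mean per sample, here is a minimal Python sketch of two of the named curves (clip and a basic Reinhard operator). It is not the attached filter's code: the function names, the `TONEMAPPERS` table, the `peak` parameter, and the assumption that input is linear light with 1.0 as SDR reference white are all hypothetical choices made for the example.

```python
# Hypothetical reference curves; names and scaling are assumptions, not the filter's code.

def tonemap_clip(x, peak):
    """Hard clip: anything above 1.0 is lost (peak is unused, kept for a uniform signature)."""
    return min(max(x, 0.0), 1.0)

def tonemap_reinhard(x, peak):
    """Basic Reinhard curve x / (1 + x), rescaled so that `peak` maps to 1.0."""
    return x / (1.0 + x) * (1.0 + peak) / peak

# Lookup table so the curve can be chosen by name, as a filter option would.
TONEMAPPERS = {
    "clip": tonemap_clip,
    "reinhard": tonemap_reinhard,
}

def tonemap_samples(samples, algorithm, peak):
    """Apply the chosen curve to every sample; a GPU kernel would do this once per thread."""
    curve = TONEMAPPERS[algorithm]
    return [curve(s, peak) for s in samples]

if __name__ == "__main__":
    hdr = [0.0, 0.5, 1.0, 4.0, 10.0]
    print(tonemap_samples(hdr, "clip", peak=10.0))
    print(tonemap_samples(hdr, "reinhard", peak=10.0))
```

Each output sample depends only on the matching input sample, which is why this kind of work maps naturally onto one GPU thread per pixel.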

This is useful because, once complete, it should outperform CPU-based tonemapping by 1-3 orders of magnitude, depending on the GPU used for the filter.

I've based the attached filter off of the vf_scale_cuda.cu filter.

For ease of development, I've kept everything the same, including the name of the filter, changing only the function within the file. This is very much a bodge to facilitate development. As such, for testing, this file should replace the existing file at ffmpeg/libavfilter/vf_scale_cuda.cu

FFmpeg should then be compiled as standard for cuda filters and should be called as you would call the standard vf_scale_cuda filter. The command would be similar to:

ffmpeg -y -vsync 0 -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -vf scale_cuda=Source_width:Source_Height -c:a copy -c:v h264_nvenc -b:v 5M output.mp4

The above should decode in hardware, tonemap the frame on gpu and re-encode in hardware at a given bitrate.

I'll be on freenode soon after sending this email (going to put on another cup of coffee).

Thanks,

FelixCLC (felix__)

Caveat: Like all HW filters, how effective this is will depend on how much overhead is incurred by doing a memcpy over the PCIe bus to the GPU itself, then passing the data back once processed.
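To get a feel for that copy overhead, here is a back-of-the-envelope Python sketch. Every number in it is an assumption rather than a measurement: it assumes 4K frames (3840x2160) in a 4:2:0 layout with 16-bit samples, and an effective bus bandwidth of 12 GB/s picked purely for illustration. The helper names are hypothetical.

```python
# Rough estimate only; frame layout and bandwidth are assumed values, not measurements.

def frame_bytes(width, height, bytes_per_sample=2, samples_per_pixel=1.5):
    """Size of one frame: 4:2:0 chroma gives 1.5 samples per pixel."""
    return int(width * height * samples_per_pixel * bytes_per_sample)

def copy_time_ms(nbytes, bandwidth_gb_s):
    """Time to move nbytes one way across the bus at the given GB/s."""
    return nbytes / (bandwidth_gb_s * 1e9) * 1e3

if __name__ == "__main__":
    size = frame_bytes(3840, 2160)             # bytes per 4K frame
    one_way = copy_time_ms(size, 12.0)         # assumed 12 GB/s effective bandwidth
    budget = 1000.0 / 60.0                     # time available per frame at 60 fps
    print(f"frame size: {size} bytes")
    print(f"upload + download: {2 * one_way:.2f} ms of a {budget:.2f} ms frame budget")
```

Under these assumptions a round trip eats a noticeable slice of each 60 fps frame interval, which is the argument for keeping frames in GPU memory from decode through encode whenever possible.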