Open gordan-bobic opened 9 years ago
Hello @gordan-bobic,
I'll follow up on this shortly.
Also, FYI, building on Linux no longer requires the Windows NVENC SDK. The Linux SDK v4.0.0 includes the required headers. It may help others trying to use this if the README file is updated with this info. I successfully built this on Linux using only the Linux SDK files.
Also - the nvenc patch applies cleanly against ffmpeg 2.5.1 so it may be an idea to rebase to the latest upstream ffmpeg while it's still easy to do. :)
I have tested 2.5.1 with nvenc and most of the libraries included on EL7 (I had to pull in many dependencies from various places). On a GT630 Kepler I get ~60fps at 1080p with the slow profile, and on a GTX860M Maxwell in a laptop with an Intel/Nvidia Optimus setup I get ~130fps (it has to be invoked via bumblebee optirun). Looking at the source, am I right in concluding that there are only fast and slow profiles with nothing in between, and that there is no veryslow?
Interesting...
Information from my part shows the following configuration as available:
Encoder nvenc [Nvidia NVENC h264 encoder]:
Threading capabilities: no
Supported pixel formats: nv12 yuv444p
nvenc AVOptions:
-preset
Also, on a sample encode via TraGtor:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/media/lin/CHEEKS/Cheeks Media Library/ShadowPlay/3DMark/3DMark 12.02.2014 - 05.39.31.35.mp4':
  Metadata:
    major_brand : mp42
    minor_version : 0
    compatible_brands: mp41isom
  Duration: 00:03:59.34, start: 0.000000, bitrate: 10138 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, smpte170m/smpte170m/bt470m), 1920x1080 [SAR 1:1 DAR 16:9], 10032 kb/s, 59.98 fps, 60 tbr, 60k tbn, 120 tbc (default)
    Metadata:
      handler_name : VideoHandler
      encoder : AVC Coding
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 97 kb/s (default)
    Metadata:
      handler_name : SoundHandler
[nvenc @ 0x2e91d60] Unsupported h264 profile requested, falling back to high
Output #0, matroska, to '/home/lin/Desktop/Encodes/3DMark 12.02.2014 - 05.39.31.35.mkv':
  Metadata:
    major_brand : mp42
    minor_version : 0
    compatible_brands: mp41isom
    encoder : Lavf56.15.100
    Stream #0:0(und): Video: h264 (nvenc) (H264 / 0x34363248), nv12, 1280x720 [SAR 1:1 DAR 16:9], q=-1--1, 1575 kb/s, 60 fps, 1k tbn, 60 tbc (default)
    Metadata:
      handler_name : VideoHandler
      encoder : Lavc56.13.100 nvenc
    Stream #0:1(und): Audio: aac (libfaac) ([255][0][0][0] / 0x00FF), 48000 Hz, stereo, s16, 96 kb/s (default)
    Metadata:
      handler_name : SoundHandler
      encoder : Lavc56.13.100 libfaac
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (nvenc))
  Stream #0:1 -> #0:1 (aac (native) -> aac (libfaac))
It seems to fall back on high profile, same as CRF on 18 in Visual Quality.
I think we are talking about three separate things here:
1) "profile" refers to the H.264/MPEG-4 AVC profile, i.e. which codec features are allowed, as explained here: http://en.wikipedia.org/wiki/H.264/MPEG-4_AVC I believe this is what it is referring to when it says "falling back to high".
2) "preset", usually specified as [ultrafast | veryfast | fast | medium | slow | veryslow | placebo], refers to the encoder settings in terms of how hard (and how CPU intensively) the codec should try to squeeze the video down to a certain quality level or data rate. Looking at the patch, NVENC seems to support fast and slow, but nothing in between, and no veryslow, which is what I normally use.
3) CRF varies the quantizer setting during the video so that low motion scenes are encoded with a low quantizer setting for better visual appearance, while high motion scenes are encoded with a higher quantizer setting to reduce the data rate. This works out well because the human eye is good at picking out detail in a static image and motion in a moving image, but not both at the same time, so in terms of visual perception the quality is constant, even though in mathematical terms it isn't.
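To make the difference in point 3 concrete, here is a toy sketch in Python. It is not x264's actual rate control algorithm. The per-frame motion scores, the `strength` parameter, and both function names are made up for illustration. It only shows the general idea: a fixed quantizer ignores content, while a CRF-style rule raises the quantizer on busy frames and lowers it on static ones.

```python
def constant_qp(motion_scores, qp=23):
    """Fixed quantizer: every frame gets the same QP regardless of content."""
    return [qp for _ in motion_scores]


def crf_like_qp(motion_scores, base_qp=23, strength=6, qp_min=0, qp_max=51):
    """Toy CRF-style rule: raise QP as motion increases.

    motion_scores are hypothetical per-frame values in [0, 1]
    (0 = static, 1 = very busy). This is an illustration only,
    not the algorithm x264 uses.
    """
    qps = []
    for m in motion_scores:
        # Centre the adjustment so an "average" frame (0.5) keeps base_qp.
        qp = base_qp + strength * (m - 0.5) * 2
        # Clamp to the valid H.264 quantizer range.
        qps.append(int(max(qp_min, min(qp_max, round(qp)))))
    return qps


if __name__ == "__main__":
    motion = [0.0, 0.25, 0.5, 1.0]
    print(constant_qp(motion))
    print(crf_like_qp(motion))
```

With these made-up scores, the static frame gets a lower quantizer (more bits, more detail) and the busiest frame gets a higher one, which is the behaviour described above.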
There does not appear to be a CRF setting available when NVENC is used. A fixed quantizer setting does appear to be supported, and the data rate is not mandatory. Unfortunately, setting -qp=23 (seemingly the default at CRF=18 with ffmpeg, according to the output it gives during encoding) and the slow preset without the data rate results in a file much bigger than what x264 produces with -crf 18 -preset veryslow. I'm not sure if something similar would be possible to implement with NVENC, but if it were possible it would certainly be useful, since the throughput on Maxwell is a match for my 12-core Xeon at a fraction of the power consumption.
Looks like this thread is dead, but I had the same question. Looking through NVENC documentation, however, I don't think Nvidia has opened that up as an option.
Yeah, that's unfortunate. NVENC is a great option when high speed and lower power consumption are priorities (or when the CPU is very underpowered), but in terms of achieving the optimal compromise between file size and visual quality, it seems libx264 is still the best option.
I found this parameter, by reading the official developer documentation:
Target quality: This mode is specified by setting rateControlMode to one of the VBR modes and desired target quality in targetQuality . The range of this target quality is 1 to 51, roughly corresponding to the range of possible QP values. In this mode, the encoder tries to maintain constant quality for each frame, by allowing the bitrate to vary subject to the bitrate parameters specified in maxBitRate and averageBitRate. If maxBitRate and averageBitRate are not specified, the encoder will use as many bits as needed to achieve the target quality. However, if both parameters are set, they will form the upper bound on the actual bitrate. The bit rate will become constrained, resulting in the desired target quality possibly not being achieved
So it should be enough to set the Level + Profile and the targetQuality. Since CRF takes values between 0 and 51, it looks like you could simply set it to a value of about 23, but I didn't test it.
Can you please verify it?
Edit: also found this
5.29.2.22 uint16_t NV_ENC_RC_PARAMS::targetQuality [in]: Target CQ (Constant Quality) level for VBR mode (range 0-51 with 0-automatic)
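For anyone who wants to try this from the command line, here is a minimal sketch in Python that builds an ffmpeg invocation. It rests on assumptions: a recent FFmpeg build where the encoder is named `h264_nvenc` and exposes `-rc vbr` and `-cq` (the latter mapping to `targetQuality`). The patch discussed in this thread only lists `-preset`, so none of this applies to that build. The function name and the file names are hypothetical.

```python
def nvenc_cq_command(src, dst, cq=23, preset="slow"):
    """Build an ffmpeg argv list for target-quality VBR with h264_nvenc.

    Assumes a recent FFmpeg where h264_nvenc accepts -rc and -cq.
    """
    if not 0 <= cq <= 51:
        raise ValueError("cq must be in 0..51 (0 means automatic)")
    return [
        "ffmpeg", "-i", src,
        "-c:v", "h264_nvenc",
        "-preset", preset,
        "-rc", "vbr",    # target quality requires one of the VBR modes
        "-cq", str(cq),  # assumed to map to NV_ENC_RC_PARAMS::targetQuality
        "-b:v", "0",     # assumption: leave the average bitrate unset
        "-c:a", "copy",
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    print(" ".join(nvenc_cq_command("input.mp4", "output.mkv")))
```

Building the argument list separately from running it makes it easy to inspect the command first; it can then be passed to `subprocess.run` on a machine with a suitable FFmpeg and GPU.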
@Grauen that sounds like the q (quantizer) parameter rather than the CRF. The difference is that CRF also considers how much motion is going on in the frames, and reduces quality for very busy sequences because human eyes can't pick out as much detail in fast motion.
@gordan-bobic You're right. Sorry I'm a noob. So this param would only work as a compressing method for single frames in a video. It will not compress multiple frames by detecting fast motions.
I'm not even sure how CUDA / NVENC works... but is it even possible to detect fast motions with nvenc encoding?
It'll work for video just fine; it just lacks the sophistication of the CRF method, which varies q based on how much motion is detected in the frame. There is no point maintaining a high level of detail in a fast-moving scene because human eyes wouldn't be able to pick out the detail anyway, so CRF increases the q value to reduce the bit rate. Since the bit rate spikes in fast-moving scenes anyway, this also results in smoothing out the bit rate. There's nothing wrong with encoding at fixed q; it's just not going to result in as tight a bit rate for any fixed visual quality.
Is there a way to implement some kind of an equivalent to CRF when using NVENC? I have been comparing the output quality of NVENC against x264 veryslow, and it is obviously not as good at the same bit rate. Worse, to get an idea of a reasonable bit rate, I encoded with x264 with -crf 18 and gave the resulting ABR to nvenc - except at that point there's no point in re-encoding with nvenc (other than for testing).
Is there a sensible way to implement the equivalent of crf, or a profile yielding similar visual performance and bit rate to what x264 yields with -crf 18?
As it is, nvenc is only useful in cases when we already know what exact bit rate we want to target and we care only about doing it as fast as possible (e.g. realtime transcoding to a target bit rate for streaming over a connection that cannot handle the full data rate of the raw content).
It is not useful in cases where what is required is the equivalent of -crf 18 -preset veryslow without the target bit rate (the use case that comes to mind is someone extracting their DVD collection for use with their Plex server).
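The workaround mentioned above (encode once with x264 at -crf 18, then hand the resulting average bit rate to nvenc) comes down to simple arithmetic. Here is a minimal sketch in Python; both function names are hypothetical. Note that a figure computed from the whole file includes audio and container overhead, so it slightly overstates the video bit rate.

```python
def duration_to_seconds(timestamp):
    """Convert an ffmpeg-style 'HH:MM:SS.ss' duration to seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def average_bitrate_kbps(size_bytes, duration_seconds):
    """Average bit rate of a file in kb/s (1 kb = 1000 bits)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds / 1000


if __name__ == "__main__":
    # Duration taken from the sample log in this thread; the file size
    # is a made-up number for illustration.
    seconds = duration_to_seconds("00:03:59.34")
    print(average_bitrate_kbps(50_000_000, seconds))
```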