rigaya / NVEnc

Performance experiments on high-speed encoding with NVENC
https://rigaya34589.blog.fc2.com/blog-category-17.html
Other
1.03k stars 108 forks

Low transcoding performance with the latest release (v7.49) #577

Closed otuncelli closed 2 months ago

otuncelli commented 2 months ago

I'm getting only about 70% GPU Video Encoding engine utilization on my GTX 1050 Ti while transcoding a video with the latest version (v7.49).

7.48 doesn't have this problem. Everything else is the same.

Comparison 7.49 vs 7.48

More info:

bitspyer commented 2 months ago

I can confirm this issue. With 7.49 there is a frame drop of approx. 40%.

otuncelli commented 2 months ago

Also, --tune uhq and --tune lossless give me an error:

nvenc : Failed to Initialize the encoder
nvenc : .: 8 (NVENC indicates that one or more of the parameter passed to the API call is invalid.)

When I use an invalid value for --tune, I get:

Error: Invalid value "invalid" for "--tune"
  Option value should be one of below...
    auto, 0, 1, 2, 3

But the values it lists are rejected as well:

Error: Invalid value "auto" for "--tune"
  Option value should be one of below...
    auto, 0, 1, 2, 3

Jigsawbg commented 2 months ago

@otuncelli --tune uhq -> "Tune presets for latency tolerant encoding for higher quality. Only supported for HEVC on Turing+ architectures" (from the SDK). You are using a Pascal-architecture card (GTX 1050 Ti). You need at least a GTX 1630, which is Turing. As for --lossless, I don't know.
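
The constraint quoted from the SDK can be written down as a small pre-flight check. This is only a minimal sketch and not how NVEnc or the NVENC SDK actually decides: it assumes the CUDA compute capability is an acceptable stand-in for the GPU architecture (Pascal is 6.x, Turing is 7.5), and `supports_tune_uhq` is a hypothetical helper name.

```python
# Hypothetical helper; neither NVEnc nor the NVENC SDK exposes this function.
# Assumption: CUDA compute capability is used as a proxy for GPU architecture.
TURING_COMPUTE_CAPABILITY = (7, 5)  # Turing; Pascal is 6.x, Volta is 7.0


def supports_tune_uhq(compute_capability, codec):
    """Return True if --tune uhq is plausibly usable per the quoted SDK note.

    compute_capability: (major, minor) tuple, e.g. (6, 1) for a GTX 1050 Ti.
    codec: codec name such as "hevc" or "h264".
    """
    is_turing_or_newer = tuple(compute_capability) >= TURING_COMPUTE_CAPABILITY
    # The quoted SDK text restricts the preset to HEVC.
    return is_turing_or_newer and codec.lower() == "hevc"


if __name__ == "__main__":
    print(supports_tune_uhq((6, 1), "hevc"))  # GTX 1050 Ti (Pascal)
    print(supports_tune_uhq((7, 5), "hevc"))  # a Turing card
```

The encoder block on a given card does not always match its compute capability, so the error reported by NVEnc itself remains the authoritative answer.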

otuncelli commented 2 months ago

> @otuncelli --tune uhq -> "Tune presets for latency tolerant encoding for higher quality. Only supported for HEVC on Turing+ architectures" from SDK. You use Pascal architecture (GTX 1050 Ti). You need at least GTX 1630 which is Turing.

Ah, I see. Thank you for the info.

rigaya commented 2 months ago

The performance drop was caused by the new --lookahead-level option added in NVEnc 7.49, which defaulted to auto.

I've changed the default to 0 in NVEnc 7.50, which should remove the performance drop.
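
For anyone staying on 7.49, passing the option explicitly should have the same effect as the new default, though that is an inference from the explanation above and not something tested in this thread. A minimal sketch, assuming the executable is named `NVEncC64` and using placeholder file names; `build_nvenc_args` is a hypothetical helper, and only `--lookahead-level` with its values `auto` and `0` comes from this thread.

```python
import shlex


def build_nvenc_args(input_path, output_path, lookahead_level="0",
                     executable="NVEncC64"):
    """Build an NVEncC argument list with an explicit --lookahead-level.

    The executable name and file paths are assumptions for illustration.
    """
    return [
        executable,
        "-i", input_path,
        "-o", output_path,
        # Pin the option instead of relying on the version's default
        # ("auto" in 7.49, "0" from 7.50 according to this thread).
        "--lookahead-level", str(lookahead_level),
    ]


if __name__ == "__main__":
    args = build_nvenc_args("input.mp4", "output.mp4")
    # Print the command for inspection; pass `args` to subprocess.run to execute it.
    print(shlex.join(args))
```

Pinning the value in scripts keeps behaviour stable across versions whose defaults differ.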

otuncelli commented 2 months ago

Awesome! I can confirm the issue has been fixed.

I'm guessing the performance drop at higher lookahead levels is expected, since I'm using a Pascal-architecture card.