i-rinat / freshplayerplugin

ppapi2npapi compatibility layer
MIT License

Now Flash videos are a pain very very slow #327

Open Elrondo46 opened 8 years ago

Elrondo46 commented 8 years ago

firefox

libva info: VA-API version 0.39.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_0_35
libva info: va_openDriver() returns 0
NOT SANDBOXED
[fresh  5201] not implemented: PPB_OpenGLES2VertexArrayObject;1.0
[fresh  5201] not implemented: PPB_OpenGLES2DrawBuffers(Dev);1.0
Vector smash protection is enabled.

I tried VDPAU too, with the same problem: it reports as enabled, but playback is slower than with software rendering.

I'm using the closed-source NVIDIA driver with the latest version of Firefox.

i-rinat commented 8 years ago

What does "now" mean? Was it better before and became worse recently? If so, what changed between the good and bad states? Did the freshplayerplugin version change?

There is currently a known problem in the code: it enables and disables the GL context on every GL call. On integrated adapters that is usually fine, but on discrete adapters, such as nVidia cards, it can cause significant synchronization overhead.
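
To make the cost pattern described above concrete, here is a toy model in Python. It is not freshplayerplugin's code and does not touch OpenGL; `FakeGLContext`, `draw_per_call`, and `draw_batched` are hypothetical names, and the model only counts how often a context would be bound and released under each strategy.

```python
from contextlib import contextmanager


class FakeGLContext:
    """Stand-in for a GL context; it only counts bind/release transitions."""

    def __init__(self):
        self.switches = 0

    @contextmanager
    def current(self):
        self.switches += 1      # bind (a real plugin would make the context current)
        try:
            yield self
        finally:
            self.switches += 1  # release


def draw_per_call(ctx, n_calls):
    """Bind and release around every single GL call (the costly pattern)."""
    for _ in range(n_calls):
        with ctx.current():
            pass  # one GL call would go here


def draw_batched(ctx, n_calls):
    """Bind once, issue all calls, release once."""
    with ctx.current():
        for _ in range(n_calls):
            pass  # one GL call would go here


per_call, batched = FakeGLContext(), FakeGLContext()
draw_per_call(per_call, 1000)
draw_batched(batched, 1000)
print(per_call.switches, batched.switches)  # prints: 2000 2
```

If each transition forces the driver to synchronize, the first pattern pays that price on every call, which would be consistent with the overhead on discrete adapters mentioned above.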

Elrondo46 commented 8 years ago

That's not a discrete card. I just have an NVIDIA GeForce 970 in the primary PCIe slot of my tower. Two weeks ago there was an update of freshplayer, and now the videos are slow, like software rendering...

I tried everything with both VDPAU and VA-API, with the same result. But on YouTube, the player reports that HW acceleration and decoding are on.

Elrondo46 commented 8 years ago

Better log:

NOT SANDBOXED
[fresh 31540] not implemented: PPB_FileRef;1.2
[fresh 31540] not implemented: PPB_OpenGLES2VertexArrayObject;1.0
[fresh 31540] not implemented: PPB_OpenGLES2DrawBuffers(Dev);1.0
Vector smash protection is enabled.
[fresh] [warning] issue_frame, no free buffer available
[h264 @ 0x7fe71efe6000] co located POCs unavailable
[h264 @ 0x7fe71efe6000] co located POCs unavailable
[h264 @ 0x7fe71efe6000] co located POCs unavailable
[h264 @ 0x7fe71efe6000] co located POCs unavailable
[h264 @ 0x7fe71efe6000] co located POCs unavailable
[h264 @ 0x7fe71efe6000] co located POCs unavailable
[h264 @ 0x7fe71efe6000] co located POCs unavailable
[h264 @ 0x7fe71efe6000] co located POCs unavailable
[h264 @ 0x7fe71efe6000] illegal short term buffer state detected

i-rinat commented 8 years ago

> That's not a discrete card. Just have an nvidia GeForce 970 in primary PCIE port in my tower.

It's a discrete video adapter. The term doesn't mean something external to the computer; it means a device with its own memory. There are also so-called integrated video adapters, which share memory with the CPU.

> 2 weeks ago there is an update of freshplayer

From which version to which? Did you try the version from the repository's master branch?

> and now the videos are slow, like software rendering

Is only video decoding slow? Does it help to disable hardware decoding by setting enable_hwdec = 0 in the configuration file?

Elrondo46 commented 8 years ago

> It's a discrete video adapter. That term doesn't mean it's something external to the computer, it means a device with own memory. There are also so-called integrated video adapters, which share memory with a CPU.

OK, understood: I have a discrete card.

> From what to what? Did you try version from repository's master branch?

Sorry, I don't remember. I got the source/package from the AUR: https://aur.archlinux.org/packages/freshplayerplugin/ (I tried freshplayerplugin-git too).

> Only video decoding is slow? Does it help to disable it by setting enable_hwdec = 0 in configuration file?

I tried it, and it doesn't help at all; playback is still horrible.

Elrondo46 commented 8 years ago

To be precise, VDPAU is correctly enabled:

vdpauinfo
display: :0   screen: 0
API version: 1
Information string: NVIDIA VDPAU Driver Shared Library  361.42  Tue Mar 22 17:29:16 PDT 2016

Video surface:

name   width height types
-------------------------------------------
420     4096  4096  NV12 YV12 
422     4096  4096  UYVY YUYV 

Decoder capabilities:

name                        level macbs width height
----------------------------------------------------
MPEG1                           0 65536  4080  4080
MPEG2_SIMPLE                    3 65536  4080  4080
MPEG2_MAIN                      3 65536  4080  4080
H264_BASELINE                  41 65536  4096  4096
H264_MAIN                      41 65536  4096  4096
H264_HIGH                      41 65536  4096  4096
VC1_SIMPLE                      1  8190  2048  2048
VC1_MAIN                        2  8190  2048  2048
VC1_ADVANCED                    4  8190  2048  2048
MPEG4_PART2_SP                  3  8192  2048  2048
MPEG4_PART2_ASP                 5  8192  2048  2048
DIVX4_QMOBILE                   0  8192  2048  2048
DIVX4_MOBILE                    0  8192  2048  2048
DIVX4_HOME_THEATER              0  8192  2048  2048
DIVX4_HD_1080P                  0  8192  2048  2048
DIVX5_QMOBILE                   0  8192  2048  2048
DIVX5_MOBILE                    0  8192  2048  2048
DIVX5_HOME_THEATER              0  8192  2048  2048
DIVX5_HD_1080P                  0  8192  2048  2048
H264_CONSTRAINED_BASELINE      41 65536  4096  4096
H264_EXTENDED                  41 65536  4096  4096
H264_PROGRESSIVE_HIGH          41 65536  4096  4096
H264_CONSTRAINED_HIGH          41 65536  4096  4096
H264_HIGH_444_PREDICTIVE       41 65536  4096  4096
HEVC_MAIN                      --- not supported ---
HEVC_MAIN_10                   --- not supported ---
HEVC_MAIN_STILL                --- not supported ---
HEVC_MAIN_12                   --- not supported ---
HEVC_MAIN_444                  --- not supported ---

Output surface:

name              width height nat types
----------------------------------------------------
B8G8R8A8         16384 16384    y  Y8U8V8A8 V8U8Y8A8 A4I4 I4A4 A8I8 I8A8 
R10G10B10A2      16384 16384    y  Y8U8V8A8 V8U8Y8A8 A4I4 I4A4 A8I8 I8A8 

Bitmap surface:

name              width height
------------------------------
B8G8R8A8         16384 16384
R8G8B8A8         16384 16384
R10G10B10A2      16384 16384
B10G10R10A2      16384 16384
A8               16384 16384

Video mixer:

feature name                    sup
------------------------------------
DEINTERLACE_TEMPORAL             y
DEINTERLACE_TEMPORAL_SPATIAL     y
INVERSE_TELECINE                 y
NOISE_REDUCTION                  y
SHARPNESS                        y
LUMA_KEY                         y
HIGH QUALITY SCALING - L1        y
HIGH QUALITY SCALING - L2        -
HIGH QUALITY SCALING - L3        -
HIGH QUALITY SCALING - L4        -
HIGH QUALITY SCALING - L5        -
HIGH QUALITY SCALING - L6        -
HIGH QUALITY SCALING - L7        -
HIGH QUALITY SCALING - L8        -
HIGH QUALITY SCALING - L9        -

parameter name                  sup      min      max
-----------------------------------------------------
VIDEO_SURFACE_WIDTH              y         1     4096
VIDEO_SURFACE_HEIGHT             y         1     4096
CHROMA_TYPE                      y  
LAYERS                           y         0        4

attribute name                  sup      min      max
-----------------------------------------------------
BACKGROUND_COLOR                 y  
CSC_MATRIX                       y  
NOISE_REDUCTION_LEVEL            y      0.00     1.00
SHARPNESS_LEVEL                  y     -1.00     1.00
LUMA_KEY_MIN_LUMA                y  
LUMA_KEY_MAX_LUMA                y

Roger commented 8 years ago

@Elrondo46 try: grep freshplayerplugin /var/log/pacman.log to see which version you upgraded from and to
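
If the grep output is long, the upgrade lines can also be pulled out programmatically. The following is a minimal sketch, assuming pacman's usual `[timestamp] [ALPM] upgraded name (old -> new)` log line layout; the function name `upgrades` and the sample lines (including the version numbers) are made up for illustration.

```python
import re

# Assumed layout of an upgrade entry in /var/log/pacman.log:
# [2016-04-21 09:15] [ALPM] upgraded freshplayerplugin (1.0-1 -> 2.0-1)
UPGRADE_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[ALPM\] upgraded (?P<pkg>\S+) "
    r"\((?P<old>\S+) -> (?P<new>\S+)\)$"
)


def upgrades(log_text, package_prefix="freshplayerplugin"):
    """Return (timestamp, package, old_version, new_version) tuples."""
    found = []
    for line in log_text.splitlines():
        m = UPGRADE_RE.match(line.strip())
        if m and m.group("pkg").startswith(package_prefix):
            found.append(
                (m.group("when"), m.group("pkg"), m.group("old"), m.group("new"))
            )
    return found


# Made-up sample lines; real ones come from /var/log/pacman.log.
sample = """\
[2016-04-20 18:02] [ALPM] upgraded firefox (45.0.2-1 -> 46.0-1)
[2016-04-21 09:15] [ALPM] upgraded freshplayerplugin (1.0-1 -> 2.0-1)
"""
for when, pkg, old, new in upgrades(sample):
    print(f"{when}: {pkg} {old} -> {new}")
```

Matching on a prefix also catches variant package names that start with freshplayerplugin.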

hlavki commented 8 years ago

Same issue on openSUSE Tumbleweed with an NVIDIA GPU. It used to work, but I don't know which upgrade broke it. Maybe Firefox 46?

i-rinat commented 8 years ago

I haven't had a chance to test it on a machine with an nVidia GPU yet.

hlavki commented 8 years ago

@i-rinat let me know if I can somehow help

i-rinat commented 8 years ago

@hlavki, I've just noticed you're using openSUSE Tumbleweed. Are you using a precompiled package from the repository?

As far as I can tell, they are building it with native GLES2, which can be buggy on nVidia proprietary drivers. I also see potential use of GTK+ 3 there, which can end up mixing both GTK+ 2 and GTK+ 3 in a single process, with unpredictable results. Could you compile freshplayerplugin from source? (If you don't have ffmpeg, you'll probably need to disable hw accelerated decoding: cmake -DWITH_HWDEC=0 ..)

hlavki commented 8 years ago

I tried building with and without -DWITH_HWDEC=0, and the result is the same. I think it's even slower.

i-rinat commented 8 years ago

(Just to be sure.) You made a fresh source checkout, created a build directory, ran cmake -DWITH_HWDEC=0 .. there, then make, and then copied the generated libfreshwrapper-flashplayer.so into ~/.mozilla/plugins/. There are no other compiled instances of freshplayerplugin installed on the machine where you are testing. And you used the original code from this repository without modifications. Is that correct?

Aside from that, does "videos are slow" include high CPU usage? If yes, could you paste the perf report output here? To do that, you'll need a writable directory. In that directory, run

perf record -a sleep 60

as the root user (or via sudo). That records performance counters for 60 seconds. During those 60 seconds, open a video; in other words, try to reproduce the bug. When perf record completes, run the following in the same directory, also as root:

perf report

I need about the first 30-40 lines of that output. You can also redirect the output of perf report to a file. That changes the output format a bit, but that's fine too. Again, the full output isn't required (it's huge); the first 30-40 lines are good enough.

perf is a performance measurement tool built from the Linux kernel sources. I guess it should be in the "perf" package on openSUSE.

hlavki commented 8 years ago

Yes, the build was the same as you describe. I removed the distro's freshplayerplugin package, built with cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_HWDEC=0 .., then make, and then copied the generated libfreshwrapper-flashplayer.so into ~/.mozilla/plugins/

One thing: CPU usage is higher, but not 100%, and it was spread across different cores. The output from perf is:

Samples: 378K of event 'cycles:pp', Event count (approx.): 209043432306
Overhead  Command          Shared Object                            Symbol
   7,78%  X                [kernel.kallsyms]                        [k] copy_user_generic_unrolled
   6,79%  swapper          [kernel.kallsyms]                        [k] intel_idle
   2,84%  plugin-containe  [kernel.kallsyms]                        [k] copy_user_generic_unrolled
   2,35%  X                nvidia_drv.so                            [.] 0x0000000000098987
   2,03%  X                nvidia_drv.so                            [.] 0x000000000009901b
   2,01%  X                nvidia_drv.so                            [.] 0x0000000000099018
   1,94%  X                nvidia_drv.so                            [.] 0x000000000009901d
   1,72%  X                nvidia_drv.so                            [.] 0x0000000000099000
   1,48%  X                nvidia_drv.so                            [.] 0x0000000000056993
   1,38%  X                nvidia_drv.so                            [.] 0x000000000009900d
   1,37%  X                nvidia_drv.so                            [.] 0x0000000000056a20
   1,36%  X                nvidia_drv.so                            [.] 0x00000000000ac17c
   1,35%  X                nvidia_drv.so                            [.] 0x00000000000ac18c
   1,26%  X                nvidia_drv.so                            [.] 0x00000000000ac182
   1,20%  X                nvidia_drv.so                            [.] 0x000000000009898b
   1,18%  X                nvidia_drv.so                            [.] 0x0000000000099014
   0,99%  X                [kernel.kallsyms]                        [k] _raw_spin_lock_irqsave
   0,94%  X                nvidia_drv.so                            [.] 0x00000000000ac177
   0,89%  X                nvidia_drv.so                            [.] 0x00000000000569fe
   0,85%  X                nvidia_drv.so                            [.] 0x0000000000056a06
   0,83%  X                nvidia_drv.so                            [.] 0x0000000000098980
   0,81%  X                nvidia_drv.so                            [.] 0x0000000000099001
   0,78%  X                nvidia_drv.so                            [.] 0x00000000000569ef
   0,74%  X                nvidia_drv.so                            [.] 0x00000000000569a0
   0,71%  X                nvidia_drv.so                            [.] 0x0000000000056a14
   0,70%  X                nvidia_drv.so                            [.] 0x0000000000056a2a
   0,69%  X                nvidia_drv.so                            [.] 0x0000000000056996
   0,68%  X                nvidia_drv.so                            [.] 0x0000000000056914
   0,68%  X                nvidia_drv.so                            [.] 0x00000000000569ec
   0,67%  X                nvidia_drv.so                            [.] 0x0000000000056927
   0,67%  X                nvidia_drv.so                            [.] 0x0000000000056a28
   0,67%  X                nvidia_drv.so                            [.] 0x00000000000569a2

i-rinat commented 8 years ago

A lot of samples point to the X server and its nvidia_drv.so module. Perhaps the particular set of commands that freshplayerplugin sends makes it unhappy and forces it to fall back to processing on the CPU instead of the GPU.

By the way, I've just learned a new trick with perf (there are still a lot of things to learn). If you still have the perf.data file generated by running "perf record", it's possible to make histograms by calling

perf script -s /usr/lib/perf-core/scripts/python/event_analyzing_sample.py > log.txt

But I think, there will be the X server on the first line.
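
For a quick look without the perf script, the rows of the pasted perf report can also be summed per command and shared object. This is a rough sketch, assuming the column layout shown in this thread (columns separated by two or more spaces, decimal comma in the Overhead column); the function name is hypothetical, and the sample is just the first five rows quoted earlier.

```python
import re
from collections import defaultdict


def overhead_by_command_and_object(report_text):
    """Sum the Overhead column per (command, shared object) pair."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        # Columns in the pasted report are separated by two or more spaces.
        fields = re.split(r"\s{2,}", line.strip(), maxsplit=3)
        if len(fields) != 4 or not fields[0].endswith("%"):
            continue  # header, blank line, or an unexpected layout
        # The pasted report uses a decimal comma ("7,78%"); accept both forms.
        percent = float(fields[0].rstrip("%").replace(",", "."))
        totals[(fields[1], fields[2])] += percent
    return dict(totals)


# First five rows of the report quoted in this thread.
sample = """\
   7,78%  X                [kernel.kallsyms]                        [k] copy_user_generic_unrolled
   6,79%  swapper          [kernel.kallsyms]                        [k] intel_idle
   2,84%  plugin-containe  [kernel.kallsyms]                        [k] copy_user_generic_unrolled
   2,35%  X                nvidia_drv.so                            [.] 0x0000000000098987
   2,03%  X                nvidia_drv.so                            [.] 0x000000000009901b
"""
ranked = sorted(
    overhead_by_command_and_object(sample).items(), key=lambda item: -item[1]
)
for (command, shared_object), percent in ranked:
    print(f"{percent:6.2f}%  {command:<16} {shared_object}")
```

Grouping by shared object makes it easier to see how much of the time lands in nvidia_drv.so even though it is spread over many unresolved addresses.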

i-rinat commented 8 years ago

Since high CPU usage (and therefore stuttering) may be caused by running 3D operations in software mode, it's worth trying to disable 3D completely.

That can be done by adding the line enable_3d = 0 to the ~/.config/freshwrapper.conf file and then restarting the browser.

The PepperFlash from ChromeOS seems to require 3D to show any visuals at all, so you have to use the desktop version of PepperFlash to try this.
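
For anyone toggling several options while testing, the edit can be scripted. The sketch below assumes the file consists of simple `key = value` lines as used in this thread; `set_option` is a hypothetical helper that works on the file's text, and reading or writing ~/.config/freshwrapper.conf itself is left out.

```python
import re


def set_option(conf_text, key, value):
    """Return conf_text with `key = value` set, replacing or appending the line."""
    line_re = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    new_line = f"{key} = {value}"
    if line_re.search(conf_text):
        # Replace every existing assignment of this key.
        return line_re.sub(lambda _match: new_line, conf_text)
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"


# Example text standing in for the contents of freshwrapper.conf.
conf = "enable_hwdec = 1\n"
conf = set_option(conf, "enable_hwdec", 0)  # replaces the existing line
conf = set_option(conf, "enable_3d", 0)     # appends a new line
print(conf, end="")
```

The browser still has to be restarted after the file changes, as noted above.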

i-rinat commented 8 years ago

There is also another issue, #332, related to OpenGL|ES 2 performance. I believe, it mostly affects discrete video adapters.

hlavki commented 8 years ago

@i-rinat wrote:

> But I think, there will be the X server on the first line.

Yes, you were right. The X server is on the first line:

            comm   number        histogram
==========================================
               X   270063     ###################
 plugin-containe    77036     #################
         swapper    66701     #################
         firefox    64531     ################
       JS Helper     6571     #############
         krunner     5776     #############
       ksysguard     5514     #############
      DOM Worker     5275     #############
   Socket Thread     4318     #############
        kwin_x11     2006     ###########

i-rinat commented 8 years ago

I had a chance to try it on a machine with nVidia drivers. The Flash version of YouTube plays just fine, with no significant load in either inline or fullscreen mode. The SWF on speedtest.net caused glXMakeCurrent to fail. That's not good, but it looks like Flash switches to CPU rendering and shows visuals anyway (the ChromeOS version will definitely fail in that case). I don't know why that is happening.

Also, hw accelerated decoding doesn't work: ffmpeg reports errors in the H.264 stream. I thought wrong buffer ordering was the real issue, but that was fixed some time ago. (And now I'm not sure that I tried the most recent version of code.)

Is there any sample URL with an example of slowness which I can try?

hlavki commented 8 years ago

Yes, I've tested it on stream.cz, e.g. https://www.stream.cz/zrouti/10010590-treti-epizoda#nejnovejsi

i-rinat commented 8 years ago

> https://www.stream.cz/zrouti/10010590-treti-epizoda#nejnovejsi

I see an HTML5 video player there (Firefox 46.0.1). No sign of Flash.

hlavki commented 8 years ago

As far as I know, if Flash is installed with the "Always Activate" option, then Flash is preferred.

i-rinat commented 8 years ago

> And now I'm not sure that I tried the most recent version of code.

Indeed, that was an old version. With the recent version, H.264 decoding is fine.

I also stumbled upon an error message about vsync. On my machine vsync was switched off automatically, but there could be cases where it causes issues.

@hlavki, try adding enable_vsync = 0 to ~/.config/freshwrapper.conf.

hlavki commented 8 years ago

Sorry for the delay; I had to wait for a new NVIDIA driver because of an incompatibility with kernel 4.6.x. But it was worth it. Now I use kernel 4.6.2 and NVIDIA 367.27 and it works flawlessly, even without enable_vsync = 0. So I think this was an upstream problem.

i-rinat commented 8 years ago

> and it works flawlessly

Let's hope the driver update fixes the issue for @Elrondo46 too. :smiley: