intel / intel-vaapi-driver

VA-API user mode driver for Intel GEN Graphics family
https://01.org/linuxmedia

Heavy Work Let X11 Reset Or GPU Hang #472

Closed zhoub closed 5 years ago

zhoub commented 5 years ago

Hi!

This is still on the Pentium N4200.

I run a very heavy OpenGL program that decodes MJPEG and paints the frames with vaCopySurfaceGLX; intel_gpu_top shows the GPU at least 50% busy.

Then I start h264encode, maybe 2 or 3 instances, all encoding the same video file, so that intel_gpu_top shows the graphics core close to 90% busy.

Sometimes, when a new instance of h264encode starts, the screen goes black, all the programs crash without any warning, and the system drops back to the desktop login window.

I really have no idea what causes this, so I am leaving a note here. Could somebody share some tips on how to debug the driver or log the system? Thanks very much.

shirokovoi commented 5 years ago

Is there something in dmesg.log?

zhoub commented 5 years ago

dmesg.log

[ 3072.766869] Hardware name: AAEON UP-APL01/UP-APL01, BIOS UPA1AM21 09/01/2017
[ 3072.766870] Call Trace:
[ 3072.766882]  dump_stack+0x63/0x8b
[ 3072.766886]  dump_header+0x77/0x285
[ 3072.766890]  ? security_capable_noaudit+0x4b/0x70
[ 3072.766892]  oom_kill_process+0x22e/0x450
[ 3072.766894]  out_of_memory+0x11d/0x4c0
[ 3072.766897]  __alloc_pages_slowpath+0xda2/0xe90
[ 3072.766901]  ? alloc_pages_current+0x6a/0xe0
[ 3072.766903]  __alloc_pages_nodemask+0x265/0x280
[ 3072.766906]  alloc_pages_current+0x6a/0xe0
[ 3072.766909]  __page_cache_alloc+0x86/0x90
[ 3072.766911]  filemap_fault+0x340/0x750
[ 3072.766914]  ? filemap_map_pages+0x185/0x3f0
[ 3072.766917]  ext4_filemap_fault+0x31/0x44
[ 3072.766920]  __do_fault+0x24/0xec
[ 3072.766922]  __handle_mm_fault+0xd05/0x11b0
[ 3072.766925]  handle_mm_fault+0xcc/0x1c0
[ 3072.766929]  __do_page_fault+0x260/0x500
[ 3072.766931]  do_page_fault+0x2e/0xf0
[ 3072.766934]  ? page_fault+0x2f/0x50
[ 3072.766935]  page_fault+0x45/0x50
[ 3072.766939] RIP: 0033:0x7fb8331844f0
[ 3072.766940] RSP: 002b:00007fb8312a9b98 EFLAGS: 00010206
[ 3072.766942] RAX: 0000000000000000 RBX: 00007fb824002340 RCX: 0000000000000000
[ 3072.766943] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[ 3072.766944] RBP: 00007fb8312a9c00 R08: 0000000000c63380 R09: 0000000000000000
[ 3072.766946] R10: 00007fb8312a9890 R11: 0000000000000000 R12: 0000000000c63000
[ 3072.766947] R13: 0000000000000001 R14: 00007fb8312a9c10 R15: 0000000000c633b0
[ 3072.766956] Mem-Info:
[ 3072.766962] active_anon:67259 inactive_anon:1750746 isolated_anon:0
                active_file:23 inactive_file:66 isolated_file:0
                unevictable:8 dirty:2 writeback:0 unstable:0
                slab_reclaimable:11036 slab_unreclaimable:8187
                mapped:15875 shmem:1762011 pagetables:4712 bounce:0
                free:26636 free_pcp:7 free_cma:0
[ 3072.766966] Node 0 active_anon:269036kB inactive_anon:7002984kB active_file:92kB inactive_file:264kB unevictable:32kB isolated(anon):0kB isolated(file):0kB mapped:63500kB dirty:8kB writeback:0kB shmem:7048044kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 3072.766968] Node 0 DMA free:15896kB min:140kB low:172kB high:204kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15896kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 3072.766972] lowmem_reserve[]: 0 1314 7310 7310 7310
[ 3072.766976] Node 0 DMA32 free:35644kB min:12128kB low:15160kB high:18192kB active_anon:4580kB inactive_anon:1354296kB active_file:12kB inactive_file:4kB unevictable:0kB writepending:0kB present:1467688kB managed:1400716kB mlocked:0kB kernel_stack:80kB pagetables:100kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 3072.766980] lowmem_reserve[]: 0 0 5995 5995 5995
[ 3072.766984] Node 0 Normal free:55004kB min:55308kB low:69132kB high:82956kB active_anon:264456kB inactive_anon:5648272kB active_file:232kB inactive_file:172kB unevictable:32kB writepending:8kB present:6291456kB managed:6145576kB mlocked:32kB kernel_stack:5200kB pagetables:18748kB bounce:0kB free_pcp:28kB local_pcp:28kB free_cma:0kB
[ 3072.766988] lowmem_reserve[]: 0 0 0 0 0
[ 3072.766992] Node 0 DMA: 2*4kB (U) 2*8kB (U) 2*16kB (U) 3*32kB (U) 2*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB
[ 3072.767006] Node 0 DMA32: 203*4kB (UM) 136*8kB (UM) 163*16kB (UME) 135*32kB (UME) 107*64kB (UM) 67*128kB (UME) 43*256kB (UME) 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB = 35772kB
[ 3072.767021] Node 0 Normal: 2133*4kB (UME) 1438*8kB (UM) 1005*16kB (UMEH) 360*32kB (UMEH) 101*64kB (MEH) 2*128kB (H) 1*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 55636kB
[ 3072.767037] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 3072.767038] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 3072.767039] 1762129 total pagecache pages
[ 3072.767042] 46 pages in swap cache
[ 3072.767046] Swap cache stats: add 498597, delete 498551, find 138/182
[ 3072.767047] Free swap  = 0kB
[ 3072.767047] Total swap = 999420kB
[ 3072.767048] 1943782 pages RAM
[ 3072.767049] 0 pages HighMem/MovableOnly
[ 3072.767050] 53235 pages reserved
[ 3072.767050] 0 pages cma reserved
[ 3072.767051] 0 pages hwpoisoned
[ 3072.767052] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 3072.767068] [  273]     0   273     8849      779   106496        0             0 systemd-journal
[ 3072.767071] [  297]     0   297    11565      479   122880        0         -1000 systemd-udevd
[ 3072.767074] [  578]   100   578    25596       54   106496        0             0 systemd-timesyn
[ 3072.767076] [  710]     0   710     9019       69   106496        0             0 cron
[ 3072.767079] [  711]     0   711     7989      105   110592        0             0 bluetoothd
[ 3072.767081] [  716]     0   716    74613      230   217088        0             0 accounts-daemon
[ 3072.767083] [  720]     0   720     7156       98    98304        0             0 systemd-logind
[ 3072.767086] [  723]   106   723    11034      350   131072        0          -900 dbus-daemon
[ 3072.767089] [  769]     0   769   169069      840   372736        0             0 NetworkManager
[ 3072.767091] [  775]     0   775    43375      180   176128        0             0 thermald
[ 3072.767093] [  782]     0   782     1099       21    57344        0             0 acpid
[ 3072.767096] [  787]   104   787    64099      221   131072        0             0 rsyslogd
[ 3072.767098] [  791]   111   791    11227       98   131072        0             0 avahi-daemon
[ 3072.767100] [  881]   111   881    11196       82   122880        0             0 avahi-daemon
[ 3072.767102] [  901]     0   901     4893       74    81920        0             0 irqbalance
[ 3072.767105] [  926]     0   926    73049      245   212992        0             0 lightdm
[ 3072.767107] [  932]     0   932    76113      703   229376        0             0 polkitd
[ 3072.767109] [  959]     0   959    16378      172   167936        0         -1000 sshd
[ 3072.767112] [  999]     0   999    11036      198   131072        0             0 wpa_supplicant
[ 3072.767114] [ 1123]   118  1123    45886       70   126976        0             0 rtkit-daemon
[ 3072.767117] [ 1137]     0  1137    88558      351   249856        0             0 upowerd
[ 3072.767120] [ 1172]   113  1172    80167      723   258048        0             0 colord
[ 3072.767122] [ 1181]     0  1181     4033      219    77824        0             0 dhclient
[ 3072.767125] [ 1193] 65534  1193    14983       95   147456        0             0 dnsmasq
[ 3072.767127] [ 1365]   109  1365   113051      430   376832        0             0 whoopsie
[ 3072.767130] [ 1374]     0  1374     5751       34    73728        0             0 agetty
[ 3072.767132] [ 1375]     0  1375    24384      231   225280        0             0 x11vnc
[ 3072.767135] [ 1591]  1000  1591    11320      207   135168        0             0 systemd
[ 3072.767137] [ 1596]  1000  1596    52766      563   184320        0             0 (sd-pam)
[ 3072.767140] [ 2052]  1000  2052     2786       78    61440        0             0 ssh-agent
[ 3072.767142] [ 2150]     0  2150    95572      457   225280        0             0 udisksd
[ 3072.767144] [ 2575]     0  2575    25086      284   225280        0             0 cupsd
[ 3072.767146] [ 2576]     0  2576    68702      321   299008        0             0 cups-browsed
[ 3072.767149] [ 2893]     0  2893   120554    13332   589824        0             0 Xorg
[ 3072.767151] [ 3001]     0  3001    57576      186   217088        0             0 lightdm
[ 3072.767153] [ 3084]     0  3084    34401     6649   311296        0             0 x11vnc
[ 3072.767156] [ 3234]  1000  3234    53068      155   163840        2             0 gnome-keyring-d
[ 3072.767158] [ 3236]  1000  3236    13345      249   139264        0             0 upstart
[ 3072.767161] [ 3305]  1000  3305     9982       65   102400        0             0 upstart-udev-br
[ 3072.767163] [ 3315]  1000  3315    10845      256   122880        0             0 dbus-daemon
[ 3072.767165] [ 3327]  1000  3327    23353      158   167936        0             0 window-stack-br
[ 3072.767172] [ 3367]  1000  3367   108616     1231   471040        0             0 bamfdaemon
[ 3072.767175] [ 3377]  1000  3377     9965       79    94208        0             0 upstart-dbus-br
[ 3072.767177] [ 3379]  1000  3379     9965       79    94208        0             0 upstart-dbus-br
[ 3072.767180] [ 3381]  1000  3381    12088       99   114688        0             0 upstart-file-br
[ 3072.767182] [ 3383]  1000  3383    88421      184   184320        0             0 at-spi-bus-laun
[ 3072.767184] [ 3387]  1000  3387    70397      184   176128        0             0 gvfsd
[ 3072.767187] [ 3392]  1000  3392   104991      189   184320        0             0 gvfsd-fuse
[ 3072.767189] [ 3396]  1000  3396    10724      119   126976        0             0 dbus-daemon
[ 3072.767192] [ 3405]  1000  3405    51716      117   172032        0             0 at-spi2-registr
[ 3072.767195] [ 3414]  1000  3414    43401       62    90112        8             0 gpg-agent
[ 3072.767197] [ 3415]     0  3415     4033      223    73728        0             0 dhclient
[ 3072.767200] [ 3561]  1000  3561     1126       25    53248        0             0 sh
[ 3072.767202] [ 3572]  1000  3572   105936      509   471040        0             0 xfce4-session
[ 3072.767205] [ 3577]  1000  3577    11910      125   139264        0             0 xfconfd
[ 3072.767208] [ 3581]  1000  3581     2786       78    61440        0             0 ssh-agent
[ 3072.767210] [ 3585]  1000  3585   106758     1261   458752        0             0 xfwm4
[ 3072.767213] [ 3589]  1000  3589   112812      980   495616        0             0 xfce4-panel
[ 3072.767215] [ 3591]  1000  3591   106122      502   454656        0             0 Thunar
[ 3072.767217] [ 3593]  1000  3593   146382     6460   577536        0             0 xfdesktop
[ 3072.767220] [ 3594]  1000  3594   115405      634   466944        0             0 xfsettingsd
[ 3072.767222] [ 3595]  1000  3595   183033     4248   716800        0             0 psensor
[ 3072.767225] [ 3602]  1000  3602    17270      157   172032        0             0 xscreensaver
[ 3072.767227] [ 3613]  1000  3613    86292     1116   434176        0             0 polkit-gnome-au
[ 3072.767229] [ 3614]  1000  3614   138114      539   454656        0             0 xfce4-volumed
[ 3072.767232] [ 3616]  1000  3616   112138      509   458752        0             0 zeitgeist-datah
[ 3072.767234] [ 3617]  1000  3617   128546     1204   499712        0             0 update-notifier
[ 3072.767237] [ 3621]  1000  3621   157259     1638   565248        0             0 nm-applet
[ 3072.767239] [ 3628]  1000  3628    60921     4151   376832        0             0 applet.py
[ 3072.767242] [ 3633]  1000  3633   149033      200   245760        0             0 deja-dup-monito
[ 3072.767244] [ 3636]  1000  3636     1126       18    57344        0             0 sh
[ 3072.767246] [ 3641]  1000  3641   101375      209   155648        0             0 zeitgeist-daemo
[ 3072.767249] [ 3650]  1000  3650    77707      231   180224        0             0 zeitgeist-fts
[ 3072.767251] [ 3659]  1000  3659    92519      659   393216        0             0 pulseaudio
[ 3072.767253] [ 3687]  1000  3687    75830      330   229376        0             0 gvfs-udisks2-vo
[ 3072.767256] [ 3702]  1000  3702    66677      138   139264        0             0 gvfs-mtp-volume
[ 3072.767258] [ 3705]  1000  3705   106826     1051   438272        0             0 notify-osd
[ 3072.767261] [ 3711]  1000  3711    66149       92   139264        0             0 gvfs-goa-volume
[ 3072.767263] [ 3717]  1000  3717    69726      167   167936        0             0 gvfs-gphoto2-vo
[ 3072.767265] [ 3721]  1000  3721   103311      505   442368        0             0 panel-6-systray
[ 3072.767267] [ 3727]  1000  3727   105789      581   462848        0             0 panel-2-actions
[ 3072.767269] [ 3731]  1000  3731   102670      259   229376        0             0 gvfs-afc-volume
[ 3072.767272] [ 3752]  1000  3752    92683      256   204800        0             0 gvfsd-trash
[ 3072.767274] [ 3758]  1000  3758    48252       81   139264        0             0 gvfsd-metadata
[ 3072.767276] [ 3792]  1000  3792   143984     1923   565248        0             0 gnome-terminal-
[ 3072.767279] [ 3798]  1000  3798     7462      521    94208        0             0 bash
[ 3072.767281] [ 3814]  1000  3814     7463      522    90112        0             0 bash
[ 3072.767284] [ 3830]  1000  3830   397859     8796   884736        0             0 TestOpenGL
[ 3072.767287] [ 3974]  1000  3974     8163      190   106496        0             0 galaxy
[ 3072.767291] Out of memory: Kill process 2893 (Xorg) score 6 or sacrifice child
[ 3072.767321] Killed process 2893 (Xorg) total-vm:482216kB, anon-rss:8580kB, file-rss:4kB, shmem-rss:44744kB
[ 3072.784320] oom_reaper: reaped process 2893 (Xorg), now anon-rss:0kB, file-rss:0kB, shmem-rss:44936kB
[ 3073.887653] wlp4s0: deauthenticating from 00:11:32:91:5e:91 by local choice (Reason: 3=DEAUTH_LEAVING)
[ 3094.178517] wlp4s0: authenticate with 00:11:32:91:5e:92
[ 3094.182579] wlp4s0: send auth to 00:11:32:91:5e:92 (try 1/3)
[ 3094.186191] wlp4s0: authenticated
[ 3094.187627] wlp4s0: associate with 00:11:32:91:5e:92 (try 1/3)
[ 3094.189114] wlp4s0: RX AssocResp from 00:11:32:91:5e:92 (capab=0x411 status=0 aid=4)
[ 3094.190028] wlp4s0: associated
[ 3094.313898] wlp4s0: Limiting TX power to 17 (20 - 3) dBm as advertised by 00:11:32:91:5e:92
[ 3104.275362] wlp4s0: disconnect from AP 00:11:32:91:5e:92 for new auth to 00:11:32:91:5e:91
[ 3104.280250] wlp4s0: authenticate with 00:11:32:91:5e:91
[ 3104.284663] wlp4s0: send auth to 00:11:32:91:5e:91 (try 1/3)
[ 3104.292734] wlp4s0: authenticated
[ 3104.296188] wlp4s0: associate with 00:11:32:91:5e:91 (try 1/3)
[ 3104.300995] wlp4s0: RX ReassocResp from 00:11:32:91:5e:91 (capab=0x431 status=0 aid=2)
[ 3104.307023] wlp4s0: associated

It looks like the system ran out of memory at the end, but System Monitor does not show any leak in the TestOpenGL program.
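
The page counts in the Mem-Info block above can be converted to kB to see where the memory went. This is only a minimal sketch, assuming 4 kB pages (which agrees with the "Node 0" line of the same log, where shmem is reported as 7048044kB). The function name is made up for illustration, and the numbers are copied by hand from the log:

```python
PAGE_KB = 4  # assumption: 4 kB pages, consistent with the "Node 0" kB line in the log

# Values copied from the Mem-Info block of the dmesg output (unit: pages).
mem_info_pages = {
    "active_anon": 67259,
    "inactive_anon": 1750746,
    "shmem": 1762011,
    "free": 26636,
}
total_ram_pages = 1943782  # from the line "1943782 pages RAM"


def pages_to_kb(pages: int) -> int:
    """Convert a page count to kB (hypothetical helper for this example)."""
    return pages * PAGE_KB


shmem_kb = pages_to_kb(mem_info_pages["shmem"])
total_kb = pages_to_kb(total_ram_pages)
shmem_share = shmem_kb / total_kb

print(f"shmem: {shmem_kb} kB of {total_kb} kB RAM ({shmem_share:.0%})")
```

By this arithmetic roughly 91% of RAM is held as shared memory, while the rss column for TestOpenGL is only 8796 pages. That would be consistent with System Monitor showing no per-process leak. Whether those shmem pages belong to GPU buffers is an assumption that the log alone does not confirm.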

Thanks!

zhoub commented 5 years ago

It seems there is a memory leak in the decoding program. I will investigate it later.
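
One possible way to watch for such a leak while the decoding program runs is to sample /proc/meminfo periodically and look at the Shmem and MemAvailable fields. The sketch below is only an illustration: the function names, the sampling interval, and the choice of fields are assumptions, not something taken from the driver or from this thread.

```python
import time


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; the HugePages_* counters are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def watch(interval_s: float = 5.0, samples: int = 12) -> None:
    """Print Shmem and MemAvailable every interval_s seconds.

    Shmem growing steadily while MemAvailable shrinks hints at a leak.
    """
    for _ in range(samples):
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        print(time.strftime("%H:%M:%S"),
              "Shmem:", info.get("Shmem"), "kB",
              "MemAvailable:", info.get("MemAvailable"), "kB")
        time.sleep(interval_s)
```

Calling `watch()` in a second terminal while the decoder runs gives a rough trend. It does not say which process owns the memory, so it is a starting point rather than a diagnosis.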