QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

OOMs in dom0 on recent kernels #3079

Closed rootkovska closed 5 years ago

rootkovska commented 7 years ago

Qubes OS version (e.g., R3.2):

In the last few weeks I've been experiencing frequent Out Of Memory (OOM) killer actions in dom0, likely due to incorrect memory reporting by the dom0 kernel. This results in X in dom0 getting restarted. In order to recover the system, the user must:

  1. Log in again (this is easy, since X defaults to the login page after it gets restarted),
  2. Open a terminal and manually remove the /var/run/shm.id file,
  3. Log out and log in again (i.e. restart X again). (After this all the gui daemons connect back to all the open windows, so no work is lost.)

One problem is likely some (upstream?) bug in memory reporting, which results in the Qubes memory balancer (qmemman) incorrectly taking too much memory from dom0, leaving the OOM killer no choice but to kill some memory-hungry process (and X is often a good candidate).

Another problem (Qubes-related) is the need to manually remove the shm.id file and restart X. This should be done automatically, so that the user can just log in and have all the windows automatically restored (guid reconnected).
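For reference, a minimal sketch of the manual recovery described above, as it would look on R3.2 (run from a text console, e.g. Ctrl+Alt+F2; the commands are assumptions, not an official procedure):

# Only remove the stale file if no X server is still running; otherwise you would
# cut off the GUI daemons of a live session (see the shm.id discussion below).
if ! pgrep -x Xorg >/dev/null 2>&1; then
    sudo rm -f /var/run/shm.id
fi
# Then log back in through the display manager; the qubes-guid daemons
# reconnect to the existing AppVM windows, so no work is lost.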

rootkovska commented 7 years ago

BTW, a workaround suggested by Marek to increase the dom0-boost param in /etc/qubes/qmemman.conf does not work.
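For reference (illustrative values only, and the key may be spelled dom0-mem-boost in the shipped config): the setting lives in the [global] section of /etc/qubes/qmemman.conf, and qmemman has to be restarted for a change to take effect:

sudo cat /etc/qubes/qmemman.conf
# [global]
# vm-min-mem = 200MiB
# dom0-mem-boost = 650MiB    <- value raised from the (smaller) stock default
sudo systemctl restart qubes-qmemman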

rootkovska commented 7 years ago

FWIW, I double checked that my qmemman got restarted after I increased the dom0-boost param to 650 MB, and I'm still getting an OOM around once a day... :( I know @marmarek also experienced this, but I'm really surprised that nobody else reports this problem. For years I've never had any stability problems related to OOM (nor anything else), and then suddenly this happens all the time...

marmarek commented 7 years ago

On Tue, Sep 05, 2017 at 09:13:51AM -0700, Joanna Rutkowska wrote:

Qubes OS version (e.g., R3.2):

In the last few weeks I've been experiencing frequent Out Of Memory (OOM) killer actions in dom0, likely due to incorrect memory reporting by the dom0 kernel. This results in X in dom0 getting restarted. In order to recover the system, the user must:

  1. Log in again (this is easy, since X defaults to the login page after it gets restarted),
  2. Open a terminal and manually remove the /var/run/shm.id file,

Those two steps must be performed from a text console, not the GUI (X).

  3. Log out and log in again (i.e. restart X again). (After this all the gui daemons connect back to all the open windows, so no work is lost.)

Another problem (Qubes-related) is the need to manually remove the shm.id file and restart X. This should be done automatically, so that the user can just log in and have all the windows automatically restored (guid reconnected).

This is already fixed in Qubes 4.0. But the fix (or actually, the update procedure) is too fragile to backport to 3.2, because it involves renaming /var/run/shm.id, which is referenced from various places. Normally the X server removes this file on termination, but it doesn't get the chance when it's killed with SIGKILL (and that is what the OOM killer does). The reason not to override /var/run/shm.id at X server startup is that it may belong to another (still running!) X server instance. This is the case when clicking the "switch user" option, when using the default lightdm screen locker, and in many more situations. Overriding /var/run/shm.id in that case would mean losing GUI access to all AppVMs.


marmarek commented 7 years ago

One of my crashes looks like this: https://gist.github.com/marmarek/1215529d91f44254d38e0f30d8e084de Interesting part:

Sep 03 17:34:20 dom0 kernel: xen-blkback: backend/vbd/117/51712: using 4 queues, protocol 1 (x86_64-abi) persistent grants
Sep 03 17:34:20 dom0 kernel: xen-blkback: backend/vbd/117/51728: using 4 queues, protocol 1 (x86_64-abi) persistent grants
Sep 03 17:34:20 dom0 kernel: xenwatch invoked oom-killer: gfp_mask=0x240c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Sep 03 17:34:20 dom0 kernel: xenwatch cpuset=/ mems_allowed=0
(...)
Sep 03 17:34:20 dom0 kernel: Call Trace:
Sep 03 17:34:20 dom0 kernel:  [<ffffffff813e45e3>] dump_stack+0x63/0x90
Sep 03 17:34:20 dom0 kernel:  [<ffffffff8123fd7d>] dump_header+0x7b/0x1fd
Sep 03 17:34:20 dom0 kernel:  [<ffffffff817daf9a>] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Sep 03 17:34:20 dom0 kernel:  [<ffffffff811b6939>] oom_kill_process+0x219/0x3e0
Sep 03 17:34:20 dom0 kernel:  [<ffffffff811b6e33>] out_of_memory+0x143/0x4d0
Sep 03 17:34:20 dom0 kernel:  [<ffffffff811bc79d>] __alloc_pages_slowpath+0xbbd/0xc10
Sep 03 17:34:20 dom0 kernel:  [<ffffffff811bca46>] __alloc_pages_nodemask+0x256/0x2c0
Sep 03 17:34:20 dom0 kernel:  [<ffffffff81211105>] alloc_pages_current+0x95/0x140
Sep 03 17:34:20 dom0 kernel:  [<ffffffff811dd018>] kmalloc_order+0x18/0x40
Sep 03 17:34:20 dom0 kernel:  [<ffffffff811dd064>] kmalloc_order_trace+0x24/0xa0
Sep 03 17:34:20 dom0 kernel:  [<ffffffffc000e49f>] ? read_per_ring_refs+0x30f/0x400 [xen_blkback]
Sep 03 17:34:20 dom0 kernel:  [<ffffffffc000e2a4>] read_per_ring_refs+0x114/0x400 [xen_blkback]
Sep 03 17:34:20 dom0 kernel:  [<ffffffff813f1209>] ? snprintf+0x49/0x60
Sep 03 17:34:20 dom0 kernel:  [<ffffffffc000e8e1>] connect_ring.isra.7+0x351/0x4c0 [xen_blkback]
Sep 03 17:34:20 dom0 kernel:  [<ffffffffc000f23a>] frontend_changed+0xda/0x140 [xen_blkback]
Sep 03 17:34:20 dom0 kernel:  [<ffffffff814ceb83>] xenbus_otherend_changed+0x103/0x110

I think my original diagnosis was wrong - the OOM happens some time after qmemman takes memory from dom0, not during that operation. Looking at the call trace, it happens when xen-blkfront in the VM connects to xen-blkback in dom0. Every OOM in my logs happens at exactly the same place. So maybe xen-blkback in the 4.9.x kernel is more memory hungry, or there is some memory leak. This also explains why the qmemman config modification does not help.

@rootkovska can you check if it's xenwatch triggering the OOM? Also, what was the system uptime at the time of the OOM, counted in VM startups - in other words, what is the max XID of the running VMs? In my case the minimum XID at which an OOM happened was 52.
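Two one-liners that may help answer both questions (a sketch using standard dom0 tools; adjust as needed):

# Which process invoked the OOM killer, and when?
sudo journalctl -k | grep -i 'invoked oom-killer'

# Highest domain ID currently in use (a rough proxy for VM startups since boot):
sudo xl list | awk 'NR > 1 && $2+0 > max { max = $2+0 } END { print max }'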

rootkovska commented 7 years ago

The most recent OOM:

Sep 07 11:03:12 dom0 root[5474]: /etc/xen/scripts/block-snapshot: add XENBUS_PATH=backend/vbd/38/51712
Sep 07 11:03:12 dom0 root[5491]: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/38/51728
Sep 07 11:03:12 dom0 root[5516]: /etc/xen/scripts/block-snapshot: Writing backend/vbd/38/51712/node /dev/mapper/snapshot-fd01:22415761-fd01:22415856 to x
Sep 07 11:03:12 dom0 root[5518]: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/38/51744
Sep 07 11:03:12 dom0 root[5529]: /etc/xen/scripts/block-snapshot: Writing backend/vbd/38/51712/physical-device fd:4 to xenstore.
Sep 07 11:03:13 dom0 root[5542]: /etc/xen/scripts/block-snapshot: Writing backend/vbd/38/51712/hotplug-status connected to xenstore.
Sep 07 11:03:13 dom0 root[5555]: /etc/xen/scripts/block: add XENBUS_PATH=backend/vbd/38/51760
Sep 07 11:03:13 dom0 root[5625]: /etc/xen/scripts/block: Writing backend/vbd/38/51728/node /dev/loop24 to xenstore.
Sep 07 11:03:13 dom0 root[5635]: /etc/xen/scripts/block: Writing backend/vbd/38/51728/physical-device 7:18 to xenstore.
Sep 07 11:03:13 dom0 root[5642]: /etc/xen/scripts/block: Writing backend/vbd/38/51728/hotplug-status connected to xenstore.
Sep 07 11:03:13 dom0 root[5673]: /etc/xen/scripts/block: Writing backend/vbd/38/51744/node /dev/loop25 to xenstore.
Sep 07 11:03:13 dom0 root[5684]: /etc/xen/scripts/block: Writing backend/vbd/38/51744/physical-device 7:19 to xenstore.
Sep 07 11:03:13 dom0 root[5690]: /etc/xen/scripts/block: Writing backend/vbd/38/51744/hotplug-status connected to xenstore.
Sep 07 11:03:14 dom0 root[5859]: /etc/xen/scripts/block: Writing backend/vbd/38/51760/node /dev/loop3 to xenstore.
Sep 07 11:03:14 dom0 root[5863]: /etc/xen/scripts/block: Writing backend/vbd/38/51760/physical-device 7:3 to xenstore.
Sep 07 11:03:14 dom0 root[5865]: /etc/xen/scripts/block: Writing backend/vbd/38/51760/hotplug-status connected to xenstore.
Sep 07 11:03:16 dom0 kernel: xen-blkback: backend/vbd/38/51712: using 4 queues, protocol 1 (x86_64-abi) persistent grants
Sep 07 11:03:16 dom0 kernel: xen-blkback: backend/vbd/38/51728: using 4 queues, protocol 1 (x86_64-abi) persistent grants
Sep 07 11:03:16 dom0 kernel: xen-blkback: backend/vbd/38/51744: using 4 queues, protocol 1 (x86_64-abi) persistent grants
Sep 07 11:03:17 dom0 kernel: xen-blkback: backend/vbd/38/51760: using 4 queues, protocol 1 (x86_64-abi) persistent grants
Sep 07 11:03:27 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|__GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Sep 07 11:03:27 dom0 kernel: Xorg cpuset=/ mems_allowed=0
Sep 07 11:03:27 dom0 kernel: CPU: 2 PID: 7916 Comm: Xorg Tainted: G     U          4.9.45-21.pvops.qubes.x86_64 #1
Sep 07 11:03:27 dom0 kernel: Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A04 05/15/2015

The previous one:

Sep 04 13:05:37 dom0 qmemman.systemstate[1756]: dom '0' still hold more memory than have assigned (2619953152 > 2598224688)
Sep 04 13:05:39 dom0 qmemman.systemstate[1756]: dom '0' didnt react to memory request (holds 2619953152, requested balloon down to 2558453125)
Sep 04 13:06:05 dom0 kernel: audit_printk_skb: 15 callbacks suppressed
Sep 04 13:07:22 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|__GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Sep 04 13:07:22 dom0 kernel: CPU: 3 PID: 3112 Comm: Xorg Tainted: G     U  W       4.9.45-21.pvops.qubes.x86_64 #1
Sep 04 13:07:22 dom0 kernel: Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A04 05/15/2015

marmarek commented 7 years ago

I've tried to reproduce it on a testing machine, but failed. Starting a large number of VMs (simultaneously or sequentially) does not trigger it there, and neither does inducing large memory usage in some of them. I think having xl list output (can be with obscured VM names) from the crash time would be useful.
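Until it reproduces, one way to have that data available afterwards is to snapshot xl list periodically; a simple sketch (run as root in dom0, log file name arbitrary):

while true; do
    { date --iso-8601=seconds; xl list; echo; } >> /var/log/xl-list-snapshots.log
    sleep 60
done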

marmarek commented 7 years ago

@rootkovska do you have the call trace somewhere in there? But I see it's different on your system - "Xorg invoked oom-killer" vs "xenwatch invoked oom-killer". In any case, it is just after xen-blkfront connects to xen-blkback.

cyrinux commented 7 years ago

Just to say that I often have the same problem on my T450s, sometimes with many VMs running, sometimes with fewer. Happy to read the workaround for recovering my windows without rebooting. I will try this.

rootkovska commented 7 years ago

Here's some more for your debugging pleasure :)

Sep 07 11:03:27 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|__GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Sep 07 11:03:27 dom0 kernel: CPU: 2 PID: 7916 Comm: Xorg Tainted: G     U          4.9.45-21.pvops.qubes.x86_64 #1
Sep 07 11:03:27 dom0 kernel: Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A04 05/15/2015
Sep 07 11:03:27 dom0 kernel:  ffffc90003883500 ffffffff813e62c3 ffffc900038836d0 ffff88009e6fd880
Sep 07 11:03:27 dom0 kernel:  ffffc90003883578 ffffffff812414bd 0000000000000003 0000000000000000
Sep 07 11:03:27 dom0 kernel:  0000000000000000 0000000000000000 ffff88011b249b20 0000000000000001
Sep 07 11:03:27 dom0 kernel: Call Trace:
Sep 07 11:03:27 dom0 kernel:  [<ffffffff813e62c3>] dump_stack+0x63/0x90
Sep 07 11:03:27 dom0 kernel:  [<ffffffff812414bd>] dump_header+0x7b/0x1fd
Sep 07 11:03:27 dom0 kernel:  [<ffffffff817de41a>] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Sep 07 11:03:27 dom0 kernel:  [<ffffffff811b7ef9>] oom_kill_process+0x219/0x3e0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff811b83f3>] out_of_memory+0x143/0x4d0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff811bdd5d>] __alloc_pages_slowpath+0xbbd/0xc10
Sep 07 11:03:27 dom0 kernel:  [<ffffffff811be006>] __alloc_pages_nodemask+0x256/0x2c0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff81212865>] alloc_pages_current+0x95/0x140
Sep 07 11:03:27 dom0 kernel:  [<ffffffff811de5e8>] kmalloc_order+0x18/0x40
Sep 07 11:03:27 dom0 kernel:  [<ffffffff811de634>] kmalloc_order_trace+0x24/0xa0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff8121fe73>] __kmalloc+0x1f3/0x210
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc019fb9e>] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc019fbb7>] alloc_gen8_temp_bitmaps+0x47/0x80 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc019fc90>] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffff814da513>] ? xen_swiotlb_map_sg_attrs+0x1f3/0x2e0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff814da4d3>] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc01a093e>] gen8_alloc_va_range+0x23e/0x490 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc01a250d>] i915_vma_bind+0x6d/0x120 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc01a6a91>] __i915_vma_do_pin+0x261/0x480 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc0197374>] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc0197775>] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc0198ef1>] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc019a442>] i915_gem_execbuffer2+0xa2/0x1e0 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc0094b3b>] drm_ioctl+0x21b/0x4d0 [drm]
Sep 07 11:03:27 dom0 kernel:  [<ffffffffc019a3a0>] ? i915_gem_execbuffer+0x310/0x310 [i915]
Sep 07 11:03:27 dom0 kernel:  [<ffffffff8110bf4d>] ? enqueue_hrtimer+0x3d/0x90
Sep 07 11:03:27 dom0 kernel:  [<ffffffff817de41a>] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Sep 07 11:03:27 dom0 kernel:  [<ffffffff8110c8eb>] ? hrtimer_start_range_ns+0x1db/0x3f0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff8125a5d1>] do_vfs_ioctl+0xa1/0x5e0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff8114772b>] ? __audit_syscall_entry+0xab/0xf0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff8100346d>] ? syscall_trace_enter+0x1bd/0x2d0
Sep 07 11:03:27 dom0 kernel:  [<ffffffff8125ab89>] SyS_ioctl+0x79/0x90
Sep 07 11:03:27 dom0 kernel:  [<ffffffff81003c17>] do_syscall_64+0x67/0x180
Sep 07 11:03:27 dom0 kernel:  [<ffffffff817de92b>] entry_SYSCALL64_slow_path+0x25/0x25
Sep 07 11:03:27 dom0 kernel: Mem-Info:
Sep 07 11:03:27 dom0 kernel: active_anon:100522 inactive_anon:104313 isolated_anon:0
                              active_file:82715 inactive_file:95342 isolated_file:0
                              unevictable:63 dirty:1762 writeback:0 unstable:0
                              slab_reclaimable:13452 slab_unreclaimable:25091
                              mapped:40201 shmem:123734 pagetables:7518 bounce:0
                              free:9739 free_pcp:185 free_cma:0
Sep 07 11:03:27 dom0 kernel: Node 0 active_anon:402088kB inactive_anon:417252kB active_file:330860kB inactive_file:381368kB unevictable:252kB isolated(anon):0kB isolated(file):0kB mapped
Sep 07 11:03:27 dom0 kernel: Node 0 DMA free:15608kB min:28kB low:40kB high:52kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:16kB unevictable:0kB writepending:0kB pre
Sep 07 11:03:27 dom0 kernel: lowmem_reserve[]: 0 3362 3896 3896
Sep 07 11:03:27 dom0 kernel: Node 0 DMA32 free:20704kB min:6876kB low:10316kB high:13756kB active_anon:360864kB inactive_anon:334552kB active_file:242972kB inactive_file:302868kB unevict
Sep 07 11:03:27 dom0 kernel: lowmem_reserve[]: 0 0 534 534
Sep 07 11:03:27 dom0 kernel: Node 0 Normal free:2644kB min:1092kB low:1636kB high:2180kB active_anon:41224kB inactive_anon:82760kB active_file:87932kB inactive_file:78516kB unevictable:8
Sep 07 11:03:27 dom0 kernel: lowmem_reserve[]: 0 0 0 0
Sep 07 11:03:27 dom0 kernel: Node 0 DMA: 2*4kB (M) 2*8kB (UM) 4*16kB (ME) 3*32kB (UE) 1*64kB (M) 4*128kB (UME) 2*256kB (ME) 2*512kB (ME) 3*1024kB (UME) 1*2048kB (E) 2*4096kB (M) = 15608k
Sep 07 11:03:27 dom0 kernel: Node 0 DMA32: 2300*4kB (ME) 917*8kB (UME) 261*16kB (UME) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20712kB
Sep 07 11:03:27 dom0 kernel: Node 0 Normal: 99*4kB (H) 66*8kB (H) 36*16kB (H) 18*32kB (H) 7*64kB (H) 1*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2652kB
Sep 07 11:03:27 dom0 kernel: 305827 total pagecache pages
Sep 07 11:03:27 dom0 kernel: 4004 pages in swap cache
Sep 07 11:03:27 dom0 kernel: Swap cache stats: add 270892, delete 266888, find 271422/295874
Sep 07 11:03:27 dom0 kernel: Free swap  = 7619168kB
Sep 07 11:03:27 dom0 kernel: Total swap = 7864316kB
Sep 07 11:03:27 dom0 kernel: 1048575 pages RAM
Sep 07 11:03:27 dom0 kernel: 0 pages HighMem/MovableOnly
Sep 07 11:03:27 dom0 kernel: 537312 pages reserved
Sep 07 11:03:27 dom0 kernel: 0 pages hwpoisoned
Sep 07 11:03:27 dom0 kernel: Out of memory: Kill process 7916 (Xorg) score 15 or sacrifice child
Sep 07 11:03:27 dom0 kernel: Killed process 7916 (Xorg) total-vm:768604kB, anon-rss:33236kB, file-rss:7332kB, shmem-rss:103828kB
Sep 07 11:03:27 dom0 kernel: do_trap: 15 callbacks suppressed
Sep 07 11:03:27 dom0 kernel:  in libglib-2.0.so.0.4600.2[70e163e67000+137000]
Sep 07 11:03:27 dom0 org.freedesktop.Notifications[7955]: xfce4-notifyd: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Sep 07 11:03:27 dom0 org.gtk.vfs.Daemon[7955]: A connection to the bus can't be made
Sep 07 11:03:27 dom0 org.a11y.atspi.Registry[8038]: XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Sep 07 11:03:27 dom0 org.a11y.atspi.Registry[8038]: after 53893 requests (53893 known processed) with 0 events remaining.
Sep 07 11:03:28 dom0 libvirtd[1786]: End of file while reading data: Input/output error

marmarek commented 7 years ago

Since I rebooted the system two days ago, I'm unable to reproduce this. @rootkovska does it still happen to you?

andrewdavidwong commented 7 years ago

This is still happening to me on a T450s on R3.2. Very unpleasant bug.

marmarek commented 6 years ago

Happened to me today, twice.

marmarek commented 6 years ago

@andrewdavidwong can you post dom0 kernel messages from that?

andrewdavidwong commented 6 years ago

@marmarek: Sure. How can I access them?

marmarek commented 6 years ago

dmesg in dom0, or, if you have rebooted, journalctl (see its various options for filtering by time).
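For example (standard journalctl usage; the time window is a placeholder to fill in):

# Kernel messages from the previous boot, narrowed to the window around the OOM:
sudo journalctl -k -b -1 --since "YYYY-MM-DD HH:MM" --until "YYYY-MM-DD HH:MM"

# Or just grep for the OOM killer and show some context:
sudo journalctl -k | grep -A 40 'invoked oom-killer'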

marmarek commented 6 years ago

Also, I'm trying another workaround: changing oom_score_adj for the X server, so that even if an OOM happens, the kernel will not choose the X server to kill.

echo -100 | sudo tee /proc/`pidof X`/oom_score_adj

This needs to be done after each X server startup. I can't yet confirm whether it's effective enough.
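One way to automate that, assuming lightdm is the display manager (a sketch; the script name and path are made up): lightdm's display-setup-script hook runs as root each time the X server comes up, so a small helper there can adjust the score.

#!/bin/sh
# Hypothetical /usr/local/bin/x-oom-adjust.sh, hooked up via
# /etc/lightdm/lightdm.conf.d/oom.conf:
#   [Seat:*]
#   display-setup-script=/usr/local/bin/x-oom-adjust.sh
for pid in $(pidof Xorg X); do
    echo -100 > "/proc/$pid/oom_score_adj"
done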

andrewdavidwong commented 6 years ago

Nov 03 21:58:33 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|__GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Nov 03 21:58:33 dom0 kernel: Xorg cpuset=/ mems_allowed=0
Nov 03 21:58:33 dom0 kernel: CPU: 0 PID: 3483 Comm: Xorg Tainted: G     U          4.9.56-21.pvops.qubes.x86_64 #1
Nov 03 21:58:33 dom0 kernel: Hardware name: LENOVO (...)
Nov 03 21:58:33 dom0 kernel:  ffffc90008837500 ffffffff813e8e03 ffffc900088376d0 ffff8801008d3b00
Nov 03 21:58:33 dom0 kernel:  ffffc90008837578 ffffffff81241efd 0000000000000003 0000000000000000
Nov 03 21:58:33 dom0 kernel:  0000000000000000 0000000000000000 ffff880135659b20 0000000000000001
Nov 03 21:58:33 dom0 kernel: Call Trace:
Nov 03 21:58:33 dom0 kernel:  [<ffffffff813e8e03>] dump_stack+0x63/0x90
Nov 03 21:58:33 dom0 kernel:  [<ffffffff81241efd>] dump_header+0x7b/0x1fd
Nov 03 21:58:33 dom0 kernel:  [<ffffffff817e0baa>] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Nov 03 21:58:33 dom0 kernel:  [<ffffffff811b8939>] oom_kill_process+0x219/0x3e0
Nov 03 21:58:33 dom0 kernel:  [<ffffffff811b8e33>] out_of_memory+0x143/0x4d0
Nov 03 21:58:33 dom0 kernel:  [<ffffffff811be78a>] __alloc_pages_slowpath+0xbaa/0xbf0
Nov 03 21:58:33 dom0 kernel:  [<ffffffff811bea26>] __alloc_pages_nodemask+0x256/0x2c0
Nov 03 21:58:33 dom0 kernel:  [<ffffffff81213295>] alloc_pages_current+0x95/0x140
Nov 03 21:58:33 dom0 kernel:  [<ffffffff811deff8>] kmalloc_order+0x18/0x40
Nov 03 21:58:33 dom0 kernel:  [<ffffffff811df044>] kmalloc_order_trace+0x24/0xa0
Nov 03 21:58:33 dom0 kernel:  [<ffffffff812208a3>] __kmalloc+0x1f3/0x210
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0189b9e>] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0189bb7>] alloc_gen8_temp_bitmaps+0x47/0x80 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0189c90>] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffff811b3e4b>] ? find_lock_entry+0x5b/0x120
Nov 03 21:58:33 dom0 kernel:  [<ffffffff8121ecad>] ? __slab_alloc+0x2d/0x40
Nov 03 21:58:33 dom0 kernel:  [<ffffffff814dd073>] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc018a93e>] gen8_alloc_va_range+0x23e/0x490 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc018c50d>] i915_vma_bind+0x6d/0x120 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0190a91>] __i915_vma_do_pin+0x261/0x480 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0181374>] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0181775>] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0182ef1>] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffff813ea72a>] ? idr_alloc+0x9a/0x110
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0184442>] i915_gem_execbuffer2+0xa2/0x1e0 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc0051b3b>] drm_ioctl+0x21b/0x4d0 [drm]
Nov 03 21:58:33 dom0 kernel:  [<ffffffffc01843a0>] ? i915_gem_execbuffer+0x310/0x310 [i915]
Nov 03 21:58:33 dom0 kernel:  [<ffffffff813f23d4>] ? timerqueue_del+0x24/0x70
Nov 03 21:58:33 dom0 kernel:  [<ffffffff8110c44c>] ? __remove_hrtimer+0x3c/0x90
Nov 03 21:58:33 dom0 kernel:  [<ffffffff8125b051>] do_vfs_ioctl+0xa1/0x5e0
Nov 03 21:58:33 dom0 kernel:  [<ffffffff8110dcd2>] ? do_setitimer+0x142/0x230
Nov 03 21:58:33 dom0 kernel:  [<ffffffff81147a0b>] ? __audit_syscall_entry+0xab/0xf0
Nov 03 21:58:33 dom0 kernel:  [<ffffffff8100346d>] ? syscall_trace_enter+0x1bd/0x2d0
Nov 03 21:58:33 dom0 kernel:  [<ffffffff8125b609>] SyS_ioctl+0x79/0x90
Nov 03 21:58:33 dom0 kernel:  [<ffffffff81003c17>] do_syscall_64+0x67/0x180
Nov 03 21:58:33 dom0 kernel:  [<ffffffff817e10eb>] entry_SYSCALL64_slow_path+0x25/0x25
Nov 03 21:58:33 dom0 kernel: Mem-Info:
Nov 03 21:58:33 dom0 kernel: active_anon:256076 inactive_anon:93796 isolated_anon:0
                              active_file:335245 inactive_file:126155 isolated_file:0
                              unevictable:170 dirty:188 writeback:0 unstable:0
                              slab_reclaimable:20140 slab_unreclaimable:31779
                              mapped:65136 shmem:145085 pagetables:14847 bounce:0
                              free:9657 free_pcp:312 free_cma:0
Nov 03 21:58:33 dom0 kernel: Node 0 active_anon:1024304kB inactive_anon:375184kB active_file:1340980kB inactive_file:504620kB unevictable:680kB isolated(anon):0kB isolated(file):0kB mapped:260544kB dirty:752kB writeback:0kB shmem:580340kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB pages_scanned:32 all_unreclaimable? no
Nov 03 21:58:33 dom0 kernel: Node 0 DMA free:15600kB min:28kB low:40kB high:52kB active_anon:68kB inactive_anon:0kB active_file:8kB inactive_file:156kB unevictable:0kB writepending:0kB present:15984kB managed:15900kB mlocked:0kB slab_reclaimable:48kB slab_unreclaimable:20kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Nov 03 21:58:33 dom0 kernel: lowmem_reserve[]: 0 2939 3896 3896
Nov 03 21:58:33 dom0 kernel: Node 0 DMA32 free:20104kB min:6012kB low:9020kB high:12028kB active_anon:644904kB inactive_anon:302184kB active_file:1136472kB inactive_file:430896kB unevictable:512kB writepending:536kB present:3131348kB managed:3032808kB mlocked:512kB slab_reclaimable:55488kB slab_unreclaimable:67544kB kernel_stack:7536kB pagetables:49152kB bounce:0kB free_pcp:864kB local_pcp:388kB free_cma:0kB
Nov 03 21:58:33 dom0 kernel: lowmem_reserve[]: 0 0 956 956
Nov 03 21:58:33 dom0 kernel: Node 0 Normal free:2924kB min:1956kB low:2932kB high:3908kB active_anon:379332kB inactive_anon:73000kB active_file:204500kB inactive_file:73572kB unevictable:168kB writepending:40kB present:1046968kB managed:979848kB mlocked:168kB slab_reclaimable:25024kB slab_unreclaimable:59552kB kernel_stack:6064kB pagetables:10236kB bounce:0kB free_pcp:384kB local_pcp:20kB free_cma:0kB
Nov 03 21:58:33 dom0 kernel: lowmem_reserve[]: 0 0 0 0
Nov 03 21:58:33 dom0 kernel: Node 0 DMA: 6*4kB (UM) 1*8kB (U) 3*16kB (UMH) 3*32kB (UH) 3*64kB (UH) 1*128kB (U) 3*256kB (UMH) 2*512kB (MH) 3*1024kB (UMH) 1*2048kB (E) 2*4096kB (M) = 15600kB
Nov 03 21:58:33 dom0 kernel: Node 0 DMA32: 1060*4kB (MEH) 975*8kB (UMEH) 462*16kB (UMEH) 15*32kB (H) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20040kB
Nov 03 21:58:33 dom0 kernel: Node 0 Normal: 523*4kB (UMEH) 38*8kB (UMEH) 16*16kB (UH) 6*32kB (H) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2844kB
Nov 03 21:58:33 dom0 kernel: 607006 total pagecache pages
Nov 03 21:58:33 dom0 kernel: 512 pages in swap cache
Nov 03 21:58:33 dom0 kernel: Swap cache stats: add 14293, delete 13781, find 151909/152125
Nov 03 21:58:33 dom0 kernel: Free swap  = 10429640kB
Nov 03 21:58:33 dom0 kernel: Total swap = 10485756kB
Nov 03 21:58:33 dom0 kernel: 1048575 pages RAM
Nov 03 21:58:33 dom0 kernel: 0 pages HighMem/MovableOnly
Nov 03 21:58:33 dom0 kernel: 41436 pages reserved
Nov 03 21:58:33 dom0 kernel: 0 pages hwpoisoned
Nov 03 21:58:33 dom0 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Nov 03 21:58:33 dom0 kernel: [ 1010]     0  1010    19717     5357      40       3       38             0 systemd-journal
(...)
Nov 03 21:58:33 dom0 kernel: Out of memory: Kill process 3483 (Xorg) score 15 or sacrifice child
Nov 03 21:58:33 dom0 kernel: Killed process 3483 (Xorg) total-vm:594472kB, anon-rss:57668kB, file-rss:8092kB, shmem-rss:167796kB

HW42 commented 6 years ago

FWIW, I have had OOM crashes of dom0 about three times or so, but only very seldom (not more than once per month). The system had always been running for a longer time and there was quite some memory pressure overall.


Do you know when you observed this for the first time?


The logs are interesting. This does not seem to be a classic OOM situation: there is free swap space and dom0 still has about 4 GiB of total memory. Compared to the total memory, Xorg also doesn't use that much.

And indeed all logs say __GFP_COMP, and the backtrace is somewhere in the kernel (i915 or xen-blkback). So I guess the kernel runs out of contiguous memory (according to this article, __GFP_COMP allocations are always physically contiguous).
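If that theory is right, dom0 should be short on order-3 (8-page, 32 KiB) and larger free blocks just before the OOM. That can be watched with the standard /proc interface (sketch only):

# Each row lists free block counts per order (0, 1, 2, ...) for one memory zone;
# an order-3 allocation needs a non-zero count in the fourth count column or later.
watch -n 5 cat /proc/buddyinfo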


Regarding further debugging: it's probably a good idea to enable more logging in qmemman so we can see when it triggers ballooning. For example, create /etc/systemd/system/qubes-qmemman.service.d/debug.conf with:

[Service]
ExecStart=
ExecStart=/usr/lib/qubes/qmemman_daemon.py -d
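After creating the drop-in, something along these lines should activate it (standard systemd workflow):

sudo systemctl daemon-reload
sudo systemctl restart qubes-qmemman
sudo journalctl -fu qubes-qmemman   # follow the debug output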
marmarek commented 6 years ago

If that's about contiguous memory, and in my case about the xen-blkback module, does it make sense to limit the max_ring_page_order parameter? It will have performance implications though...
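If someone wants to experiment with that, it would presumably be a modprobe option for the dom0 kernel, e.g. forcing single-page rings (untested sketch; takes effect after the module is reloaded or dom0 is rebooted):

echo 'options xen_blkback max_ring_page_order=0' | sudo tee /etc/modprobe.d/xen-blkback.conf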

HW42 commented 6 years ago

If that's about contiguous memory, and in my case about the xen-blkback module, does it make sense to limit the max_ring_page_order parameter? It will have performance implications though...

Not sure if this will help, since it seems kmalloc() uses compound pages for every "bigger" allocation (with our config, > 8192 B), so I guess there are many places using this.

HW42 commented 6 years ago

The reason to not override /var/run/shm.id at X server startup, is that it may belong another (still running!) X server instance.

What about checking if another X server is running at startup?

marmarek commented 6 years ago

What about checking if another X server is running at startup?

Any idea how to do that in an easy way? From what I know, the X server does not have a PID file, so it would require scraping over /proc, or something like that. Also remember the code is running in the constructor of an LD_PRELOADed library, so there is very limited info about the current X server instance.
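For illustration only, the kind of check being discussed, written as shell rather than as the LD_PRELOADed constructor code (starting_pid is a hypothetical input):

starting_pid="$1"   # PID of the X server instance that is just starting
for pid in $(pidof Xorg X 2>/dev/null); do
    if [ "$pid" != "$starting_pid" ]; then
        echo "another X server (PID $pid) is still running; keep /var/run/shm.id" >&2
        exit 1
    fi
done
echo "no other X server found; safe to replace /var/run/shm.id"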

marmarek commented 6 years ago

Adjusting oom_score_adj for the X server made the kernel kill qfile-daemon-dvm and systemd-journald instead (the problem happened when starting a DispVM). Much less severe consequences :)

Here is full log of the incident: https://gist.github.com/marmarek/1cea76a2649c18b12187b09cc6ef7c7e

marmarek commented 6 years ago

Just happened again; this time it killed a bunch of xfce4-related processes, including xfdesktop, the xfce4-panel children (all panel plugins), the panel itself, and xfwm4. This means it required a lot of memory. The backtrace is the same. If another log could bring something new, just say so. Maybe, if really that much dom0 memory is needed at VM startup, we should calculate it and make sure that dom0 does have that much free memory before starting a VM? @HW42
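As a crude approximation of that idea (sketch only; the 300 MB threshold is made up, and plain MemAvailable would not capture the fragmentation aspect discussed above):

# Refuse to start a VM unless dom0 reports a minimum of available memory.
avail_kb=$(awk '/MemAvailable/ { print $2 }' /proc/meminfo)
if [ "$avail_kb" -lt 300000 ]; then
    echo "dom0 low on memory (${avail_kb} kB available), not starting VM" >&2
    exit 1
fi
qvm-start "$1"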

marmarek commented 6 years ago

Or, maybe this is also about kernel config? Something about how memory is allocated (preference for contiguous areas? how much memory is kept free (swappiness?)?).

HW42 commented 6 years ago

Just happened again; this time it killed a bunch of xfce4-related processes, including xfdesktop, the xfce4-panel children (all panel plugins), the panel itself, and xfwm4. This means it required a lot of memory. The backtrace is the same. If another log could bring something new, just say so.

Or maybe the killed processes didn't use enough contiguous memory. But not sure if that's realistic.

Maybe, if really that much dom0 memory is needed at VM startup, we should calculate it and make sure that dom0 does have that much free memory before starting a VM? @HW42

Does this only happen when you start a new VM?

Or, maybe this is also about kernel config? Something about how memory is allocated (preference for contiguous areas? how much memory is kept free (swappiness?)?).

I guess this is more likely. Especially since this problem appeared recent-ish. Maybe dom0 linux 4.4 -> 4.9? Do we know when this was observed the first time?

marmarek commented 6 years ago

On Wed, Nov 15, 2017 at 06:59:39AM +0000, HW42 wrote:

Maybe, if really that much dom0 memory is needed at VM startup, we should calculate it and make sure that dom0 does have that much free memory before starting a VM? @HW42

Does this only happen when you start a new VM?

Yes, at least for me.

Or, maybe this is also about kernel config? Something about how memory is allocated (preference for contiguous areas? how much memory is kept free (swappiness?)?).

I guess this is more likely. Especially since this problem appeared recent-ish. Maybe dom0 linux 4.4 -> 4.9? Do we know when this was observed the first time?

I'm almost sure it started after the upgrade to 4.9 in dom0. But it might also have started happening after the VM kernel upgrade - I did that at a very similar time.


andrewdavidwong commented 6 years ago

(Possible) qubes-users thread on this bug: https://groups.google.com/d/topic/qubes-users/h070Y5NXJFo/discussion

3hhh commented 6 years ago

I'm very likely also affected by this one on 4.0rc2.

I can re-produce it this way:

  1. Start a few VMs (2-3 apart from the sys- VMs I'd say). They may idle.
  2. In dom0: xscreensaver-command -lock (assuming the daemon is running)
  3. Go away for 30-60 mins.
  4. Notice that the X server restarted (your screen is unlocked or your dm login is shown depending on your config).

This happens 1-2 times per day on my T530 with 16 GB RAM, if I use it. I more rarely get the crashes mentioned in the qubes-users thread (not 100% sure whether it's the same issue).

4.9.45-21 dom0 Kernel.

Side note: free -mh in dom0 only shows 4 GB RAM, but I guess that's because Xen allocated the remaining 12 GB to the VMs?
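For reference, the split can be inspected from dom0 with xl, which shows the hypervisor's view of the whole machine (standard commands):

sudo xl info | grep -E 'total_memory|free_memory'   # whole-host view, in MiB
sudo xl list                                        # per-domain allocation, including dom0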

3hhh commented 6 years ago

P.S.: I didn't find anything interesting in sudo journalctl -b0 so far apart from maybe the following which occurs every few hours:

Nov 29 18:32:16 dom0 qmemman.systemstate[2054]: Xen free = 1667727686 too small for satisfy assignments! assigned_but_unused=1680891536, domdict={'13': {'id': '13', 'last_target': 150994944, 'memory_current': 150994944, 'mem_used': None, 'memory_actual': 150994944, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 150994944}, '14': {'id': '14', 'last_target': 928131916, 'memory_current': 945352704, 'mem_used': 332644352, 'memory_actual': 945352704, 'no_progress': True, 'slow_memset_react': False, 'memory_maximum': 4194304000}, '3': {'id': '3', 'last_target': 1108653483, 'memory_current': 1070624768, 'mem_used': 269041664, 'memory_actual': 1070625731, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 4194304000}, '1': {'id': '1', 'last_target': 419430400, 'memory_current': 417189888, 'mem_used': None, 'memory_actual': 419430400, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 419430400}, '0': {'id': '0', 'last_target': 5772307649, 'memory_current': 4172697600, 'mem_used': 1322520576, 'memory_actual': 5848884991, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 17100177408}, '4': {'id': '4', 'last_target': 1102621732, 'memory_current': 1064734720, 'mem_used': 267378688, 'memory_actual': 1064737563, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 4194304000}, '5': {'id': '5', 'last_target': 1048576000, 'memory_current': 1048334336, 'mem_used': 310456320, 'memory_actual': 1048576000, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 1048576000}, '15': {'id': '15', 'last_target': 150994944, 'memory_current': 150994944, 'mem_used': None, 'memory_actual': 150994944, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 150994944}, '2': {'id': '2', 'last_target': 419430400, 'memory_current': 416546816, 'mem_used': 204242944, 'memory_actual': 419430400, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 419430400}, '9': {'id': '9', 'last_target': 150994944, 'memory_current': 150994944, 'mem_used': None, 'memory_actual': 150994944, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 150994944}, '11': {'id': '11', 'last_target': 150994944, 'memory_current': 150994944, 'mem_used': None, 'memory_actual': 150994944, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 150994944}, '6': {'id': '6', 'last_target': 150994944, 'memory_current': 150994944, 'mem_used': None, 'memory_actual': 150994944, 'no_progress': False, 'slow_memset_react': False, 'memory_maximum': 150994944}, '8': {'id': '8', 'last_target': 1511855526, 'memory_current': 1529077760, 'mem_used': 349790208, 'memory_actual': 1529077760, 'no_progress': True, 'slow_memset_react': False, 'memory_maximum': 4194304000}, '12': {'id': '12', 'last_target': 2009606869, 'memory_current': 2026827776, 'mem_used': 504201216, 'memory_actual': 2026827776, 'no_progress': True, 'slow_memset_react': False, 'memory_maximum': 4194304000}, '10': {'id': '10', 'last_target': 1479561344, 'memory_current': 1496784896, 'mem_used': 341782528, 'memory_actual': 1496784896, 'no_progress': True, 'slow_memset_react': False, 'memory_maximum': 4194304000}}

In particular I didn't see any of the oom-killer entries from above.

Polygonbugs commented 6 years ago

I reproduced the OOM killer on my first attempt but failed to reproduce it on my second attempt... I trigger this issue by starting a new ProxyVM (memory balancing enabled) even though dom0 has already stated "not sufficient memory" for that VM. It occurred while system memory was almost exhausted and a Windows 7 HVM (memory balancing disabled, 4 GiB attached) was updating.

Both Xorg and xenwatch invoked the OOM killer in this case, but the first one was xenwatch. Below is a short summary of my first attempt, with the OOM killer activated multiple (9) times consecutively.

tl;dr

Feb 12 16:45:30 dom0 kernel: xenwatch invoked oom-killer: gfp_mask=0x240c0c0(GFP_KERNEL|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0 Feb 12 16:45:30 dom0 kernel: xenwatch cpuset=/ mems_allowed=0 Feb 12 16:45:30 dom0 kernel: CPU: 6 PID: 73 Comm: xenwatch Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1 Feb 12 16:45:30 dom0 kernel: ffffc9000130b948 ffffffff813e8e03 ffffc9000130bb18 ffff88009dc78000 Feb 12 16:45:30 dom0 kernel: ffffc9000130b9c0 ffffffff81241efd 0000000000000003 0000000000000000 Feb 12 16:45:30 dom0 kernel: 0000000000000000 0000000000000000 ffff880134799b20 0000000000000001 Feb 12 16:45:30 dom0 kernel: Call Trace: Feb 12 16:45:30 dom0 kernel: [] dump_stack+0x63/0x90 Feb 12 16:45:30 dom0 kernel: [] dump_header+0x7b/0x1fd Feb 12 16:45:30 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20 Feb 12 16:45:30 dom0 kernel: [] oom_kill_process+0x219/0x3e0 Feb 12 16:45:30 dom0 kernel: [] out_of_memory+0x143/0x4d0 Feb 12 16:45:30 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0 Feb 12 16:45:30 dom0 kernel: [] __alloc_pages_nodemask+0x256/0x2c0 Feb 12 16:45:30 dom0 kernel: [] alloc_pages_current+0x95/0x140 Feb 12 16:45:30 dom0 kernel: [] kmalloc_order+0x18/0x40 Feb 12 16:45:30 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0 Feb 12 16:45:30 dom0 kernel: [] ? read_per_ring_refs+0x30f/0x400 [xen_blkback] Feb 12 16:45:30 dom0 kernel: [] read_per_ring_refs+0x114/0x400 [xen_blkback] Feb 12 16:45:30 dom0 kernel: [] ? snprintf+0x49/0x60 Feb 12 16:45:30 dom0 kernel: [] connect_ring.isra.7+0x351/0x4c0 [xen_blkback] Feb 12 16:45:30 dom0 kernel: [] frontend_changed+0xda/0x140 [xen_blkback] Feb 12 16:45:30 dom0 kernel: [] xenbus_otherend_changed+0x103/0x110 Feb 12 16:45:30 dom0 kernel: [] ? register_xenbus_watch+0xf0/0xf0 Feb 12 16:45:30 dom0 kernel: [] frontend_changed+0x10/0x20 Feb 12 16:45:30 dom0 kernel: [] xenwatch_thread+0x77/0x120 Feb 12 16:45:30 dom0 kernel: [] ? wake_atomic_t_function+0x70/0x70 Feb 12 16:45:30 dom0 kernel: [] kthread+0xe6/0x100 Feb 12 16:45:30 dom0 kernel: [] ? kthread_park+0x60/0x60 Feb 12 16:45:30 dom0 kernel: [] ret_from_fork+0x25/0x30 Feb 12 16:45:30 dom0 kernel: Mem-Info: Feb 12 16:45:30 dom0 kernel: active_anon:87238 inactive_anon:142990 isolated_anon:0 active_file:25595 inactive_file:25641 isolated_file:0 unevictable:74 dirty:16 writeback:0 unstable:0 slab_reclaimable:13731 slab_unreclaimable:20190 mapped:51329 shmem:117301 pagetables:6974 bounce:0 free:19266 free_pcp:132 free_cma:0


Feb 12 16:45:33 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0 Feb 12 16:45:33 dom0 kernel: Xorg cpuset=/ mems_allowed=0 Feb 12 16:45:33 dom0 kernel: CPU: 6 PID: 12008 Comm: Xorg Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1 Feb 12 16:45:33 dom0 kernel: ffffc90008ee7500 ffffffff813e8e03 ffffc90008ee76d0 ffff880002911e80 Feb 12 16:45:33 dom0 kernel: ffffc90008ee7578 ffffffff81241efd 0000000000000003 0000000000000000 Feb 12 16:45:33 dom0 kernel: 0000000000000000 0000000000000000 ffff880134799b20 0000000000000002 Feb 12 16:45:33 dom0 kernel: Call Trace: Feb 12 16:45:33 dom0 kernel: [] dump_stack+0x63/0x90 Feb 12 16:45:33 dom0 kernel: [] dump_header+0x7b/0x1fd Feb 12 16:45:33 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20 Feb 12 16:45:33 dom0 kernel: [] oom_kill_process+0x219/0x3e0 Feb 12 16:45:33 dom0 kernel: [] out_of_memory+0x143/0x4d0 Feb 12 16:45:33 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0 Feb 12 16:45:33 dom0 kernel: [] alloc_pages_nodemask+0x256/0x2c0 Feb 12 16:45:33 dom0 kernel: [] alloc_pages_current+0x95/0x140 Feb 12 16:45:33 dom0 kernel: [] kmalloc_order+0x18/0x40 Feb 12 16:45:33 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0 Feb 12 16:45:33 dom0 kernel: [] kmalloc+0x1f3/0x210 Feb 12 16:45:33 dom0 kernel: [] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915] Feb 12 16:45:33 dom0 kernel: [] alloc_gen8_temp_bitmaps+0x47/0x80 [i915] Feb 12 16:45:33 dom0 kernel: [] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915] Feb 12 16:45:33 dom0 kernel: [] ? find_lock_entry+0x5b/0x120 Feb 12 16:45:33 dom0 kernel: [] ? wake_up_common+0x52/0x90 Feb 12 16:45:33 dom0 kernel: [] ? wake_up+0x44/0x50 Feb 12 16:45:33 dom0 kernel: [] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0 Feb 12 16:45:33 dom0 kernel: [] gen8_alloc_va_range+0x23e/0x490 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_vma_bind+0x6d/0x120 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_vma_do_pin+0x261/0x480 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915] Feb 12 16:45:33 dom0 kernel: [] ? idr_mark_full+0x5a/0x70 Feb 12 16:45:33 dom0 kernel: [] ? sigqueue_free.part.18+0x33/0x40 Feb 12 16:45:33 dom0 kernel: [] ? xen_hypercall_xen_version+0xa/0x20 Feb 12 16:45:33 dom0 kernel: [] i915_gem_execbuffer2+0xa2/0x1e0 [i915] Feb 12 16:45:33 dom0 kernel: [] drm_ioctl+0x21b/0x4d0 [drm] Feb 12 16:45:33 dom0 kernel: [] ? i915_gem_execbuffer+0x310/0x310 [i915] Feb 12 16:45:33 dom0 kernel: [] ? pick_next_entity+0x80/0x130 Feb 12 16:45:33 dom0 kernel: [] ? fpurestore_sig+0x9a/0x540 Feb 12 16:45:33 dom0 kernel: [] do_vfs_ioctl+0xa1/0x5e0 Feb 12 16:45:33 dom0 kernel: [] ? recalc_sigpending+0x1b/0x50 Feb 12 16:45:33 dom0 kernel: [] ? __audit_syscall_entry+0xab/0xf0 Feb 12 16:45:33 dom0 kernel: [] ? syscall_trace_enter+0x1bd/0x2d0 Feb 12 16:45:33 dom0 kernel: [] SyS_ioctl+0x79/0x90 Feb 12 16:45:33 dom0 kernel: [] do_syscall_64+0x67/0x180 Feb 12 16:45:33 dom0 kernel: [] entry_SYSCALL64_slow_path+0x25/0x25 Feb 12 16:45:33 dom0 kernel: Mem-Info: Feb 12 16:45:33 dom0 kernel: active_anon:21338 inactive_anon:93444 isolated_anon:0 active_file:34120 inactive_file:49084 isolated_file:0 unevictable:41 dirty:4293 writeback:8 unstable:0 slab_reclaimable:13747 slab_unreclaimable:20612 mapped:23853 shmem:83335 pagetables:2770 bounce:0 free:9954 free_pcp:357 free_cma:0


Feb 12 16:45:33 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0 Feb 12 16:45:33 dom0 kernel: Xorg cpuset=/ mems_allowed=0 Feb 12 16:45:33 dom0 kernel: CPU: 4 PID: 12008 Comm: Xorg Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1 Feb 12 16:45:33 dom0 kernel: ffffc90008ee7500 ffffffff813e8e03 ffffc90008ee76d0 ffff8801359c5b80 Feb 12 16:45:33 dom0 kernel: ffffc90008ee7578 ffffffff81241efd 0000000000000003 0000000000000000 Feb 12 16:45:33 dom0 kernel: 0000000000000000 0000000000000000 ffff880134799b20 0000000000000003 Feb 12 16:45:33 dom0 kernel: Call Trace: Feb 12 16:45:33 dom0 kernel: [] dump_stack+0x63/0x90 Feb 12 16:45:33 dom0 kernel: [] dump_header+0x7b/0x1fd Feb 12 16:45:33 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20 Feb 12 16:45:33 dom0 kernel: [] oom_kill_process+0x219/0x3e0 Feb 12 16:45:33 dom0 kernel: [] out_of_memory+0x143/0x4d0 Feb 12 16:45:33 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0 Feb 12 16:45:33 dom0 kernel: [] alloc_pages_nodemask+0x256/0x2c0 Feb 12 16:45:33 dom0 kernel: [] alloc_pages_current+0x95/0x140 Feb 12 16:45:33 dom0 kernel: [] kmalloc_order+0x18/0x40 Feb 12 16:45:33 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0 Feb 12 16:45:33 dom0 kernel: [] kmalloc+0x1f3/0x210 Feb 12 16:45:33 dom0 kernel: [] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915] Feb 12 16:45:33 dom0 kernel: [] alloc_gen8_temp_bitmaps+0x47/0x80 [i915] Feb 12 16:45:33 dom0 kernel: [] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915] Feb 12 16:45:33 dom0 kernel: [] ? find_lock_entry+0x5b/0x120 Feb 12 16:45:33 dom0 kernel: [] ? wake_up_common+0x52/0x90 Feb 12 16:45:33 dom0 kernel: [] ? wake_up+0x44/0x50 Feb 12 16:45:33 dom0 kernel: [] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0 Feb 12 16:45:33 dom0 kernel: [] gen8_alloc_va_range+0x23e/0x490 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_vma_bind+0x6d/0x120 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_vma_do_pin+0x261/0x480 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915] Feb 12 16:45:33 dom0 kernel: [] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915] Feb 12 16:45:33 dom0 kernel: [] ? idr_mark_full+0x5a/0x70 Feb 12 16:45:33 dom0 kernel: [] ? sigqueue_free.part.18+0x33/0x40 Feb 12 16:45:33 dom0 kernel: [] ? xen_hypercall_xen_version+0xa/0x20 Feb 12 16:45:33 dom0 kernel: [] i915_gem_execbuffer2+0xa2/0x1e0 [i915] Feb 12 16:45:33 dom0 kernel: [] drm_ioctl+0x21b/0x4d0 [drm] Feb 12 16:45:33 dom0 kernel: [] ? i915_gem_execbuffer+0x310/0x310 [i915] Feb 12 16:45:33 dom0 kernel: [] ? pick_next_entity+0x80/0x130 Feb 12 16:45:33 dom0 kernel: [] ? fpurestore_sig+0x9a/0x540 Feb 12 16:45:33 dom0 kernel: [] do_vfs_ioctl+0xa1/0x5e0 Feb 12 16:45:33 dom0 kernel: [] ? recalc_sigpending+0x1b/0x50 Feb 12 16:45:33 dom0 kernel: [] ? __audit_syscall_entry+0xab/0xf0 Feb 12 16:45:33 dom0 kernel: [] ? syscall_trace_enter+0x1bd/0x2d0 Feb 12 16:45:33 dom0 kernel: [] SyS_ioctl+0x79/0x90 Feb 12 16:45:33 dom0 kernel: [] do_syscall_64+0x67/0x180 Feb 12 16:45:33 dom0 kernel: [] entry_SYSCALL64_slow_path+0x25/0x25 Feb 12 16:45:33 dom0 kernel: Mem-Info: Feb 12 16:45:33 dom0 kernel: active_anon:17637 inactive_anon:90605 isolated_anon:0 active_file:34343 inactive_file:49456 isolated_file:0 unevictable:41 dirty:4358 writeback:7 unstable:0 slab_reclaimable:13743 slab_unreclaimable:20612 mapped:14942 shmem:83335 pagetables:2588 bounce:0 free:15982 free_pcp:509 free_cma:0


eb 12 16:45:46 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0 Feb 12 16:45:46 dom0 kernel: Xorg cpuset=/ mems_allowed=0 Feb 12 16:45:46 dom0 kernel: CPU: 2 PID: 12116 Comm: Xorg Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1 Feb 12 16:45:46 dom0 kernel: ffffc900015d7500 ffffffff813e8e03 ffffc900015d76d0 ffff880080a2db80 Feb 12 16:45:46 dom0 kernel: ffffc900015d7578 ffffffff81241efd 0000000000000003 0000000000000000 Feb 12 16:45:46 dom0 kernel: 0000000000000000 0000000000000000 ffff880134799b20 0000000000000001 Feb 12 16:45:46 dom0 kernel: Call Trace: Feb 12 16:45:46 dom0 kernel: [] dump_stack+0x63/0x90 Feb 12 16:45:46 dom0 kernel: [] dump_header+0x7b/0x1fd Feb 12 16:45:46 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20 Feb 12 16:45:46 dom0 kernel: [] oom_kill_process+0x219/0x3e0 Feb 12 16:45:46 dom0 kernel: [] out_of_memory+0x143/0x4d0 Feb 12 16:45:46 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0 Feb 12 16:45:46 dom0 kernel: [] __alloc_pages_nodemask+0x256/0x2c0 Feb 12 16:45:46 dom0 kernel: [] alloc_pages_current+0x95/0x140 Feb 12 16:45:46 dom0 kernel: [] kmalloc_order+0x18/0x40 Feb 12 16:45:46 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0 Feb 12 16:45:46 dom0 kernel: [] kmalloc+0x1f3/0x210 Feb 12 16:45:46 dom0 kernel: [] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915] Feb 12 16:45:46 dom0 kernel: [] alloc_gen8_temp_bitmaps+0x47/0x80 [i915] Feb 12 16:45:46 dom0 kernel: [] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915] Feb 12 16:45:46 dom0 kernel: [] ? find_lock_entry+0x5b/0x120 Feb 12 16:45:46 dom0 kernel: [] ? wake_up_common+0x52/0x90 Feb 12 16:45:46 dom0 kernel: [] ? wake_up+0x44/0x50 Feb 12 16:45:46 dom0 kernel: [] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0 Feb 12 16:45:46 dom0 kernel: [] gen8_alloc_va_range+0x23e/0x490 [i915] Feb 12 16:45:46 dom0 kernel: [] i915_vma_bind+0x6d/0x120 [i915] Feb 12 16:45:46 dom0 kernel: [] i915_vma_do_pin+0x261/0x480 [i915] Feb 12 16:45:46 dom0 kernel: [] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915] Feb 12 16:45:46 dom0 kernel: [] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915] Feb 12 16:45:46 dom0 kernel: [] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915] Feb 12 16:45:46 dom0 kernel: [] ? finish_task_switch+0x82/0x210 Feb 12 16:45:46 dom0 kernel: [] ? schedule+0x36/0x80 Feb 12 16:45:46 dom0 kernel: [] ? futex_wait_queue_me+0xd3/0x120 Feb 12 16:45:46 dom0 kernel: [] i915_gem_execbuffer2+0xa2/0x1e0 [i915] Feb 12 16:45:46 dom0 kernel: [] drm_ioctl+0x21b/0x4d0 [drm] Feb 12 16:45:46 dom0 kernel: [] ? i915_gem_execbuffer+0x310/0x310 [i915] Feb 12 16:45:46 dom0 kernel: [] ? do_futex+0x1f1/0x530 Feb 12 16:45:46 dom0 kernel: [] do_vfs_ioctl+0xa1/0x5e0 Feb 12 16:45:46 dom0 kernel: [] ? __audit_syscall_entry+0xab/0xf0 Feb 12 16:45:46 dom0 kernel: [] ? syscall_trace_enter+0x1bd/0x2d0 Feb 12 16:45:46 dom0 kernel: [] SyS_ioctl+0x79/0x90 Feb 12 16:45:46 dom0 kernel: [] do_syscall_64+0x67/0x180 Feb 12 16:45:46 dom0 kernel: [] entry_SYSCALL64_slow_path+0x25/0x25 Feb 12 16:45:46 dom0 kernel: Mem-Info: Feb 12 16:45:46 dom0 kernel: active_anon:15154 inactive_anon:91882 isolated_anon:0 active_file:42747 inactive_file:43579 isolated_file:0 unevictable:41 dirty:790 writeback:35 unstable:0 slab_reclaimable:14852 slab_unreclaimable:22486 mapped:24888 shmem:81356 pagetables:3717 bounce:0 free:10048 free_pcp:979 free_cma:0


Feb 12 16:45:47 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0 Feb 12 16:45:47 dom0 kernel: Xorg cpuset=/ mems_allowed=0 Feb 12 16:45:47 dom0 kernel: CPU: 0 PID: 12116 Comm: Xorg Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1 Feb 12 16:45:47 dom0 kernel: ffffc900015d7500 ffffffff813e8e03 ffffc900015d76d0 ffff880080a2db80 Feb 12 16:45:47 dom0 kernel: ffffc900015d7578 ffffffff81241efd 0000000000000003 0000000000000000 Feb 12 16:45:47 dom0 kernel: 0000138f262a0e5e 00000000002354d9 0000000000000000 0000000000000002 Feb 12 16:45:47 dom0 kernel: Call Trace: Feb 12 16:45:47 dom0 kernel: [] dump_stack+0x63/0x90 Feb 12 16:45:47 dom0 kernel: [] dump_header+0x7b/0x1fd Feb 12 16:45:47 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20 Feb 12 16:45:47 dom0 kernel: [] oom_kill_process+0x219/0x3e0 Feb 12 16:45:47 dom0 kernel: [] out_of_memory+0x143/0x4d0 Feb 12 16:45:47 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0 Feb 12 16:45:47 dom0 kernel: [] alloc_pages_nodemask+0x256/0x2c0 Feb 12 16:45:47 dom0 kernel: [] alloc_pages_current+0x95/0x140 Feb 12 16:45:47 dom0 kernel: [] kmalloc_order+0x18/0x40 Feb 12 16:45:47 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0 Feb 12 16:45:47 dom0 kernel: [] kmalloc+0x1f3/0x210 Feb 12 16:45:47 dom0 kernel: [] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915] Feb 12 16:45:47 dom0 kernel: [] alloc_gen8_temp_bitmaps+0x47/0x80 [i915] Feb 12 16:45:47 dom0 kernel: [] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915] Feb 12 16:45:47 dom0 kernel: [] ? find_lock_entry+0x5b/0x120 Feb 12 16:45:47 dom0 kernel: [] ? wake_up_common+0x52/0x90 Feb 12 16:45:47 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20 Feb 12 16:45:47 dom0 kernel: [] ? wake_up+0x44/0x50 Feb 12 16:45:47 dom0 kernel: [] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0 Feb 12 16:45:47 dom0 kernel: [] gen8_alloc_va_range+0x23e/0x490 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_vma_bind+0x6d/0x120 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_vma_do_pin+0x261/0x480 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915] Feb 12 16:45:47 dom0 kernel: [] ? shmem_add_to_page_cache+0x15c/0x2f0 Feb 12 16:45:47 dom0 kernel: [] ? lru_cache_add_file+0x37/0x40 Feb 12 16:45:47 dom0 kernel: [] i915_gem_execbuffer2+0xa2/0x1e0 [i915] Feb 12 16:45:47 dom0 kernel: [] drm_ioctl+0x21b/0x4d0 [drm] Feb 12 16:45:47 dom0 kernel: [] ? i915_gem_execbuffer+0x310/0x310 [i915] Feb 12 16:45:47 dom0 kernel: [] ? handle_mm_fault+0x826/0x1470 Feb 12 16:45:47 dom0 kernel: [] do_vfs_ioctl+0xa1/0x5e0 Feb 12 16:45:47 dom0 kernel: [] ? audit_syscall_entry+0xab/0xf0 Feb 12 16:45:47 dom0 kernel: [] ? syscall_trace_enter+0x1bd/0x2d0 Feb 12 16:45:47 dom0 kernel: [] SyS_ioctl+0x79/0x90 Feb 12 16:45:47 dom0 kernel: [] do_syscall_64+0x67/0x180 Feb 12 16:45:47 dom0 kernel: [] entry_SYSCALL64_slow_path+0x25/0x25 Feb 12 16:45:47 dom0 kernel: Mem-Info: Feb 12 16:45:47 dom0 kernel: active_anon:17970 inactive_anon:92310 isolated_anon:14 active_file:41560 inactive_file:42721 isolated_file:16 unevictable:41 dirty:1689 writeback:35 unstable:0 slab_reclaimable:14780 slab_unreclaimable:22538 mapped:26721 shmem:81388 pagetables:4164 bounce:0 free:8402 free_pcp:638 free_cma:0


Feb 12 16:45:47 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0 Feb 12 16:45:47 dom0 kernel: Xorg cpuset=/ mems_allowed=0 Feb 12 16:45:47 dom0 kernel: CPU: 6 PID: 12116 Comm: Xorg Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1 Feb 12 16:45:47 dom0 kernel: ffffc900015d7500 ffffffff813e8e03 ffffc900015d76d0 ffff8800b6020000 Feb 12 16:45:47 dom0 kernel: ffffc900015d7578 ffffffff81241efd 0000000000000003 0000000000000000 Feb 12 16:45:47 dom0 kernel: 0000000000000000 0000000000000000 ffff880134799b20 0000000000000003 Feb 12 16:45:47 dom0 kernel: Call Trace: Feb 12 16:45:47 dom0 kernel: [] dump_stack+0x63/0x90 Feb 12 16:45:47 dom0 kernel: [] dump_header+0x7b/0x1fd Feb 12 16:45:47 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20 Feb 12 16:45:47 dom0 kernel: [] oom_kill_process+0x219/0x3e0 Feb 12 16:45:47 dom0 kernel: [] out_of_memory+0x143/0x4d0 Feb 12 16:45:47 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0 Feb 12 16:45:47 dom0 kernel: [] alloc_pages_nodemask+0x256/0x2c0 Feb 12 16:45:47 dom0 kernel: [] alloc_pages_current+0x95/0x140 Feb 12 16:45:47 dom0 kernel: [] kmalloc_order+0x18/0x40 Feb 12 16:45:47 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0 Feb 12 16:45:47 dom0 kernel: [] kmalloc+0x1f3/0x210 Feb 12 16:45:47 dom0 kernel: [] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915] Feb 12 16:45:47 dom0 kernel: [] alloc_gen8_temp_bitmaps+0x47/0x80 [i915] Feb 12 16:45:47 dom0 kernel: [] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915] Feb 12 16:45:47 dom0 kernel: [] ? find_lock_entry+0x5b/0x120 Feb 12 16:45:47 dom0 kernel: [] ? wake_up_common+0x52/0x90 Feb 12 16:45:47 dom0 kernel: [] ? wake_up+0x44/0x50 Feb 12 16:45:47 dom0 kernel: [] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0 Feb 12 16:45:47 dom0 kernel: [] gen8_alloc_va_range+0x23e/0x490 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_vma_bind+0x6d/0x120 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_vma_do_pin+0x261/0x480 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915] Feb 12 16:45:47 dom0 kernel: [] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915] Feb 12 16:45:47 dom0 kernel: [] ? finish_task_switch+0x82/0x210 Feb 12 16:45:47 dom0 kernel: [] ? schedule+0x36/0x80 Feb 12 16:45:47 dom0 kernel: [] ? futex_wait_queue_me+0xd3/0x120 Feb 12 16:45:47 dom0 kernel: [] i915_gem_execbuffer2+0xa2/0x1e0 [i915] Feb 12 16:45:47 dom0 kernel: [] drm_ioctl+0x21b/0x4d0 [drm] Feb 12 16:45:47 dom0 kernel: [] ? i915_gem_execbuffer+0x310/0x310 [i915] Feb 12 16:45:47 dom0 kernel: [] ? do_futex+0x1f1/0x530 Feb 12 16:45:47 dom0 kernel: [] ? __switch_to+0x4d5/0x620 Feb 12 16:45:47 dom0 kernel: [] do_vfs_ioctl+0xa1/0x5e0 Feb 12 16:45:47 dom0 kernel: [] ? audit_syscall_entry+0xab/0xf0 Feb 12 16:45:47 dom0 kernel: [] ? syscall_trace_enter+0x1bd/0x2d0 Feb 12 16:45:47 dom0 kernel: [] SyS_ioctl+0x79/0x90 Feb 12 16:45:47 dom0 kernel: [] do_syscall_64+0x67/0x180 Feb 12 16:45:47 dom0 kernel: [] entry_SYSCALL64_slow_path+0x25/0x25 Feb 12 16:45:47 dom0 kernel: Mem-Info: Feb 12 16:45:47 dom0 kernel: active_anon:26661 inactive_anon:92991 isolated_anon:0 active_file:37370 inactive_file:35824 isolated_file:0 unevictable:41 dirty:1409 writeback:0 unstable:0 slab_reclaimable:14798 slab_unreclaimable:23004 mapped:30589 shmem:80121 pagetables:5140 bounce:0 free:7978 free_pcp:1045 free_cma:0


Feb 12 16:45:48 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Feb 12 16:45:48 dom0 kernel: Xorg cpuset=/ mems_allowed=0
Feb 12 16:45:48 dom0 kernel: CPU: 0 PID: 12116 Comm: Xorg Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1
Feb 12 16:45:48 dom0 kernel: ffffc900015d7500 ffffffff813e8e03 ffffc900015d76d0 ffff8800af28bd00
Feb 12 16:45:48 dom0 kernel: ffffc900015d7578 ffffffff81241efd 0000000000000003 0000000000000000
Feb 12 16:45:48 dom0 kernel: 0000000000000000 0000000000000000 ffff880134799b20 0000000000000004
Feb 12 16:45:48 dom0 kernel: Call Trace:
Feb 12 16:45:48 dom0 kernel: [] dump_stack+0x63/0x90
Feb 12 16:45:48 dom0 kernel: [] dump_header+0x7b/0x1fd
Feb 12 16:45:48 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Feb 12 16:45:48 dom0 kernel: [] oom_kill_process+0x219/0x3e0
Feb 12 16:45:48 dom0 kernel: [] out_of_memory+0x143/0x4d0
Feb 12 16:45:48 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0
Feb 12 16:45:48 dom0 kernel: [] alloc_pages_nodemask+0x256/0x2c0
Feb 12 16:45:48 dom0 kernel: [] alloc_pages_current+0x95/0x140
Feb 12 16:45:48 dom0 kernel: [] kmalloc_order+0x18/0x40
Feb 12 16:45:48 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0
Feb 12 16:45:48 dom0 kernel: [] kmalloc+0x1f3/0x210
Feb 12 16:45:48 dom0 kernel: [] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915]
Feb 12 16:45:48 dom0 kernel: [] alloc_gen8_temp_bitmaps+0x47/0x80 [i915]
Feb 12 16:45:48 dom0 kernel: [] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915]
Feb 12 16:45:48 dom0 kernel: [] ? find_lock_entry+0x5b/0x120
Feb 12 16:45:48 dom0 kernel: [] ? wake_up_common+0x52/0x90
Feb 12 16:45:48 dom0 kernel: [] ? wake_up+0x44/0x50
Feb 12 16:45:48 dom0 kernel: [] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0
Feb 12 16:45:48 dom0 kernel: [] gen8_alloc_va_range+0x23e/0x490 [i915]
Feb 12 16:45:48 dom0 kernel: [] i915_vma_bind+0x6d/0x120 [i915]
Feb 12 16:45:48 dom0 kernel: [] i915_vma_do_pin+0x261/0x480 [i915]
Feb 12 16:45:48 dom0 kernel: [] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915]
Feb 12 16:45:48 dom0 kernel: [] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915]
Feb 12 16:45:48 dom0 kernel: [] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915]
Feb 12 16:45:48 dom0 kernel: [] ? finish_task_switch+0x82/0x210
Feb 12 16:45:48 dom0 kernel: [] ? schedule+0x36/0x80
Feb 12 16:45:48 dom0 kernel: [] ? futex_wait_queue_me+0xd3/0x120
Feb 12 16:45:48 dom0 kernel: [] i915_gem_execbuffer2+0xa2/0x1e0 [i915]
Feb 12 16:45:48 dom0 kernel: [] drm_ioctl+0x21b/0x4d0 [drm]
Feb 12 16:45:48 dom0 kernel: [] ? i915_gem_execbuffer+0x310/0x310 [i915]
Feb 12 16:45:48 dom0 kernel: [] ? do_futex+0x1f1/0x530
Feb 12 16:45:48 dom0 kernel: [] ? __switch_to+0x4d5/0x620
Feb 12 16:45:48 dom0 kernel: [] do_vfs_ioctl+0xa1/0x5e0
Feb 12 16:45:48 dom0 kernel: [] ? audit_syscall_entry+0xab/0xf0
Feb 12 16:45:48 dom0 kernel: [] ? syscall_trace_enter+0x1bd/0x2d0
Feb 12 16:45:48 dom0 kernel: [] SyS_ioctl+0x79/0x90
Feb 12 16:45:48 dom0 kernel: [] do_syscall_64+0x67/0x180
Feb 12 16:45:48 dom0 kernel: [] entry_SYSCALL64_slow_path+0x25/0x25
Feb 12 16:45:48 dom0 kernel: Mem-Info:
Feb 12 16:45:48 dom0 kernel: active_anon:25109 inactive_anon:93114 isolated_anon:0 active_file:38118 inactive_file:37064 isolated_file:0 unevictable:41 dirty:2442 writeback:0 unstable:0 slab_reclaimable:14802 slab_unreclaimable:22984 mapped:29173 shmem:80121 pagetables:4932 bounce:0 free:8172 free_pcp:478 free_cma:0


Feb 12 16:45:55 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Feb 12 16:45:55 dom0 kernel: Xorg cpuset=/ mems_allowed=0
Feb 12 16:45:55 dom0 kernel: CPU: 0 PID: 12363 Comm: Xorg Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1
Feb 12 16:45:55 dom0 kernel: ffffc90003dbb500 ffffffff813e8e03 ffffc90003dbb6d0 ffff8801343a9e80
Feb 12 16:45:55 dom0 kernel: ffffc90003dbb578 ffffffff81241efd 0000000000000003 0000000000000000
Feb 12 16:45:55 dom0 kernel: 0000000000000000 0000000000000000 ffff880134799b20 0000000000000001
Feb 12 16:45:55 dom0 kernel: Call Trace:
Feb 12 16:45:55 dom0 kernel: [] dump_stack+0x63/0x90
Feb 12 16:45:55 dom0 kernel: [] dump_header+0x7b/0x1fd
Feb 12 16:45:55 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Feb 12 16:45:55 dom0 kernel: [] oom_kill_process+0x219/0x3e0
Feb 12 16:45:55 dom0 kernel: [] out_of_memory+0x143/0x4d0
Feb 12 16:45:55 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0
Feb 12 16:45:55 dom0 kernel: [] alloc_pages_nodemask+0x256/0x2c0
Feb 12 16:45:55 dom0 kernel: [] alloc_pages_current+0x95/0x140
Feb 12 16:45:55 dom0 kernel: [] kmalloc_order+0x18/0x40
Feb 12 16:45:55 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0
Feb 12 16:45:55 dom0 kernel: [] ? try_charge+0x352/0x710
Feb 12 16:45:55 dom0 kernel: [] kmalloc+0x1f3/0x210
Feb 12 16:45:55 dom0 kernel: [] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915]
Feb 12 16:45:55 dom0 kernel: [] alloc_gen8_temp_bitmaps+0x47/0x80 [i915]
Feb 12 16:45:55 dom0 kernel: [] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915]
Feb 12 16:45:55 dom0 kernel: [] ? lru_cache_add_file+0x37/0x40
Feb 12 16:45:55 dom0 kernel: [] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0
Feb 12 16:45:55 dom0 kernel: [] gen8_alloc_va_range+0x23e/0x490 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_vma_bind+0x6d/0x120 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_vma_do_pin+0x261/0x480 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915]
Feb 12 16:45:55 dom0 kernel: [] ? idr_get_empty_slot+0x199/0x3b0
Feb 12 16:45:55 dom0 kernel: [] ? idr_alloc+0x9a/0x110
Feb 12 16:45:55 dom0 kernel: [] ? drm_vma_node_allow+0xa4/0xc0 [drm]
Feb 12 16:45:55 dom0 kernel: [] i915_gem_execbuffer2+0xa2/0x1e0 [i915]
Feb 12 16:45:55 dom0 kernel: [] drm_ioctl+0x21b/0x4d0 [drm]
Feb 12 16:45:55 dom0 kernel: [] ? i915_gem_execbuffer+0x310/0x310 [i915]
Feb 12 16:45:55 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Feb 12 16:45:55 dom0 kernel: [] ? hrtimer_start_range_ns+0x1db/0x3f0
Feb 12 16:45:55 dom0 kernel: [] do_vfs_ioctl+0xa1/0x5e0
Feb 12 16:45:55 dom0 kernel: [] ? audit_syscall_entry+0xab/0xf0
Feb 12 16:45:55 dom0 kernel: [] ? syscall_trace_enter+0x1bd/0x2d0
Feb 12 16:45:55 dom0 kernel: [] SyS_ioctl+0x79/0x90
Feb 12 16:45:55 dom0 kernel: [] do_syscall_64+0x67/0x180
Feb 12 16:45:55 dom0 kernel: [] entry_SYSCALL64_slow_path+0x25/0x25
Feb 12 16:45:55 dom0 kernel: Mem-Info:
Feb 12 16:45:55 dom0 kernel: active_anon:14019 inactive_anon:87347 isolated_anon:0 active_file:41654 inactive_file:47708 isolated_file:0 unevictable:41 dirty:5316 writeback:37 unstable:0 slab_reclaimable:14196 slab_unreclaimable:22772 mapped:26386 shmem:79991 pagetables:3395 bounce:0 free:15369 free_pcp:392 free_cma:0


Feb 12 16:45:55 dom0 kernel: Xorg invoked oom-killer: gfp_mask=0x240c0d0(GFP_TEMPORARY|GFP_COMP|__GFP_ZERO), nodemask=0, order=3, oom_score_adj=0
Feb 12 16:45:55 dom0 kernel: Xorg cpuset=/ mems_allowed=0
Feb 12 16:45:55 dom0 kernel: CPU: 4 PID: 12363 Comm: Xorg Tainted: G U 4.9.56-21.pvops.qubes.x86_64 #1
Feb 12 16:45:55 dom0 kernel: ffffc90003dbb500 ffffffff813e8e03 ffffc90003dbb6d0 ffff8800af289e80
Feb 12 16:45:55 dom0 kernel: ffffc90003dbb578 ffffffff81241efd 0000000000000003 0000000000000000
Feb 12 16:45:55 dom0 kernel: 0000000000000000 0000000000000000 ffff880134799b20 0000000000000002
Feb 12 16:45:55 dom0 kernel: Call Trace:
Feb 12 16:45:55 dom0 kernel: [] dump_stack+0x63/0x90
Feb 12 16:45:55 dom0 kernel: [] dump_header+0x7b/0x1fd
Feb 12 16:45:55 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Feb 12 16:45:55 dom0 kernel: [] oom_kill_process+0x219/0x3e0
Feb 12 16:45:55 dom0 kernel: [] out_of_memory+0x143/0x4d0
Feb 12 16:45:55 dom0 kernel: [] alloc_pages_slowpath+0xbaa/0xbf0
Feb 12 16:45:55 dom0 kernel: [] alloc_pages_nodemask+0x256/0x2c0
Feb 12 16:45:55 dom0 kernel: [] alloc_pages_current+0x95/0x140
Feb 12 16:45:55 dom0 kernel: [] kmalloc_order+0x18/0x40
Feb 12 16:45:55 dom0 kernel: [] kmalloc_order_trace+0x24/0xa0
Feb 12 16:45:55 dom0 kernel: [] ? try_charge+0x352/0x710
Feb 12 16:45:55 dom0 kernel: [] kmalloc+0x1f3/0x210
Feb 12 16:45:55 dom0 kernel: [] ? alloc_gen8_temp_bitmaps+0x2e/0x80 [i915]
Feb 12 16:45:55 dom0 kernel: [] alloc_gen8_temp_bitmaps+0x47/0x80 [i915]
Feb 12 16:45:55 dom0 kernel: [] gen8_alloc_va_range_3lvl+0xa0/0x9b0 [i915]
Feb 12 16:45:55 dom0 kernel: [] ? lru_cache_add_file+0x37/0x40
Feb 12 16:45:55 dom0 kernel: [] ? xen_swiotlb_map_sg_attrs+0x1b3/0x2e0
Feb 12 16:45:55 dom0 kernel: [] gen8_alloc_va_range+0x23e/0x490 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_vma_bind+0x6d/0x120 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_vma_do_pin+0x261/0x480 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_gem_execbuffer_reserve_vma.isra.42+0x144/0x1b0 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_gem_execbuffer_reserve.isra.43+0x395/0x3c0 [i915]
Feb 12 16:45:55 dom0 kernel: [] i915_gem_do_execbuffer.isra.48+0x511/0x16b0 [i915]
Feb 12 16:45:55 dom0 kernel: [] ? idr_get_empty_slot+0x199/0x3b0
Feb 12 16:45:55 dom0 kernel: [] ? idr_alloc+0x9a/0x110
Feb 12 16:45:55 dom0 kernel: [] ? drm_vma_node_allow+0xa4/0xc0 [drm]
Feb 12 16:45:55 dom0 kernel: [] i915_gem_execbuffer2+0xa2/0x1e0 [i915]
Feb 12 16:45:55 dom0 kernel: [] drm_ioctl+0x21b/0x4d0 [drm]
Feb 12 16:45:55 dom0 kernel: [] ? i915_gem_execbuffer+0x310/0x310 [i915]
Feb 12 16:45:55 dom0 kernel: [] ? _raw_spin_unlock_irqrestore+0x1a/0x20
Feb 12 16:45:55 dom0 kernel: [] ? hrtimer_start_range_ns+0x1db/0x3f0
Feb 12 16:45:55 dom0 kernel: [] do_vfs_ioctl+0xa1/0x5e0
Feb 12 16:45:55 dom0 kernel: [] ? audit_syscall_entry+0xab/0xf0
Feb 12 16:45:55 dom0 kernel: [] ? syscall_trace_enter+0x1bd/0x2d0
Feb 12 16:45:55 dom0 kernel: [] SyS_ioctl+0x79/0x90
Feb 12 16:45:55 dom0 kernel: [] do_syscall_64+0x67/0x180
Feb 12 16:45:55 dom0 kernel: [] entry_SYSCALL64_slow_path+0x25/0x25
Feb 12 16:45:55 dom0 kernel: Mem-Info:
Feb 12 16:45:55 dom0 kernel: active_anon:13890 inactive_anon:87576 isolated_anon:0 active_file:41921 inactive_file:47429 isolated_file:0 unevictable:41 dirty:5329 writeback:37 unstable:0 slab_reclaimable:14194 slab_unreclaimable:22773 mapped:18981 shmem:79991 pagetables:3342 bounce:0 free:14979 free_pcp:699 free_cma:0

Polygonbugs commented 6 years ago

Have you tried the 4.14 kernel in dom0? It's in the testing repository.

@marmarek, I will test the 4.14 kernel after one more attempt to reproduce the issue on the current kernel.
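For anyone else who wants to try it, pulling the testing kernel into dom0 should look roughly like this (repository and package names quoted from memory, so treat this as a sketch rather than exact instructions):

    # in a dom0 terminal: install the kernel from the current-testing repository
    sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing kernel
    # list installed dom0 kernels; GRUB should pick the newest one on the next boot
    rpm -q kernel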

Polygonbugs commented 6 years ago

Happened again on the 4.9.56-21 kernel, this time while starting a new VM. I'm now testing the 4.14.18-1 kernel.

BTW (off-topic): why doesn't this new kernel show up in Qubes Global Settings or in the settings of each qube? They report that I'm still using 4.9.56-21, but "uname -r" shows that I'm running the new kernel. Also, the login screen seems to crash while the system is starting, though I can still log in...
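For reference, and as far as I understand it: Qubes Global Settings and the per-qube settings show the kernel used to boot the VMs, while dom0 boots whatever kernel GRUB selects, so the two can legitimately differ. A rough way to check both from a dom0 terminal (the exact option names may differ between R3.2 and R4.0):

    uname -r                       # kernel dom0 is actually running
    rpm -q kernel kernel-qubes-vm  # dom0 and VM kernel packages installed in dom0
    qubes-prefs | grep -i kernel   # default kernel used for VMs, i.e. what Global Settings shows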

3hhh commented 6 years ago

I noticed that my X server mostly crashes after a while when I use my docking station. So possibly a different issue.

Polygonbugs commented 6 years ago

I haven't hit the oom-killer so far on the 4.14.18-1 kernel. But my USB mouse (attached to sys-usb) suddenly stopped working while dom0 was refusing to give memory back. To be specific, I think the GUI stopped at that moment? I could still hear sound coming from my VM, though.

Polygonbugs commented 6 years ago

Today it happened again, but worse. The entire system suddenly froze, and it did not seem possible to recover without a reboot. It happened at exactly 17:07:18, while starting a VM with 4096 MB of memory assigned. I could not issue any command or even switch to a tty (using a PS/2 keyboard); the mouse stopped responding and the screen froze. I rebooted the machine manually (hardware reset) after approximately one minute.

After the reboot, I found nothing in journalctl between this event and the reboot. But I don't think the domUs died, because there was still some network traffic on the modem. BTW, here is some journalctl output showing that everything looked okay right before it happened.

Mar 08 17:07:17 dom0 kernel: audit: type=1100 audit(1520496437.757:35355): pid=15491 uid=1000 auid=1000 ses=1 msg='op=PAM:authentication grantors=pam_localuser acct="root" exe="/usr/sbin/userhelper" hostname=? addr=? terminal=? res=success'
Mar 08 17:07:17 dom0 kernel: audit: type=1101 audit(1520496437.757:35356): pid=15491 uid=1000 auid=1000 ses=1 msg='op=PAM:accounting grantors=pam_permit acct="root" exe="/usr/sbin/userhelper" hostname=? addr=? terminal=? res=success'
Mar 08 17:07:17 dom0 userhelper[15491]: running '/usr/sbin/hddtemp -n -q /dev/sda' with root privileges on behalf of 'singingstar'
Mar 08 17:07:18 dom0 audit[15493]: USER_AUTH pid=15493 uid=1000 auid=1000 ses=1 msg='op=PAM:authentication grantors=pam_localuser acct="root" exe="/usr/sbin/userhelper" hostname=? addr=? terminal=? res=success'
Mar 08 17:07:18 dom0 audit[15493]: USER_ACCT pid=15493 uid=1000 auid=1000 ses=1 msg='op=PAM:accounting grantors=pam_permit acct="root" exe="/usr/sbin/userhelper" hostname=? addr=? terminal=? res=success'
Mar 08 17:07:18 dom0 userhelper[15493]: running '/usr/sbin/hddtemp -n -q /dev/sda' with root privileges on behalf of 'singingstar'
-- Reboot --
Mar 08 17:10:12 dom0 systemd-journal[248]: Runtime journal is using 8.0M (max allowed 196.3M, trying to leave 294.4M free of 1.9G available → current limit 196.3M).
Mar 08 17:10:12 dom0 systemd-journal[248]: Runtime journal is using 8.0M (max allowed 196.3M, trying to leave 294.4M free of 1.9G available → current limit 196.3M).
Mar 08 17:10:12 dom0 kernel: Linux version 4.14.18-1.pvops.qubes.x86_64 (user@) (gcc version 5.3.1 20160406 (Red Hat 5.3.1-6) (GCC)) #1 SMP Thu Feb 8 19:37:36 UTC 2018

Note that there was a dependency problem while installing the new Qubes kernel.
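For the next occurrence, it might help to capture the dom0/Xen memory state right before starting the big VM. A quick sketch from a dom0 terminal (field names and the qmemman log path are quoted from memory):

    xl info | grep free_memory                    # memory Xen still has available for new VMs
    xl list                                       # per-domain memory, including dom0
    grep -E 'MemFree|MemAvailable' /proc/meminfo  # memory left inside dom0 itself
    tail -f /var/log/qubes/qmemman.log            # what qmemman decides to take away from dom0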

marmarek commented 6 years ago

@HW42 I'm pretty sure it's the block backend that takes whatever type of memory is "missing". Not exactly sure whether that is a bug in the block backend; it may be that the kernel should keep somewhat bigger reserves. This may also be related to using multiple queues (newer kernels use 4 per device, I think older ones use 1 or 2), so the newer kernel simply takes more memory.
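If that is the cause, one experiment (an assumption on my part, not a verified fix) would be to cap the number of blkback queues per device and see whether memory use returns to the old behaviour; if I remember correctly, newer kernels expose a max_queues module parameter for the backend:

    # check the current value, assuming xen_blkback exposes this parameter on the running kernel
    cat /sys/module/xen_blkback/parameters/max_queues
    # to try one queue per device, add the following to the dom0 kernel command line
    # (GRUB_CMDLINE_LINUX in /etc/default/grub), regenerate grub.cfg and reboot:
    #   xen_blkback.max_queues=1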

Polygonbugs commented 6 years ago

This is already fixed in Qubes 4.0. But the fix (or actually - the update procedure) is too fragile to backport it to 3.2, because it involve renaming /var/run/shm.id - which is referenced from various places.

@marmarek, is this the same case as the oom-killer one? I'm wondering: if it is a different case, how could it impact 4.0? (Though you've already implemented a solution in 4.0.) BTW, is there anything I can help you with?

marmarek commented 6 years ago

The oom-killer issue is separate from the one referred to above (restarting the X server and getting applications back). The oom-killer one is still an issue, with no known solution yet.

Polygonbugs commented 6 years ago

@marmarek, I think they happened under similar circumstances. They occurred when system RAM was almost exhausted, the OS had been powered on for at least an hour, and a VM with a large (>4096 MB) RAM assignment was being started. The last factor is not mandatory, since the oom-killer has also fired without starting a new VM, but the first one is. So, could increasing the system's physical RAM mitigate the problem? If so, adding RAM would improve system stability.
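More RAM can only help, but a cheaper experiment is to make dom0 keep a larger reserve. Besides the dom0-boost setting in /etc/qubes/qmemman.conf mentioned earlier in this thread (which did not help there), dom0's memory can be pinned at the Xen level so qmemman never balloons it down that far. A rough sketch, assuming a legacy-BIOS install and 4 GB reserved for dom0 (adjust to taste; this is a common mitigation, not a confirmed fix for this bug):

    # in dom0, append dom0_mem to the Xen command line in /etc/default/grub, e.g.:
    #   GRUB_CMDLINE_XEN_DEFAULT="... dom0_mem=4096M,max:4096M"
    # then regenerate the GRUB configuration and reboot:
    sudo grub2-mkconfig -o /boot/grub2/grub.cfg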

Polygonbugs commented 6 years ago

Well, some of these resolve themselves after a while, but the one that happened on Mar 8 forced me to manually reboot the system to recover. Apparently there has been no oom-killer so far after changing the kernel.

I'll test 4.0 once a stable release comes out. All system resources will be the same as on 3.2.

marmarek commented 6 years ago

This hasn't happened to me for over a month. Looks like some kernel update has fixed it.

andrewdavidwong commented 6 years ago

I just had another one of these:

Jun 20 21:18:28 dom0 kernel: Out of memory: Kill process 3388 (Xorg) score 9 or sacrifice child
Jun 20 21:18:28 dom0 kernel: Killed process 3388 (Xorg) total-vm:457280kB, anon-rss:37592kB, file-rss:6692kB, shmem-rss:99764kB
Jun 20 21:18:28 dom0 org.freedesktop.Notifications[3449]: xfce4-notifyd: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
Jun 20 21:18:28 dom0 kernel: do_trap: 27 callbacks suppressed
Jun 20 21:18:28 dom0 kernel: traps: xss-lock[3515] trap int3 ip:773dfa28d86b sp:7ffc850b08d0 error:0
Jun 20 21:18:28 dom0 kernel:  in libglib-2.0.so.0.4600.2[773dfa23d000+137000]
Jun 20 21:18:28 dom0 kernel: audit: type=1701 audit(1529547508.337:18574): auid=1000 uid=1000 gid=1001 ses=1 pid=3515 comm="xss-lock" exe="/usr/bin/xss-lock" sig=5
Jun 20 21:18:28 dom0 audit[3515]: ANOM_ABEND auid=1000 uid=1000 gid=1001 ses=1 pid=3515 comm="xss-lock" exe="/usr/bin/xss-lock" sig=5
Jun 20 21:18:28 dom0 org.a11y.atspi.Registry[3520]: XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
Jun 20 21:18:28 dom0 org.a11y.atspi.Registry[3520]: after 293257 requests (293257 known processed) with 0 events remaining.
Jun 20 21:18:28 dom0 kernel: oom_reaper: reaped process 3388 (Xorg), now anon-rss:0kB, file-rss:0kB, shmem-rss:99848kB
Jun 20 21:18:28 dom0 libvirtd[1651]: End of file while reading data: Input/output error
Jun 20 21:18:28 dom0 libvirtd[1651]: End of file while reading data: Input/output error
Jun 20 21:18:28 dom0 polkitd[3396]: Unregistered Authentication Agent for unix-session:1 (system bus name :1.16, object path /org/gnome/PolicyKit1/AuthenticationAgent, locale en_US.utf8) (di
Jun 20 21:18:28 dom0 org.gtk.vfs.Daemon[3449]: A connection to the bus can't be made
Jun 20 21:18:28 dom0 audit[3431]: USER_LOGOUT pid=3431 uid=0 auid=1000 ses=1 msg='op=logout id=1000 exe="/usr/sbin/lightdm" hostname=? addr=? terminal=/dev/tty1 res=success'
Jun 20 21:18:28 dom0 kernel: audit: type=1113 audit(1529547508.414:18575): pid=3431 uid=0 auid=1000 ses=1 msg='op=logout id=1000 exe="/usr/sbin/lightdm" hostname=? addr=? terminal=/dev/tty1 
Jun 20 21:18:28 dom0 lightdm[3431]: pam_unix(lightdm:session): session closed for user user
Jun 20 21:18:28 dom0 audit[3431]: USER_END pid=3431 uid=0 auid=1000 ses=1 msg='op=PAM:session_close grantors=pam_selinux,pam_loginuid,pam_selinux,pam_keyinit,pam_namespace,pam_keyinit,pam_li
Jun 20 21:18:28 dom0 kernel: audit: type=1106 audit(1529547508.440:18576): pid=3431 uid=0 auid=1000 ses=1 msg='op=PAM:session_close grantors=pam_selinux,pam_loginuid,pam_selinux,pam_keyinit,
Jun 20 21:18:28 dom0 audit[3431]: CRED_DISP pid=3431 uid=0 auid=1000 ses=1 msg='op=PAM:setcred grantors=pam_unix acct="user" exe="/usr/sbin/lightdm" hostname=? addr=? terminal=:0 res=success'
Jun 20 21:18:28 dom0 kernel: audit: type=1104 audit(1529547508.441:18577): pid=3431 uid=0 auid=1000 ses=1 msg='op=PAM:setcred grantors=pam_unix acct="user" exe="/usr/sbin/lightdm" hostname=? 
Jun 20 21:18:29 dom0 audit[22346]: CRED_ACQ pid=22346 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_permit acct="lightdm" exe="/usr/sbin/lightdm" hostname=? a
Jun 20 21:18:29 dom0 kernel: audit: type=1103 audit(1529547509.749:18578): pid=22346 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_permit acct="lightdm" exe="
Jun 20 21:18:29 dom0 lightdm[22346]: pam_unix(lightdm-greeter:session): session opened for user lightdm by (uid=0)
Jun 20 21:18:29 dom0 systemd[1]: Started Session c1210 of user lightdm.
Jun 20 21:18:29 dom0 systemd-logind[1629]: New session c1210 of user lightdm.
Jun 20 21:18:29 dom0 systemd[1]: Starting Session c1210 of user lightdm.
Jun 20 21:18:29 dom0 kernel: audit: type=1105 audit(1529547509.777:18579): pid=22346 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:session_open grantors=pam_unix,pam_systemd acct="lightdm
Jun 20 21:18:29 dom0 audit[22346]: USER_START pid=22346 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:session_open grantors=pam_unix,pam_systemd acct="lightdm" exe="/usr/sbin/lightdm" hos
Jun 20 21:18:30 dom0 systemd-coredump[22339]: Process 3515 (xss-lock) of user 1000 dumped core.

                                              Stack trace of thread 3515:
                                              #0  0x0000773dfa28d86b g_logv (libglib-2.0.so.0)
                                              #1  0x0000773dfa28d9df g_log (libglib-2.0.so.0)
                                              #2  0x00005642c449788d screensaver_event_cb (xss-lock)
                                              #3  0x00005642c44979d7 xcb_event_dispatch (xss-lock)
                                              #4  0x0000773dfa286e5a g_main_context_dispatch (libglib-2.0.so.0)
                                              #5  0x0000773dfa2871f0 g_main_context_iterate.isra.29 (libglib-2.0.so.0)
                                              #6  0x0000773dfa287512 g_main_loop_run (libglib-2.0.so.0)
                                              #7  0x00005642c4496c5e main (xss-lock)
                                              #8  0x0000773df9871580 __libc_start_main (libc.so.6)
                                              #9  0x00005642c4496d69 _start (xss-lock)

                                              Stack trace of thread 3522:
                                              #0  0x0000773df9947b7d poll (libc.so.6)
                                              #1  0x0000773dfa28718c g_main_context_iterate.isra.29 (libglib-2.0.so.0)
                                              #2  0x0000773dfa28729c g_main_context_iteration (libglib-2.0.so.0)
                                              #3  0x0000773dfa2872d9 glib_worker_main (libglib-2.0.so.0)
                                              #4  0x0000773dfa2ad8e5 g_thread_proxy (libglib-2.0.so.0)
                                              #5  0x0000773df8bda61a start_thread (libpthread.so.0)
                                              #6  0x0000773df99535fd __clone (libc.so.6)

                                              Stack trace of thread 3528:
                                              #0  0x0000773df9947b7d poll (libc.so.6)
                                              #1  0x0000773dfa28718c g_main_context_iterate.isra.29 (libglib-2.0.so.0)
                                              #2  0x0000773dfa287512 g_main_loop_run (libglib-2.0.so.0)
                                              #3  0x0000773dfa8a8516 gdbus_shared_thread_func (libgio-2.0.so.0)
                                              #4  0x0000773dfa2ad8e5 g_thread_proxy (libglib-2.0.so.0)
                                              #5  0x0000773df8bda61a start_thread (libpthread.so.0)
                                              #6  0x0000773df99535fd __clone (libc.so.6)
Jun 20 21:18:31 dom0 systemd-logind[1629]: Removed session c1.
andrewdavidwong commented 6 years ago

Happened again: https://gist.github.com/andrewdavidwong/ca46feda8a7c158773d9d2acb54752ae

indeed-a-genius commented 5 years ago

Having this issue on Linux Mint as well, kernel 4.18.0-13. Relevant errors:

gnome-terminal-[6663]: gnome-terminal-server: Fatal IO error 11 (Resource temporarily unavailable) on X server :0.
org.a11y.Bus[1458]: XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
dbus-daemon[10573]: [session uid=1000 pid=10571] Activated service 'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 exited with status 1
marmarek commented 5 years ago

This may be a fix: https://lists.xenproject.org/archives/html/xen-devel/2019-05/msg00273.html

andrewdavidwong commented 5 years ago

FWIW, I have not observed this on my system for a long time. Not sure if anything has changed, or I've just been lucky.

marmarek commented 5 years ago

I haven't seen it either. The patch linked above is scheduled for Linux 5.2, so it isn't the change that made it go away. But there are quite a lot of fixes in stable Linux kernels, and probably some of them fixed it. Let's close it for now; if it happens again, we can always reopen.