Open Spongman opened 1 year ago
this is the host crashing. CPU: i3-8100
[ 0.000000] Linux version 5.19.17-Unraid (root@Develop) (gcc (GCC) 12.2.0, GNU ld version 2.39-slack151) #2 SMP PREEMPT_DYNAMIC Wed Nov 2 11:54:15 PDT 2022
[ 0.000000] Command line: BOOT_IMAGE=/bzimage initrd=/bzroot
...
[ 98.835436] Setting dangerous option enable_guc - tainting kernel
[ 98.835440] Setting dangerous option force_probe - tainting kernel
[ 98.835916] i915 0000:00:02.0: [drm] Incompatible option enable_guc=3 - GuC submission is N/A
[ 98.836490] i915 0000:00:02.0: [drm] VT-d active for gfx access
[ 98.836588] Console: switching to colour dummy device 80x25
[ 98.836624] i915 0000:00:02.0: vgaarb: deactivate vga console
[ 98.836651] i915 0000:00:02.0: [drm] Transparent Hugepage mode 'huge=within_size'
[ 98.838131] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 98.841126] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)
[ 98.856491] i915 0000:00:02.0: [drm] GuC firmware i915/kbl_guc_70.1.1.bin version 70.1
[ 98.856495] i915 0000:00:02.0: [drm] HuC firmware i915/kbl_huc_4.0.0.bin version 4.0
[ 98.880358] i915 0000:00:02.0: [drm] HuC authenticated
[ 98.880362] i915 0000:00:02.0: [drm] GuC submission disabled
[ 98.880363] i915 0000:00:02.0: [drm] GuC SLPC disabled
[ 98.901249] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[ 98.903109] ACPI: video: Video Device [GFX0] (multi-head: yes rom: no post: no)
[ 98.903479] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input7
[ 98.937135] fbcon: i915drmfb (fb0) is primary device
[ 98.950311] Console: switching to colour frame buffer device 200x56
[ 98.967845] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[ 99.016331] i915 0000:00:02.0: Direct firmware load for i915/gvt/vid_0x8086_did_0x3e91_rid_0x00.golden_hw_state failed with error -2
[ 99.018239] i915 0000:00:02.0: MDEV: Registered
...
[ 225.318583] i915 0000:00:02.0: MDEV: Unregistering
[ 225.318663] BUG: kernel NULL pointer dereference, address: 0000000000000098
[ 225.318667] #PF: supervisor write access in kernel mode
[ 225.318670] #PF: error_code(0x0002) - not-present page
[ 225.318673] PGD 0 P4D 0
[ 225.318676] Oops: 0002 [#1] PREEMPT SMP PTI
[ 225.318679] CPU: 3 PID: 2849 Comm: rpc-libvirtd Tainted: G U 5.19.17-Unraid #2
[ 225.318684] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z390M-ITX/ac, BIOS P4.20 08/05/2019
[ 225.318688] RIP: 0010:rwsem_write_trylock+0x7/0x23
[ 225.318694] Code: cc cc cc 48 8b 47 08 a8 01 74 13 a8 02 75 0f 48 89 c2 48 83 ca 02 f0 48 0f b1 57 08 75 e9 c3 cc cc cc cc 31 c0 ba 01 00 00 00 <f0> 48 0f b1 17 0f 94 c0 75 0d 65 48 8b 14 25 c0 bb 01 00 48 89 57
[ 225.318701] RSP: 0018:ffffc900019d3be8 EFLAGS: 00010246
[ 225.318704] RAX: 0000000000000000 RBX: 0000000000000098 RCX: 0000000000000064
[ 225.318707] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000098
[ 225.318710] RBP: ffffc900019d3c90 R08: ffff888154795180 R09: 0000000080150012
[ 225.318713] R10: ffff888154795180 R11: 0000000000000000 R12: ffffffffa0956560
[ 225.318716] R13: ffff8881056c5d80 R14: ffff8881056c5d80 R15: ffff888105073260
[ 225.318720] FS: 0000153f3a7656c0(0000) GS:ffff88845e180000(0000) knlGS:0000000000000000
[ 225.318724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 225.318726] CR2: 0000000000000098 CR3: 00000001050d4006 CR4: 00000000003706e0
[ 225.318730] Call Trace:
[ 225.318733] <TASK>
[ 225.318734] __down_write_common+0x31/0x4e9
[ 225.318740] ? __slab_free+0x83/0x29a
[ 225.318746] simple_recursive_removal+0x3f/0x246
[ 225.318750] ? debug_mount+0x26/0x26
[ 225.318753] ? preempt_latency_start+0x2b/0x46
[ 225.318758] ? mntget+0x1c/0x25
[ 225.318761] debugfs_remove+0x40/0x5f
[ 225.318765] intel_gvt_debugfs_clean+0x15/0x24 [kvmgt]
[ 225.318776] intel_gvt_clean_device+0x4c/0xd2 [kvmgt]
[ 225.318787] intel_gvt_clean_device+0x20/0x53 [i915]
[ 225.318864] intel_gvt_driver_remove+0x1d/0x5b [i915]
[ 225.318931] i915_driver_remove+0x87/0xc9 [i915]
[ 225.318984] i915_pci_remove+0x1a/0x29 [i915]
[ 225.319037] pci_device_remove+0x33/0x89
[ 225.319043] device_release_driver_internal+0xbc/0x13e
[ 225.319048] unbind_store+0x54/0x74
[ 225.319051] kernfs_fop_write_iter+0x134/0x17f
[ 225.319056] new_sync_write+0x7c/0xbb
[ 225.319061] ? intel_pmu_lbr_read_64+0x115/0x232
[ 225.319066] vfs_write+0xda/0x129
[ 225.319069] ksys_write+0x76/0xc2
[ 225.319073] do_syscall_64+0x68/0x81
[ 225.319077] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 225.319082] RIP: 0033:0x153f3bf5d42f
[ 225.319085] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 19 2d f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 6c 2d f8 ff 48
[ 225.319092] RSP: 002b:0000153f3a764420 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 225.319096] RAX: ffffffffffffffda RBX: 0000000000000020 RCX: 0000153f3bf5d42f
[ 225.319099] RDX: 000000000000000c RSI: 0000153f2c037e70 RDI: 0000000000000020
[ 225.319102] RBP: 000000000000000c R08: 0000000000000000 R09: 0000153f2c03c930
[ 225.319105] R10: 0000000000000000 R11: 0000000000000293 R12: 0000153f2c037e70
[ 225.319108] R13: 0000000000000020 R14: 0000000000000000 R15: 0000153f3c6bc4e8
[ 225.319114] </TASK>
[ 225.319115] Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfsmd_mod kvmgt mdev i915 iosf_mbi drm_buddy ttm drm_display_helper drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc bonding tls igb i2c_algo_bit e1000e x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel nvme i2c_i801 crypto_simd wmi_bmof cryptd rapl intel_cstate intel_uncore i2c_smbus nvme_core i2c_core input_leds led_class joydev ahci intel_pch_thermal libahci wmi video backlight button acpi_tad acpi_pad unix [last unloaded: i2c_algo_bit]
[ 225.319205] CR2: 0000000000000098
[ 225.319208] ---[ end trace 0000000000000000 ]---
[ 225.934871] RIP: 0010:rwsem_write_trylock+0x7/0x23
[ 225.934879] Code: cc cc cc 48 8b 47 08 a8 01 74 13 a8 02 75 0f 48 89 c2 48 83 ca 02 f0 48 0f b1 57 08 75 e9 c3 cc cc cc cc 31 c0 ba 01 00 00 00 <f0> 48 0f b1 17 0f 94 c0 75 0d 65 48 8b 14 25 c0 bb 01 00 48 89 57
[ 225.934886] RSP: 0018:ffffc900019d3be8 EFLAGS: 00010246
[ 225.934890] RAX: 0000000000000000 RBX: 0000000000000098 RCX: 0000000000000064
[ 225.934893] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000098
[ 225.934896] RBP: ffffc900019d3c90 R08: ffff888154795180 R09: 0000000080150012
[ 225.934899] R10: ffff888154795180 R11: 0000000000000000 R12: ffffffffa0956560
[ 225.934902] R13: ffff8881056c5d80 R14: ffff8881056c5d80 R15: ffff888105073260
[ 225.934905] FS: 0000153f3a7656c0(0000) GS:ffff88845e180000(0000) knlGS:0000000000000000
[ 225.934909] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 225.934912] CR2: 0000000000000098 CR3: 00000001050d4006 CR4: 00000000003706e0
What are the steps to trigger this? It looks like you're unloading the i915 driver? Does it happen on the latest kernel?
I don't know. I'm using the Unraid GVT-g plugin. I don't think I can upgrade the kernel.