Xilinx / XRT

Run Time for AIE and FPGA based platforms
https://xilinx.github.io/XRT

Kernel panics when a kernel run is kicked off without downloading the bitstream. #725

Closed houlz0507 closed 3 years ago

houlz0507 commented 6 years ago

In sdaccel.ini, set xclbin_programing=false.

This triggers two panics:

[ 463.705341] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 463.706052] IP: [] xocl_create_bo_ioctl+0x65/0x2e0 [xocl]
[ 463.706766] PGD 800000042d2b9067 PUD 41c76f067 PMD 0
[ 463.707470] Oops: 0000 [#1] SMP
[ 463.708171] Modules linked in: xocl(OE) xclmgmt(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink sunrpc intel_powerclamp coretemp intel_rapl kvm ftdi_sio irqbypass joydev eeepc_wmi crc32_pclmul asus_wmi ghash_clmulni_intel sparse_keymap aesni_intel rfkill lrw gf128mul iTCO_wdt shpchp mxm_wmi glue_helper iTCO_vendor_support mei_me sg ablk_helper cryptd mei wmi i2c_i801 acpi_pad pcspkr binfmt_misc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic hid_logitech_dj i915 iosf_mbi i2c_algo_bit drm_kms_helper ahci syscopyarea sysfillrect sysimgblt fb_sys_fops
[ 463.710583] libahci e1000e drm libata crct10dif_pclmul crct10dif_common crc32c_intel ptp serio_raw pps_core i2c_hid video i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: xocl]
[ 463.712251] CPU: 0 PID: 3414 Comm: xil_gzip Kdump: loaded Tainted: G OE ------------ 3.10.0-862.11.6.el7.x86_64 #1
[ 463.713103] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 0607 01/08/2018
[ 463.713975] task: ffff9be4bbba2f70 ti: ffff9be462ed8000 task.ti: ffff9be462ed8000
[ 463.714835] RIP: 0010:[] [] xocl_create_bo_ioctl+0x65/0x2e0 [xocl]
[ 463.715705] RSP: 0018:ffff9be462edbd00 EFLAGS: 00010246
[ 463.716568] RAX: 0000000000000000 RBX: ffff9be3d330c018 RCX: 0000000000000000
[ 463.717438] RDX: 000000000001ceec RSI: ffffffff8fc7cda9 RDI: ffff9be1fa21e4a0
[ 463.718308] RBP: ffff9be462edbd30 R08: 0000000000000000 R09: ffff9be1fa21e4a0
[ 463.719177] R10: ffff9be0b618e300 R11: ffff9be462edbdc8 R12: ffff9be462edbdc8
[ 463.720040] R13: 0000000000000000 R14: ffff9be0b5f75e00 R15: ffff9be0b5f77000
[ 463.720898] FS: 00007f21043457c0(0000) GS:ffff9be4be200000(0000) knlGS:0000000000000000
[ 463.721757] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 463.722614] CR2: 0000000000000018 CR3: 0000000439326000 CR4: 00000000003607f0
[ 463.723472] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 463.724324] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 463.725171] Call Trace:
[ 463.726009] [] ? unlock_page+0x2b/0x30
[ 463.726852] [] ? xocl_create_bo+0x420/0x420 [xocl]
[ 463.727693] [] drm_ioctl_kernel+0x6c/0xb0 [drm]
[ 463.728517] [] drm_ioctl+0x1f0/0x450 [drm]
[ 463.729334] [] ? xocl_create_bo+0x420/0x420 [xocl]
[ 463.730156] [] ? handle_mm_fault+0x39d/0x9b0
[ 463.730964] [] ? vma_merge+0x252/0x370
[ 463.731769] [] do_vfs_ioctl+0x360/0x550
[ 463.732566] [] ? __do_page_fault+0x1bc/0x4f0
[ 463.733361] [] SyS_ioctl+0xa1/0xc0
[ 463.734154] [] system_call_fastpath+0x22/0x27
[ 463.734942] Code: 51 02 00 00 41 8b 54 24 0c 49 8b 34 24 e8 94 fb ff ff 48 3d 00 f0 ff ff 49 89 c7 0f 87 55 02 00 00 48 8b 83 68 04 00 00 45 85 ed <48> 8b 50 18 0f 84 f1 00 00 00 49 8b 87 f0 00 00 00 48 8b 48 08
[ 463.735844] RIP [] xocl_create_bo_ioctl+0x65/0x2e0 [xocl]
[ 463.736681] RSP
[ 463.737507] CR2: 0000000000000018
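
Reading this first trace: CR2 is 0x18, RAX is 0, and the faulting instruction marked in the Code line (`<48> 8b 50 18`) is a load from 0x18(%rax). That pattern is consistent with reading a field at offset 0x18 through a NULL pointer, plausibly some per-xclbin metadata that is never populated when the download is skipped. The sketch below is a user-space illustration of that failure mode and of a defensive check, not the actual xocl code: `topology_sketch`, `device_sketch`, `create_bo_sketch`, the field layout, and the `-ENODEV` return value are all hypothetical.

```c
#include <assert.h>  /* static_assert (C11) */
#include <errno.h>   /* ENODEV */
#include <stddef.h>  /* offsetof, NULL */
#include <stdint.h>  /* uint64_t */

/* Hypothetical stand-in for metadata that only exists once an xclbin
 * has been downloaded. The real structure is not shown in the log. */
struct topology_sketch {
    uint64_t pad[3];     /* offsets 0x00..0x17: placeholder fields */
    uint64_t bank_size;  /* offset 0x18: a read here through NULL faults at 0x18 */
};

/* Compile-time check that the field sits at the address reported in CR2. */
static_assert(offsetof(struct topology_sketch, bank_size) == 0x18,
              "field must sit at the faulting offset");

/* Hypothetical device state: topo stays NULL until a bitstream is loaded. */
struct device_sketch {
    struct topology_sketch *topo;
};

/* Returns 0 and stores the field in *out on success, or -ENODEV when no
 * bitstream metadata is present, instead of dereferencing NULL. */
static int create_bo_sketch(const struct device_sketch *dev, uint64_t *out)
{
    if (dev == NULL || dev->topo == NULL)
        return -ENODEV;  /* assumed error code, chosen for illustration */
    *out = dev->topo->bank_size;
    return 0;
}
```

Calling `create_bo_sketch` on a device whose `topo` is NULL returns `-ENODEV` to the caller rather than faulting, which is the behavior one would want from an ioctl invoked before any xclbin is loaded.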

[ 8586.009903] xocl_qdma 0000:01:00.1: validate: validate got unexpected command opcode(2)
[ 8586.010593] ------------[ cut here ]------------
[ 8586.011234] kernel BUG at /scratch/XRT/build/Debug/usr/src/xrt-2.1.0/driver/xclng/drm/xocl/userpf/../subdev/icap.c:2050!
[ 8586.011898] invalid opcode: 0000 [#1] SMP
[ 8586.012563] Modules linked in: xclmgmt(OE) xocl(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink sunrpc intel_powerclamp coretemp intel_rapl kvm irqbypass ftdi_sio crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul joydev glue_helper sg ablk_helper eeepc_wmi asus_wmi cryptd sparse_keymap iTCO_wdt rfkill iTCO_vendor_support mxm_wmi acpi_pad mei_me wmi shpchp i2c_i801 mei pcspkr binfmt_misc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic hid_logitech_dj i915 iosf_mbi i2c_algo_bit drm_kms_helper ahci syscopyarea sysfillrect sysimgblt fb_sys_fops
[ 8586.014828] libahci e1000e drm libata crct10dif_pclmul crct10dif_common crc32c_intel ptp serio_raw pps_core i2c_hid video i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: xclmgmt]
[ 8586.016393] CPU: 2 PID: 14450 Comm: xil_gzip Kdump: loaded Tainted: G OE ------------ 3.10.0-862.11.6.el7.x86_64 #1
[ 8586.017177] Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 0607 01/08/2018
[ 8586.017971] task: ffff8fb439b40000 ti: ffff8fb42479c000 task.ti: ffff8fb42479c000
[ 8586.018771] RIP: 0010:[] [] icap_lock_bitstream+0x193/0x1a0 [xocl]
[ 8586.019582] RSP: 0018:ffff8fb42479fc38 EFLAGS: 00010246
[ 8586.020383] RAX: 0000000000000000 RBX: 0000000000003872 RCX: 000000000000000f
[ 8586.021188] RDX: 000000000000000f RSI: ffffffffa900b090 RDI: ffff8fb0a185b558
[ 8586.021995] RBP: ffff8fb42479fc68 R08: 0000000000000000 R09: 0000000000000000
[ 8586.022804] R10: 00000000000005f7 R11: ffffbb3100819ff8 R12: ffff8fb0a185b558
[ 8586.023616] R13: ffff8fb42479fdc8 R14: ffff8fb42377fe58 R15: ffff8fb0a18549c0
[ 8586.024431] FS: 00007fe77856d7c0(0000) GS:ffff8fb43e280000(0000) knlGS:0000000000000000
[ 8586.025253] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8586.026079] CR2: 00007fe778599000 CR3: 00000002eaaec000 CR4: 00000000003607e0
[ 8586.026943] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8586.027777] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8586.028610] Call Trace:
[ 8586.029441] [] xocl_client_lock_bitstream.part.2+0x5a/0x200 [xocl]
[ 8586.030275] [] xocl_execbuf_ioctl+0x1f4/0x2c0 [xocl]
[ 8586.031102] [] ? xocl_info_ioctl+0xe0/0xe0 [xocl]
[ 8586.031940] [] drm_ioctl_kernel+0x6c/0xb0 [drm]
[ 8586.032774] [] ? generic_file_aio_write+0x77/0xa0
[ 8586.033612] [] drm_ioctl+0x1f0/0x450 [drm]
[ 8586.034417] [] ? xocl_info_ioctl+0xe0/0xe0 [xocl]
[ 8586.035248] [] ? do_sync_write+0x93/0xe0
[ 8586.036086] [] do_vfs_ioctl+0x360/0x550
[ 8586.036920] [] ? __sb_end_write+0x31/0x60
[ 8586.037757] [] ? vfs_write+0x182/0x1f0
[ 8586.038589] [] SyS_ioctl+0xa1/0xc0
[ 8586.039397] [] system_call_fastpath+0x22/0x27
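
This second trace is a deliberate assertion, not a stray memory access: "kernel BUG at ...icap.c:2050" followed by "invalid opcode" is how a BUG()/BUG_ON() hit inside icap_lock_bitstream shows up on x86. The log does not reveal which condition was asserted. Assuming, purely for illustration, that it relates to no bitstream being loaded, one common alternative is to return an error up through the execbuf ioctl path instead of halting the kernel. Everything in this user-space sketch (`icap_sketch`, `lock_bitstream_sketch`, `execbuf_sketch`, the fields, and the error codes) is hypothetical and not taken from the xocl source.

```c
#include <errno.h>    /* EINVAL, ENODEV */
#include <stdbool.h>  /* bool */
#include <stddef.h>   /* NULL */

/* Hypothetical ICAP state; the real driver structure is not shown in the log. */
struct icap_sketch {
    bool bitstream_loaded;  /* false when xclbin programming was skipped */
    int  lock_refs;         /* number of clients holding the bitstream lock */
};

/* Takes a reference on the loaded bitstream. Where a fatal assertion would
 * fire (assumed condition: nothing loaded), report -ENODEV instead. */
static int lock_bitstream_sketch(struct icap_sketch *icap)
{
    if (icap == NULL)
        return -EINVAL;
    if (!icap->bitstream_loaded)
        return -ENODEV;
    icap->lock_refs++;
    return 0;
}

/* Mirrors the call order seen in the trace: the execbuf path locks the
 * bitstream first and propagates any failure to the ioctl caller. */
static int execbuf_sketch(struct icap_sketch *icap)
{
    int ret = lock_bitstream_sketch(icap);
    if (ret != 0)
        return ret;
    /* A real driver would queue the command here. */
    return 0;
}
```

With this shape, an application that submits a command before any xclbin is loaded would see a failed ioctl rather than take down the host.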

uday610 commented 5 years ago

@houlz0507, is this fixed in 2018.3?

keryell commented 5 years ago

@stsoe What is the status of this?

stsoe commented 5 years ago

No idea?

keryell commented 5 years ago

@stsoe You put it on the https://github.com/Xilinx/XRT/milestone/2 milestone, which shipped a long time ago, so I was hoping it was no longer an issue. :-)

uday610 commented 3 years ago

No longer an issue.

uday610 commented 3 years ago

@houlz0507, do you know anything about this issue? Is it still relevant?

houlz0507 commented 3 years ago

I tried this. The issue seems to be gone.