Closed djwatson closed 8 years ago
This one looks like a race between receiving a new message and a local close(). You can see it in the interleaved logs:
[ 372.859537] tls: --> tls_rx_async_work
[ 372.860805] tls: --> tls_peek_data
[ 373.860974] tls:
[ 373.861021] tls: --> tls_release
[ 373.861023] tls: --> tls_free_sendpage_ctx
[ 373.861024] tls: --> tls_sock_destruct
[ 373.861026] tls: parallel executions: 2
[ 373.867009] --> tls_data_ready
[ 373.868162] tls: --> tls_rx_async_work
[ 373.870335] tls: --> tls_peek_data
[ 373.871089] BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
[ 373.873204] IP: [] sock_recvmsg+0x45/0x60
[ 373.874106] PGD 0
[ 373.874430] ------------[ cut here ]------------
[ 373.875749] Oops: 0000 [#1] SMP
[ 373.876466] Last file read FILE* = ffff8823f8401b00
[ 373.887355] Modules linked in: sha256_generic drbg af_ktls(O) tcp_diag inet_diag ip6table_filter xt_NFLOG xt_comment iptable_filter netconsole autofs4 hwmon_vid w83795 i2c_piix4 rpcsec_gss_krb5 auth_rpcgss oid_registry dm_mod loop sg serio_raw iTCO_wdt iTCO_vendor_support e1000e ipmi_devintf x86_pkg_temp_thermal coretemp kvm irqbypass crc32c_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 pcspkr i2c_i801 i2c_core lpc_ich mfd_core ehci_pci ehci_hcd ipmi_si ipmi_msghandler shpchp button
[ 373.896046] CPU: 2 PID: 333 Comm: kworker/2:1 Tainted: G O 4.6.0-rc6_00054_g5294e32 #117
[ 373.897631] Hardware name: Quanta Freedom/Winterfell, BIOS F03_3B09 05/22/2014
[ 373.898850] Workqueue: ktls tls_rx_async_work [af_ktls]
[ 373.899694] task: ffff881228b96200 ti: ffff881228470000 task.ti: ffff881228470000
[ 373.901004] RIP: 0010:[] [] sock_recvmsg+0x45/0x60
[ 373.903504] RSP: 0018:ffff881228473a88 EFLAGS: 00010246
[ 373.904457] RAX: 0000000000000000 RBX: ffff8811fd726400 RCX: 0000000000000042
[ 373.912814] RDX: 000000000000401d RSI: ffff881228473b28 RDI: ffff8811fd726400
[ 373.913964] RBP: ffff881228473aa8 R08: 000000000000401d R09: 0000000000000042
[ 373.915112] R10: 000000000000003f R11: 0000000000000259 R12: ffff881228473b28
[ 373.916352] R13: 000000000000401d R14: 0000000000000042 R15: 0000000000000042
[ 373.917616] FS: 0000000000000000(0000) GS:ffff881237880000(0000) knlGS:0000000000000000
[ 373.920577] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 373.921502] CR2: 0000000000000090 CR3: 0000000001e06000 CR4: 00000000001406e0
[ 373.922753] Stack:
[ 373.923089] ffff881228474000 ffff881228473b28 000000000000401d ffffffffffffffff
[ 373.924293] ffff881228473af8 ffffffff817de7b9 ffff881228473b58 ffff8811fd726400
[ 373.925505] 0000000000000001 ffff881225fcd000 ffff881225fcd94b 0000000000000042
[ 373.926700] Call Trace:
[ 373.927212] [] kernel_recvmsg+0x69/0x90
[ 373.928145] [] tls_peek_data+0xab/0x200 [af_ktls]
[ 373.930195] [] tls_rx_async_work+0x113/0x1b0 [af_ktls]
[ 373.931298] [] process_one_work+0x16c/0x4f0
[ 373.937924] [] ? __schedule+0x36a/0x9d0
[ 373.946157] [] ? schedule+0x40/0xb0
Indeed, the bound socket should be locked when close(2) is called. The AF_KTLS socket itself cannot be locked in the async worker, because that can deadlock with kernel_recvmsg().
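To make the ordering problem concrete, here is a small userspace model of the interleaving in the log. It is only a sketch, not the af_ktls code: `struct tls_model`, `model_release()`, `model_rx_work()` and `run_demo()` are hypothetical names, and a pthread mutex stands in for whatever lock ends up guarding the bound socket. The point is just that the release path detaches the bound socket under the lock, and the worker re-checks it under the same lock before touching it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these names exist in af_ktls. */
struct bound_sock {
    int pending_bytes;          /* data waiting on the underlying socket */
};

struct tls_model {
    pthread_mutex_t lock;       /* models the lock guarding the bound socket */
    struct bound_sock *bound;   /* NULL once the release path has run */
};

/* Models the close(2) path: detach the bound socket under the lock. */
void model_release(struct tls_model *t)
{
    pthread_mutex_lock(&t->lock);
    t->bound = NULL;
    pthread_mutex_unlock(&t->lock);
}

/* Models the async rx worker: returns bytes seen, or -1 if the socket is gone. */
int model_rx_work(struct tls_model *t)
{
    int ret;

    pthread_mutex_lock(&t->lock);
    if (t->bound == NULL)
        ret = -1;               /* release ran first; do not dereference */
    else
        ret = t->bound->pending_bytes;
    pthread_mutex_unlock(&t->lock);
    return ret;
}

/* Replays the order seen in the log: worker, release, late worker. */
int run_demo(void)
{
    struct bound_sock s = { .pending_bytes = 16 };
    struct tls_model t;
    int before, after;

    pthread_mutex_init(&t.lock, NULL);
    t.bound = &s;

    before = model_rx_work(&t);  /* socket still attached */
    model_release(&t);           /* close() happens locally */
    after = model_rx_work(&t);   /* late data_ready-triggered work */

    pthread_mutex_destroy(&t.lock);
    return (before == 16 && after == -1) ? 0 : 1;
}
```

Without the check in `model_rx_work()`, the second call would dereference a NULL pointer, which is the same shape as the oops above. In the kernel the release path would presumably also need to stop queued work from running afterwards (for example with `cancel_work_sync()`), but that part is an assumption and not something confirmed in this thread.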