openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

kernel NULL pointer dereference when using L2arc #6945

Closed twarberg closed 6 years ago

twarberg commented 6 years ago

System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  16.04 LTS
Linux Kernel          4.10.0-42-generic
Architecture          x86_64
ZFS Version           0.7.3
SPL Version           0.7.3

Describe the problem you're observing

We're running on Google Cloud and are trying to use a local NVMe SSD for L2arc, but we run into this crash after 7-12 days. We've seen it on two different systems, a MySQL server and a Postgres server. It happened on the Postgres server with both 0.7.1 and 0.7.2, roughly a week apart (twice on 0.7.1 and once on 0.7.2). Unfortunately we do not have console logs for those crashes, which is why we haven't reported it until now. We decided to give 0.7.3 a try after upgrading the MySQL server, and got a crash after 10 days. Both systems are busy 24/7; the Postgres server is write-heavy and the MySQL server is read-heavy, with Postgres seeing at least 2x as much disk activity. The Postgres server has been running stable without L2arc for months.

Common settings

Describe how to reproduce the problem

Run a busy MySQL or Postgres server on a Google Cloud instance with a local NVMe SSD as L2arc for 7-12 days.
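
Because the reproduction takes days, it may help to record L2arc counters periodically so the state shortly before a crash is available afterwards. The following is a minimal sketch, assuming the usual two header lines followed by `name type data` rows in `/proc/spl/kstat/zfs/arcstats`. The sample text and all numbers in it are made up for illustration and are not taken from the affected hosts; `parse_arcstats` and `l2_summary` are hypothetical helper names.

```python
from typing import Dict

# Hypothetical excerpt in the layout of /proc/spl/kstat/zfs/arcstats.
# All values are invented for illustration only.
SAMPLE_ARCSTATS = """\
13 1 0x01 96 4608 9163848839 896673398682000
name                            type data
hits                            4    123456
misses                          4    7890
l2_hits                         4    4321
l2_misses                       4    3569
l2_size                         4    107374182400
l2_hdr_size                     4    268435456
"""


def parse_arcstats(text: str) -> Dict[str, int]:
    """Parse 'name type data' rows into {name: data}, skipping the two header lines."""
    stats: Dict[str, int] = {}
    for line in text.splitlines()[2:]:
        parts = line.split()
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def l2_summary(stats: Dict[str, int]) -> Dict[str, float]:
    """Reduce the raw counters to a few L2arc figures worth logging over time."""
    lookups = stats.get("l2_hits", 0) + stats.get("l2_misses", 0)
    return {
        "l2_size_gib": stats.get("l2_size", 0) / 2**30,
        "l2_hdr_mib": stats.get("l2_hdr_size", 0) / 2**20,
        "l2_hit_ratio": stats.get("l2_hits", 0) / lookups if lookups else 0.0,
    }


if __name__ == "__main__":
    print(l2_summary(parse_arcstats(SAMPLE_ARCSTATS)))
```

On a live system the sample string would be replaced by the contents of `/proc/spl/kstat/zfs/arcstats`, read at a fixed interval and appended to a log file that survives a reboot.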

Include any warning/errors/backtraces from the system logs

[896673.398682] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[896673.406756] IP: kmem_cache_alloc+0x77/0x1b0
[896673.411138] PGD b1f993067
[896673.411139] PUD 6d76be067
[896673.414041] PMD 0
[896673.416944]
[896673.420828] Oops: 0000 [#1] SMP
[896673.424157] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[896673.481944] CPU: 5 PID: 19051 Comm: arc_reclaim Tainted: P           OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[896673.492129] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[896673.501532] task: ffff92dd83781d40 task.stack: ffffbb4507004000
[896673.507640] RIP: 0010:kmem_cache_alloc+0x77/0x1b0
[896673.512528] RSP: 0018:ffffbb4507007c38 EFLAGS: 00010206
[896673.517938] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[896673.525258] RDX: 0000000090eebb29 RSI: 0000000001404220 RDI: 000000000001c6a0
[896673.532579] RBP: ffffbb4507007c68 R08: ffff92ddbfd5c6a0 R09: 0000000000000018
[896673.539899] R10: ffff92dd60bac7b0 R11: 0000000000000000 R12: 0000000001404220
[896673.547223] R13: ffffffffc06a12b2 R14: ffff92dd8b0037c0 R15: ffff92dd8b0037c0
[896673.554550] FS:  0000000000000000(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[896673.562825] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[896673.568757] CR2: 0000000000000018 CR3: 00000006c8c6d000 CR4: 00000000001406e0
[896673.576078] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[896673.583398] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[896673.590719] Call Trace:
[896673.593366]  spl_kmem_cache_alloc+0x72/0x7d0 [spl]
[896673.598348]  ? kmem_cache_free+0x1cd/0x1e0
[896673.602675]  ? arc_state_multilist_index_func+0x4c/0x60 [zfs]
[896673.608630]  arc_hdr_realloc+0x31/0x270 [zfs]
[896673.613217]  arc_evict_state+0x4fb/0x870 [zfs]
[896673.617867]  arc_adjust+0x4af/0x690 [zfs]
[896673.622068]  ? kvm_clock_get_cycles+0x1e/0x20
[896673.626637]  arc_reclaim_thread+0xab/0x280 [zfs]
[896673.631458]  ? arc_shrink+0xb0/0xb0 [zfs]
[896673.635666]  thread_generic_wrapper+0x72/0x80 [spl]
[896673.640733]  kthread+0x109/0x140
[896673.644149]  ? __thread_exit+0x20/0x20 [spl]
[896673.648607]  ? kthread_create_on_node+0x60/0x60
[896673.653334]  ret_from_fork+0x2c/0x40
[896673.657096] Code: 08 65 4c 03 05 2b e1 be 61 49 83 78 10 00 4d 8b 08 0f 84 f8 00 00 00 4d 85 c9 0f 84 ef 00 00 00 49 63 47 20 48 8d 4a 01 49 8b 3f <49> 8b 1c 01 4c 89 c8 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[896673.676170] RIP: kmem_cache_alloc+0x77/0x1b0 RSP: ffffbb4507007c38
[896673.682537] CR2: 0000000000000018
[896673.686040] ---[ end trace 4de68d0227766242 ]---
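
One way to sanity-check the register dump above: the faulting bytes in the `Code:` line, `<49> 8b 1c 01`, appear to decode to `mov (%r9,%rax,1),%rbx`, so the fault address should equal R09 + RAX. The sketch below checks that arithmetic against CR2, and also checks that GS base + RDI equals R08 and that RCX equals RDX + 1. Reading these as the SLUB per-CPU slab pointer, its freelist value, and the transaction id is an assumption about the 4.10 allocator fast path, not something the log states. `parse_registers` is a hypothetical helper.

```python
import re

# Register lines copied from the first oops above (timestamps dropped).
OOPS_1_REGS = """\
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
RDX: 0000000090eebb29 RSI: 0000000001404220 RDI: 000000000001c6a0
RBP: ffffbb4507007c68 R08: ffff92ddbfd5c6a0 R09: 0000000000000018
FS:  0000000000000000(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
CR2: 0000000000000018 CR3: 00000006c8c6d000 CR4: 00000000001406e0
"""


def parse_registers(text):
    """Return {register name: integer value} for each 'NAME: <16 hex digits>' pair."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}):\s*([0-9a-f]{16})", text)
    return {name: int(value, 16) for name, value in pairs}


regs = parse_registers(OOPS_1_REGS)

# Assumed decode of the faulting instruction: mov (%r9,%rax,1),%rbx
fault_addr = regs["R09"] + regs["RAX"]

# Assumption: RDI holds a per-CPU offset and R08 the resolved per-CPU address.
percpu_addr = regs["GS"] + regs["RDI"]

print("computed fault address:", hex(fault_addr), "matches CR2:", fault_addr == regs["CR2"])
print("GS base + RDI:", hex(percpu_addr), "matches R08:", percpu_addr == regs["R08"])
print("RCX == RDX + 1:", regs["RCX"] == regs["RDX"] + 1)
```

If that reading is right, the value 0x18 was loaded as a pointer from a per-CPU allocator structure rather than computed as a field offset from a NULL base, which would point toward corrupted allocator state on that CPU. This is an interpretation, not a confirmed diagnosis.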

[896674.140835] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[896674.149026] IP: kmem_cache_alloc_trace+0x7b/0x1c0
[896674.153915] PGD b1f993067
[896674.153915] PUD 6d76be067
[896674.156816] PMD 0
[896674.159809]
[896674.163680] Oops: 0000 [#2] SMP
[896674.167006] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[896674.224935] CPU: 5 PID: 25120 Comm: kthreadd Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[896674.234862] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[896674.244267] task: ffff92d5aefaba80 task.stack: ffffbb450f680000
[896674.250376] RIP: 0010:kmem_cache_alloc_trace+0x7b/0x1c0
[896674.255785] RSP: 0018:ffffbb450f683eb8 EFLAGS: 00010206
[896674.261195] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[896674.268513] RDX: 0000000090eebb29 RSI: 00000000014000c0 RDI: 000000000001c6a0
[896674.275844] RBP: ffffbb450f683ef8 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[896674.283165] R10: 0000000000000018 R11: 0000000000000000 R12: 00000000014000c0
[896674.290488] R13: ffffffff9e2a8d26 R14: ffff92d603fc8ba0 R15: ffff92dd8b0037c0
[896674.297807] FS:  0000000000000000(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[896674.306082] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[896674.312014] CR2: 0000000000000018 CR3: 00000006c8c6d000 CR4: 00000000001406e0
[896674.319335] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[896674.326665] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[896674.333985] Call Trace:
[896674.336624]  ? kthread_create_on_node+0x60/0x60
[896674.341340]  kthread+0x46/0x140
[896674.344673]  ? taskq_cancel_id+0x130/0x130 [spl]
[896674.349475]  ? kthread_create_on_node+0x60/0x60
[896674.354194]  ret_from_fork+0x2c/0x40
[896674.357957] Code: 08 65 4c 03 05 77 e3 be 61 49 83 78 10 00 4d 8b 10 0f 84 f0 00 00 00 4d 85 d2 0f 84 e7 00 00 00 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[896674.377023] RIP: kmem_cache_alloc_trace+0x7b/0x1c0 RSP: ffffbb450f683eb8
[896674.383910] CR2: 0000000000000018
[896674.387412] ---[ end trace 4de68d0227766243 ]---

[896678.440616] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[896678.448691] IP: __kmalloc+0xbc/0x200
[896678.452453] PGD 0
[896678.452454]
[896678.456322] Oops: 0000 [#3] SMP
[896678.459646] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[896678.517434] CPU: 5 PID: 889 Comm: z_wr_iss Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[896678.527186] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[896678.536591] task: ffff92dd6014d7c0 task.stack: ffffbb450f4a4000
[896678.542701] RIP: 0010:__kmalloc+0xbc/0x200
[896678.546984] RSP: 0018:ffffbb450f4a7be8 EFLAGS: 00010206
[896678.552393] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[896678.559712] RDX: 0000000090eebb29 RSI: 0000000000000000 RDI: 000000000001c6a0
[896678.567032] RBP: ffffbb450f4a7c20 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[896678.574353] R10: 0000000000000018 R11: fffff2ae44c880a0 R12: 0000000001400200
[896678.581671] R13: 0000000000000060 R14: ffffffff9e65bc59 R15: ffff92dd8b0037c0
[896678.588989] FS:  0000000000000000(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[896678.597263] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[896678.603194] CR2: 0000000000000018 CR3: 000000097f609000 CR4: 00000000001406e0
[896678.610514] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[896678.617834] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[896678.625153] Call Trace:
[896678.627791]  sg_kmalloc+0x19/0x30
[896678.631294]  __sg_alloc_table+0xfd/0x160
[896678.635402]  ? sg_free_table+0x70/0x70
[896678.639337]  sg_alloc_table+0x22/0x90
[896678.643230]  abd_alloc+0x230/0x470 [zfs]
[896678.647354]  arc_hdr_alloc_pabd+0xe7/0xf0 [zfs]
[896678.652085]  arc_write_ready+0x135/0x2f0 [zfs]
[896678.657080]  ? pick_next_task_fair+0x108/0x4d0
[896678.661712]  ? mutex_lock+0x12/0x40
[896678.665447]  zio_ready+0x65/0x460 [zfs]
[896678.669475]  ? tsd_get_by_thread+0x2e/0x40 [spl]
[896678.674279]  ? taskq_member+0x18/0x30 [spl]
[896678.678683]  zio_execute+0x8a/0xe0 [zfs]
[896678.682793]  taskq_thread+0x260/0x460 [spl]
[896678.687162]  ? wake_up_q+0x70/0x70
[896678.690750]  kthread+0x109/0x140
[896678.694166]  ? taskq_cancel_id+0x130/0x130 [spl]
[896678.698971]  ? kthread_create_on_node+0x60/0x60
[896678.703688]  ret_from_fork+0x2c/0x40
[896678.707451] Code: 08 65 4c 03 05 56 cd be 61 49 83 78 10 00 4d 8b 10 0f 84 d5 00 00 00 4d 85 d2 0f 84 cc 00 00 00 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[896678.726517] RIP: __kmalloc+0xbc/0x200 RSP: ffffbb450f4a7be8
[896678.732273] CR2: 0000000000000018
[896678.735774] ---[ end trace 4de68d0227766244 ]---

[896683.127883] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[896683.135948] IP: kmem_cache_alloc_node_trace+0xd7/0x1d0
[896683.141267] PGD 91ddb7067
[896683.141268] PUD b7aa63067
[896683.144157] PMD 0
[896683.147045]
[896683.150912] Oops: 0000 [#4] SMP
[896683.154236] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[896683.212018] CPU: 5 PID: 17027 Comm: mysqld Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[896683.221770] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[896683.231172] task: ffff92dcc9abd7c0 task.stack: ffffbb452dd00000
[896683.237283] RIP: 0010:kmem_cache_alloc_node_trace+0xd7/0x1d0
[896683.243124] RSP: 0018:ffffbb452dd03c28 EFLAGS: 00010246
[896683.248534] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[896683.255852] RDX: 0000000090eebb29 RSI: 00000000014000c0 RDI: 000000000001c6a0
[896683.263170] RBP: ffffbb452dd03c70 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[896683.270487] R10: 0000000000000018 R11: ffff92dcc9abd7c0 R12: 00000000014000c0
[896683.277805] R13: 00000000ffffffff R14: ffffffff9e3f9d2d R15: ffff92dd8b0037c0
[896683.285123] FS:  00007f8398596740(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[896683.293404] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[896683.299333] CR2: 0000000000000018 CR3: 000000063100a000 CR4: 00000000001406e0
[896683.306659] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[896683.313976] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[896683.321294] Call Trace:
[896683.323932]  alloc_vmap_area+0x8d/0x380
[896683.327958]  __get_vm_area_node+0xb4/0x140
[896683.332237]  __vmalloc_node_range+0x73/0x280
[896683.336691]  ? _do_fork+0xe7/0x3f0
[896683.340277]  ? copy_process.part.34+0x11f/0x1c20
[896683.345089]  copy_process.part.34+0x61b/0x1c20
[896683.349715]  ? _do_fork+0xe7/0x3f0
[896683.353298]  ? dput+0x34/0x250
[896683.356536]  _do_fork+0xe7/0x3f0
[896683.359953]  ? ____fput+0xe/0x10
[896683.363369]  ? task_work_run+0x83/0xa0
[896683.367304]  SyS_clone+0x19/0x20
[896683.370717]  do_syscall_64+0x5b/0xc0
[896683.374478]  entry_SYSCALL64_slow_path+0x25/0x25
[896683.379281] RIP: 0033:0x7f83960e03a1
[896683.383043] RSP: 002b:00007ffd47e8b288 EFLAGS: 00000202 ORIG_RAX: 0000000000000038
[896683.390798] RAX: ffffffffffffffda RBX: 00007f7fcc73f700 RCX: 00007f83960e03a1
[896683.398119] RDX: 00007f7fcc73f9d0 RSI: 00007f7fcc73efb0 RDI: 00000000003d0f00
[896683.405445] RBP: 0000000001da4a40 R08: 00007f7fcc73f700 R09: 00007f7fcc73f700
[896683.412776] R10: 00007f7fcc73f9d0 R11: 0000000000000202 R12: 0000000000000000
[896683.420106] R13: 00007ffd47e8b33f R14: 0000000000041000 R15: 00000000041ff5b0
[896683.427437] Code: 49 63 51 1c 4c 89 d7 31 f6 4c 89 4d c0 4c 89 55 c8 e8 ae be 23 00 4c 8b 4d c0 4c 8b 55 c8 eb 33 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 61 ff
[896683.446529] RIP: kmem_cache_alloc_node_trace+0xd7/0x1d0 RSP: ffffbb452dd03c28
[896683.453863] CR2: 0000000000000018
[896683.457480] ---[ end trace 4de68d0227766245 ]---

[896686.675733] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[896686.683803] IP: __kmalloc+0xbc/0x200
[896686.687566] PGD b1f993067
[896686.687567] PUD 6d76be067
[896686.690460] PMD 0
[896686.693351]
[896686.697226] Oops: 0000 [#5] SMP
[896686.700552] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[896686.758339] CPU: 5 PID: 1792 Comm: thread.rb:70 Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[896686.768531] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[896686.777935] task: ffff92dc81210000 task.stack: ffffbb45116fc000
[896686.784043] RIP: 0010:__kmalloc+0xbc/0x200
[896686.788323] RSP: 0018:ffffbb45116ffbe0 EFLAGS: 00010206
[896686.793732] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[896686.801053] RDX: 0000000090eebb29 RSI: 0000000000000000 RDI: 000000000001c6a0
[896686.808382] RBP: ffffbb45116ffc18 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[896686.815705] R10: 0000000000000018 R11: 0000000074656e64 R12: 00000000014080c0
[896686.823035] R13: 0000000000000044 R14: ffffffff9e4d407e R15: ffff92dd8b0037c0
[896686.830614] FS:  00007fd8677ff700(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[896686.838886] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[896686.844815] CR2: 0000000000000018 CR3: 00000006c8c6d000 CR4: 00000000001406e0
[896686.852139] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[896686.859458] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[896686.866778] Call Trace:
[896686.869414]  ? ext4fs_dirhash+0xc2/0x2b0
[896686.873525]  ext4_htree_store_dirent+0x3e/0x120
[896686.878242]  htree_dirblock_to_tree+0xf3/0x290
[896686.882871]  ? dput+0x34/0x250
[896686.886110]  ext4_htree_fill_tree+0xb5/0x320
[896686.890566]  ? kmem_cache_alloc_trace+0xdb/0x1c0
[896686.895368]  ext4_readdir+0x701/0xa20
[896686.899218]  iterate_dir+0x172/0x1a0
[896686.902977]  SyS_getdents+0x99/0x120
[896686.906738]  ? fillonedir+0x100/0x100
[896686.910588]  entry_SYSCALL_64_fastpath+0x1e/0xad
[896686.915391] RIP: 0033:0x7fd8793258eb
[896686.919151] RSP: 002b:00007fd8677fc9c0 EFLAGS: 00000206 ORIG_RAX: 000000000000004e
[896686.926906] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007fd8793258eb
[896686.934228] RDX: 0000000000008000 RSI: 00007fd8674a5870 RDI: 0000000000000107
[896686.941545] RBP: 00007fd878e21000 R08: 0000000000000000 R09: 0000000000000001
[896686.948865] R10: 0000000000000000 R11: 0000000000000206 R12: 00007fd878e21010
[896686.956183] R13: 00000000000004ed R14: 00007fd8677ff630 R15: 0000000000000002
[896686.963503] Code: 08 65 4c 03 05 56 cd be 61 49 83 78 10 00 4d 8b 10 0f 84 d5 00 00 00 4d 85 d2 0f 84 cc 00 00 00 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[896686.982568] RIP: __kmalloc+0xbc/0x200 RSP: ffffbb45116ffbe0
[896686.988324] CR2: 0000000000000018
[896686.991913] ---[ end trace 4de68d0227766246 ]---

[896925.519989] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[896925.528065] IP: __kmalloc+0xbc/0x200
[896925.531828] PGD b1f993067
[896925.531830] PUD 6d76be067
[896925.534721] PMD 0
[896925.537613]
[896925.541492] Oops: 0000 [#6] SMP
[896925.544825] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[896925.602626] CPU: 5 PID: 649 Comm: jbd2/sda1-8 Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[896925.612639] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[896925.622579] task: ffff92dd7fd68000 task.stack: ffffbb4506f84000
[896925.628697] RIP: 0010:__kmalloc+0xbc/0x200
[896925.632984] RSP: 0018:ffffbb4506f878f8 EFLAGS: 00010206
[896925.638395] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[896925.645723] RDX: 0000000090eebb29 RSI: 0000000000000000 RDI: 000000000001c6a0
[896925.653044] RBP: ffffbb4506f87930 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[896925.660365] R10: 0000000000000018 R11: 0000000000000000 R12: 0000000001408040
[896925.667686] R13: 0000000000000060 R14: ffffffff9e509751 R15: ffff92dd8b0037c0
[896925.675008] FS:  0000000000000000(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[896925.683285] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[896925.689215] CR2: 0000000000000018 CR3: 00000006c8c6d000 CR4: 00000000001406e0
[896925.696542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[896925.703870] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[896925.711196] Call Trace:
[896925.713839]  ext4_find_extent+0x1f1/0x2f0
[896925.718034]  ext4_ext_map_blocks+0x78/0x1c30
[896925.722492]  ? kvm_sched_clock_read+0x1e/0x30
[896925.728426]  ? pick_next_task_fair+0x35f/0x4d0
[896925.734446]  ? __switch_to+0x23c/0x530
[896925.739773]  ext4_map_blocks+0x40a/0x5f0
[896925.744492]  ? bit_wait+0x60/0x60
[896925.748864]  _ext4_get_block+0x92/0x100
[896925.754461]  ? __slab_free+0x9a/0x2d0
[896925.759700]  ext4_get_block+0x16/0x20
[896925.764610]  generic_block_bmap+0x4e/0x70
[896925.769069]  ? kmem_cache_free+0x1cd/0x1e0
[896925.773351]  ext4_bmap+0x7d/0xe0
[896925.776765]  bmap+0x1c/0x30
[896925.779744]  jbd2_journal_bmap+0x2b/0x80
[896925.783855]  jbd2_journal_next_log_block+0x6b/0x80
[896925.788831]  jbd2_journal_get_descriptor_buffer+0x38/0xe0
[896925.794416]  jbd2_journal_commit_transaction+0x9da/0x17f0
[896925.800004]  ? update_curr+0xf3/0x180
[896925.803852]  ? dequeue_task_fair+0x4ee/0xb20
[896925.808309]  ? try_to_del_timer_sync+0x5a/0x80
[896925.812940]  kjournald2+0xca/0x250
[896925.816531]  ? wake_atomic_t_function+0x60/0x60
[896925.821250]  kthread+0x109/0x140
[896925.824674]  ? commit_timeout+0x10/0x10
[896925.828744]  ? kthread_create_on_node+0x60/0x60
[896925.833468]  ret_from_fork+0x2c/0x40
[896925.837229] Code: 08 65 4c 03 05 56 cd be 61 49 83 78 10 00 4d 8b 10 0f 84 d5 00 00 00 4d 85 d2 0f 84 cc 00 00 00 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[896925.856296] RIP: __kmalloc+0xbc/0x200 RSP: ffffbb4506f878f8
[896925.862062] CR2: 0000000000000018
[896925.865561] ---[ end trace 4de68d0227766247 ]---

[897265.894190] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[897265.902259] IP: kmem_cache_alloc_node_trace+0xd7/0x1d0
[897265.907582] PGD cf2fad067
[897265.907583] PUD cf32f5067
[897265.910474] PMD 0
[897265.913369]
[897265.917249] Oops: 0000 [#7] SMP
[897265.920616] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[897265.978416] CPU: 5 PID: 1982 Comm: google_accounts Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[897265.988882] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[897265.998286] task: ffff92dd6b85d7c0 task.stack: ffffbb45077d8000
[897266.004399] RIP: 0010:kmem_cache_alloc_node_trace+0xd7/0x1d0
[897266.010244] RSP: 0018:ffffbb45077dbc28 EFLAGS: 00010246
[897266.015656] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[897266.022980] RDX: 0000000090eebb29 RSI: 00000000014000c0 RDI: 000000000001c6a0
[897266.030304] RBP: ffffbb45077dbc70 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[897266.037628] R10: 0000000000000018 R11: ffff92dd6b85d7c0 R12: 00000000014000c0
[897266.044947] R13: 00000000ffffffff R14: ffffffff9e3f9d2d R15: ffff92dd8b0037c0
[897266.052272] FS:  00007faa3c321700(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[897266.060544] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[897266.066475] CR2: 0000000000000018 CR3: 0000000cf2d8d000 CR4: 00000000001406e0
[897266.073797] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[897266.081115] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[897266.088433] Call Trace:
[897266.091084]  alloc_vmap_area+0x8d/0x380
[897266.095110]  __get_vm_area_node+0xb4/0x140
[897266.099397]  __vmalloc_node_range+0x73/0x280
[897266.103865]  ? _do_fork+0xe7/0x3f0
[897266.107454]  ? copy_process.part.34+0x11f/0x1c20
[897266.112256]  copy_process.part.34+0x61b/0x1c20
[897266.116886]  ? _do_fork+0xe7/0x3f0
[897266.120476]  ? do_wp_page+0x109/0x5d0
[897266.124322]  ? handle_mm_fault+0x86b/0x1270
[897266.128694]  ? kmem_cache_alloc+0xd7/0x1b0
[897266.132978]  _do_fork+0xe7/0x3f0
[897266.136396]  ? __do_page_fault+0x265/0x4e0
[897266.140675]  SyS_clone+0x19/0x20
[897266.144113]  do_syscall_64+0x5b/0xc0
[897266.147897]  entry_SYSCALL64_slow_path+0x25/0x25
[897266.152700] RIP: 0033:0x7faa3bbed41a
[897266.156486] RSP: 002b:00007ffcadee05b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[897266.164246] RAX: ffffffffffffffda RBX: 00007ffcadee05b0 RCX: 00007faa3bbed41a
[897266.171571] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[897266.178893] RBP: 00007ffcadee0600 R08: 00000000000007be R09: 00007faa3c321700
[897266.186212] R10: 00007faa3c3219d0 R11: 0000000000000246 R12: 00000000000007be
[897266.193531] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[897266.200851] Code: 49 63 51 1c 4c 89 d7 31 f6 4c 89 4d c0 4c 89 55 c8 e8 ae be 23 00 4c 8b 4d c0 4c 8b 55 c8 eb 33 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 61 ff
[897266.219914] RIP: kmem_cache_alloc_node_trace+0xd7/0x1d0 RSP: ffffbb45077dbc28
[897266.227244] CR2: 0000000000000018
[897266.230844] ---[ end trace 4de68d0227766248 ]---

[897266.237052] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[897266.245200] IP: kmem_cache_alloc_trace+0x7b/0x1c0
[897266.250094] PGD 0
[897266.250095]
[897266.253978] Oops: 0000 [#8] SMP
[897266.257310] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[897266.316492] CPU: 5 PID: 4102 Comm: kworker/5:2 Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[897266.326596] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[897266.336013] Workqueue: events cgroup_release_agent
[897266.341002] task: ffff92d1282aba80 task.stack: ffffbb451b728000
[897266.348506] RIP: 0010:kmem_cache_alloc_trace+0x7b/0x1c0
[897266.353922] RSP: 0018:ffffbb451b72bd80 EFLAGS: 00010206
[897266.360638] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[897266.367976] RDX: 0000000090eebb29 RSI: 00000000014080c0 RDI: 000000000001c6a0
[897266.375305] RBP: ffffbb451b72bdc0 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[897266.384011] R10: 0000000000000018 R11: 000000000001d100 R12: 00000000014080c0
[897266.391334] R13: ffffffff9e29f05e R14: 0000000000000001 R15: ffff92dd8b0037c0
[897266.398655] FS:  0000000000000000(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[897266.406932] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[897266.412865] CR2: 0000000000000018 CR3: 000000097f609000 CR4: 00000000001406e0
[897266.420189] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[897266.427511] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[897266.434834] Call Trace:
[897266.437478]  call_usermodehelper+0x3e/0xb0
[897266.441771]  cgroup_release_agent+0x136/0x140
[897266.447703]  process_one_work+0x16b/0x4a0
[897266.451903]  worker_thread+0x4b/0x500
[897266.457057]  kthread+0x109/0x140
[897266.460473]  ? process_one_work+0x4a0/0x4a0
[897266.464845]  ? kthread_create_on_node+0x60/0x60
[897266.469567]  ret_from_fork+0x2c/0x40
[897266.473331] Code: 08 65 4c 03 05 77 e3 be 61 49 83 78 10 00 4d 8b 10 0f 84 f0 00 00 00 4d 85 d2 0f 84 e7 00 00 00 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[897266.493776] RIP: kmem_cache_alloc_trace+0x7b/0x1c0 RSP: ffffbb451b72bd80
[897266.500665] CR2: 0000000000000018
[897266.505472] ---[ end trace 4de68d0227766249 ]---

[897570.084437] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[897570.092505] IP: __kmalloc+0xbc/0x200
[897570.096266] PGD 0
[897570.096266]
[897570.100141] Oops: 0000 [#9] SMP
[897570.103467] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[897570.161259] CPU: 5 PID: 25246 Comm: release-upgrade Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[897570.171790] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[897570.181190] task: ffff92dd746fd7c0 task.stack: ffffbb45100b4000
[897570.187295] RIP: 0010:__kmalloc+0xbc/0x200
[897570.191574] RSP: 0018:ffffbb45100b77f8 EFLAGS: 00010206
[897570.196988] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[897570.204308] RDX: 0000000090eebb29 RSI: 0000000000000000 RDI: 000000000001c6a0
[897570.211626] RBP: ffffbb45100b7830 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[897570.218943] R10: 0000000000000018 R11: 0000000000000040 R12: 0000000001408040
[897570.226265] R13: 0000000000000060 R14: ffffffff9e509751 R15: ffff92dd8b0037c0
[897570.233584] FS:  00007fa86cd10700(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[897570.241856] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[897570.247786] CR2: 0000000000000018 CR3: 0000000b39b53000 CR4: 00000000001406e0
[897570.255104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[897570.262420] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[897570.269737] Call Trace:
[897570.272376]  ext4_find_extent+0x1f1/0x2f0
[897570.276571]  ext4_ext_map_blocks+0x78/0x1c30
[897570.281027]  ? count_shadow_nodes+0xa0/0xa0
[897570.285393]  ? __radix_tree_replace+0xf7/0x100
[897570.290019]  ? mem_cgroup_commit_charge+0x7e/0x4d0
[897570.295000]  ? page_cache_tree_insert+0xad/0x110
[897570.299803]  ext4_map_blocks+0x40a/0x5f0
[897570.303913]  ext4_mpage_readpages+0x365/0x9b0
[897570.308453]  ext4_readpages+0x36/0x40
[897570.312302]  __do_page_cache_readahead+0x19a/0x270
[897570.317277]  ondemand_readahead+0x178/0x2a0
[897570.321645]  page_cache_sync_readahead+0x2e/0x50
[897570.326447]  generic_file_read_iter+0x6ab/0x900
[897570.331161]  ext4_file_read_iter+0x37/0xb0
[897570.335441]  new_sync_read+0xd0/0x120
[897570.339287]  __vfs_read+0x26/0x40
[897570.342785]  vfs_read+0x93/0x130
[897570.346199]  prepare_binprm+0x10c/0x1f0
[897570.350218]  do_execveat_common.isra.39+0x4b4/0x7a0
[897570.355277]  SyS_execve+0x3a/0x50
[897570.358779]  do_syscall_64+0x5b/0xc0
[897570.362540]  entry_SYSCALL64_slow_path+0x25/0x25
[897570.367340] RIP: 0033:0x7fa86c7f8777
[897570.371096] RSP: 002b:00007fff21470028 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
[897570.378848] RAX: ffffffffffffffda RBX: 00005630d473db50 RCX: 00007fa86c7f8777
[897570.386162] RDX: 00005630d473db38 RSI: 00005630d473db10 RDI: 00005630d473db50
[897570.393478] RBP: 00005630d473db10 R08: 000000000000000e R09: 0000000000000001
[897570.400794] R10: 0000000000000001 R11: 0000000000000246 R12: 00005630d473db38
[897570.408109] R13: 0000000000000008 R14: 0400002000000001 R15: 0000000000000001
[897570.415425] Code: 08 65 4c 03 05 56 cd be 61 49 83 78 10 00 4d 8b 10 0f 84 d5 00 00 00 4d 85 d2 0f 84 cc 00 00 00 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[897570.434481] RIP: __kmalloc+0xbc/0x200 RSP: ffffbb45100b77f8
[897570.440233] CR2: 0000000000000018
[897570.443822] ---[ end trace 4de68d022776624a ]---

[897570.504707] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[897570.513683] IP: __kmalloc+0xbc/0x200
[897570.517447] PGD 0
[897570.517448]
[897570.521326] Oops: 0000 [#10] SMP
[897570.524742] Modules linked in: ufs msdos xfs tcp_diag inet_diag binfmt_misc zfs(POE) zunicode(POE) zavl(PO) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables ppdev input_leds parport_pc parport pvpanic serio_raw ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd psmouse nvme virtio_net nvme_core virtio_scsi
[897570.583884] CPU: 5 PID: 25259 Comm: update-motd-fsc Tainted: P      D    OE   4.10.0-40-generic #44~16.04.1-Ubuntu
[897570.594419] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[897570.603825] task: ffff92dad81d57c0 task.stack: ffffbb45100e4000
[897570.609933] RIP: 0010:__kmalloc+0xbc/0x200
[897570.614225] RSP: 0018:ffffbb45100e77f8 EFLAGS: 00010206
[897570.620941] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000090eebb2a
[897570.628274] RDX: 0000000090eebb29 RSI: 0000000000000000 RDI: 000000000001c6a0
[897570.635599] RBP: ffffbb45100e7830 R08: ffff92ddbfd5c6a0 R09: ffff92dd8b0037c0
[897570.644308] R10: 0000000000000018 R11: 0000000000000000 R12: 0000000001408040
[897570.651717] R13: 0000000000000060 R14: ffffffff9e509751 R15: ffff92dd8b0037c0
[897570.660347] FS:  00007f6a9f2a2700(0000) GS:ffff92ddbfd40000(0000) knlGS:0000000000000000
[897570.668624] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[897570.675854] CR2: 0000000000000018 CR3: 0000000bb23d0000 CR4: 00000000001406e0
[897570.683177] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[897570.690498] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[897570.699117] Call Trace:
[897570.701756]  ext4_find_extent+0x1f1/0x2f0
[897570.705950]  ext4_ext_map_blocks+0x78/0x1c30
[897570.711822]  ? list_lru_del+0x59/0x120
[897570.717145]  ? __radix_tree_replace+0x4e/0x100
[897570.721781]  ? mem_cgroup_commit_charge+0x7e/0x4d0
[897570.726766]  ? page_cache_tree_insert+0xad/0x110
[897570.731580]  ext4_map_blocks+0x40a/0x5f0
[897570.735701]  ext4_mpage_readpages+0x365/0x9b0
[897570.741631]  ext4_readpages+0x36/0x40
[897570.745486]  __do_page_cache_readahead+0x19a/0x270
[897570.750465]  ondemand_readahead+0x178/0x2a0
[897570.754839]  page_cache_sync_readahead+0x2e/0x50
[897570.759644]  generic_file_read_iter+0x6ab/0x900
[897570.764362]  ext4_file_read_iter+0x37/0xb0
[897570.768646]  new_sync_read+0xd0/0x120
[897570.772498]  __vfs_read+0x26/0x40
[897570.776016]  vfs_read+0x93/0x130
[897570.779442]  prepare_binprm+0x10c/0x1f0
[897570.784854]  do_execveat_common.isra.39+0x4b4/0x7a0
[897570.789924]  SyS_execve+0x3a/0x50
[897570.794814]  do_syscall_64+0x5b/0xc0
[897570.798581]  entry_SYSCALL64_slow_path+0x25/0x25
[897570.803388] RIP: 0033:0x7f6a9ed8a777
[897570.807153] RSP: 002b:00007ffd8daeebc8 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
[897570.814910] RAX: ffffffffffffffda RBX: 00005577a5c9be98 RCX: 00007f6a9ed8a777
[897570.822244] RDX: 00005577a5c9be80 RSI: 00005577a5c9be40 RDI: 00005577a5c9be98
[897570.829567] RBP: 00005577a5c9be40 R08: 000000000000000f R09: 0000000000000001
[897570.836891] R10: 0000000000000001 R11: 0000000000000246 R12: 00005577a5c9be80
[897570.844213] R13: 0000000000000005 R14: 0400002000000001 R15: 0000000000000001
[897570.851539] Code: 08 65 4c 03 05 56 cd be 61 49 83 78 10 00 4d 8b 10 0f 84 d5 00 00 00 4d 85 d2 0f 84 cc 00 00 00 49 63 41 20 48 8d 4a 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[897570.871997] RIP: __kmalloc+0xbc/0x200 RSP: ffffbb45100e77f8
[897570.877759] CR2: 0000000000000018
[897570.881328] ---[ end trace 4de68d022776624b ]---
behlendorf commented 6 years ago

@twarberg thanks for reporting this issue. Based on the stacks you provided it appears that somehow one of the global kmalloc-* slabs (as shown by slabtop) has been damaged. This resulted in the BUG which both zfs and ext4 encountered as well as several other processes on the system.

Unfortunately anything running in kernel space could have potentially caused this issue and we don't have any clear evidence. All we know is that the l2arc was the first victim. After inspecting the l2arc code I don't see how it could have caused this, but since you have good suspicions let me suggest two things you could try to help us narrow this down.

1) If possible, disable the l2arc entirely in your environment to rule it out as a cause, or
2) Continue to use the l2arc and set the spl_kmem_cache_slab_limit module option to 0. This will move the memory allocations from the kernel slab implementation to the spl slab implementation, which should provide us some additional debug information if the issue repeats.
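For anyone trying option 2, here is a minimal sketch of how the current value could be checked and a persistent setting prepared. It assumes the SPL module exposes its parameters under /sys/module/spl/parameters (the usual sysfs layout for module parameters); the helper names are made up for illustration, and the file under /etc/modprobe.d/ that the printed line would go into (for example spl.conf) is likewise an assumption, not something stated in this thread.

```python
from pathlib import Path

# Assumed sysfs location of SPL module parameters on Linux.
PARAM_DIR = Path("/sys/module/spl/parameters")
PARAM = "spl_kmem_cache_slab_limit"


def read_module_param(name, param_dir=PARAM_DIR):
    """Return the current value of a module parameter, or None if unreadable."""
    path = Path(param_dir) / name
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, parameter missing, or no permission.
        return None


def modprobe_option_line(module, name, value):
    """Build an 'options' line suitable for a file under /etc/modprobe.d/."""
    return f"options {module} {name}={value}"


current = read_module_param(PARAM)
print(f"current {PARAM}: {current if current is not None else 'unavailable'}")
print("persistent setting:", modprobe_option_line("spl", PARAM, 0))
```

The printed options line only takes effect the next time the spl module is loaded, so a reboot or module reload would be needed after adding it.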

twarberg commented 6 years ago

@behlendorf Not sure what you mean by disabling l2arc. If I don't have the NVMe device attached as a cache on the pool[1], there are no issues (sorry if I was unclear on that). Unless I misunderstand, that covers no. 1?

I'll try out no. 2 but as described I probably won't have anything to report for 10 days or so.

Clarification on my suspicion:

The Postgres server crashed 3 times, approx. 1 week apart. The first time was one week after the server was provisioned. The second time, Google had replaced the boot kernel with a newer build version, and I updated to 0.7.2 hoping that would fix it. Between the 2nd and 3rd time I read an issue here that seemed similar to what I've seen; it suggested the problem was L2arc related, that a fix was coming in 0.7.3, and that disabling L2arc in the meantime should avoid it. When it happened the 3rd time I removed the cache drive and haven't had any issues since then (it's still on 0.7.2). I then decided to give it another try on 0.7.3 when building a new MySQL server, and after 10 days the crash above happened. This time I made sure I copied the console output above.

[1] The NVMe is still attached to both instances because it can't be removed when the instance is running.

behlendorf commented 6 years ago

@twarberg thanks for the clarification, definitely let us know if you observe it again.

twarberg commented 6 years ago

Been running for 24 days on 0.7.4 with spl_kmem_cache_slab_limit=0. Will upgrade to 0.7.5 without spl_kmem_cache_slab_limit=0 to verify whether it was 0.7.4 or spl_kmem_cache_slab_limit that made the difference.

adam-dej commented 6 years ago

I believe I'm also affected by this issue. My system information:

| Type | Version/Name |
| --- | --- |
| Distribution Name | CentOS |
| Distribution Version | 7.4 |
| Linux Kernel | 3.10.0-693.11.6.el7.x86_64 |
| Architecture | x86_64 |
| ZFS Version | 0.7.5-1 |
| SPL Version | 0.7.5-1 |

I have an NVMe SSD as L2ARC in my pool.

The issue manifests itself after 4 to 6 hours of running pgbench (in my case I was benchmarking storage options for KVM VMs) in on-disk heavy-contention mode. I have been able to reproduce this crash several times; each time it occurred within 6 hours.

dmesg output:

[39351.772947] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[39351.773005] IP: [<ffffffff811e14d4>] kmem_cache_alloc+0x74/0x1e0
[39351.773036] PGD 0 
[39351.773048] Oops: 0000 [#1] SMP 
[39351.773065] Modules linked in: cfg80211 rfkill xt_nat veth vhost_net vhost macvtap macvlan xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink tun xt_addrtype br_netfilter overlay(T) target_core_user uio target_core_pscsi target_core_file target_core_iblock nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi
[39351.773437]  ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core osst st ch ses enclosure zfs(POE) zunicode(POE) zavl(POE) icp(POE) intel_powerclamp coretemp kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd zcommon(POE) znvpair(POE) iTCO_wdt spl(OE) mpt2sas iTCO_vendor_support gpio_ich raid_class scsi_transport_sas sg i2c_i801 pcspkr i7core_edac shpchp ioatdma lpc_ich edac_core acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci mlx4_core drm libahci igb libata crct10dif_pclmul crct10dif_common crc32c_intel
[39351.773857]  serio_raw ptp nvme pps_core dca nvme_core i2c_algo_bit devlink i2c_core
[39351.773919] CPU: 22 PID: 767 Comm: arc_reclaim Tainted: P           OE  ------------ T 3.10.0-693.11.6.el7.x86_64 #1
[39351.773962] Hardware name: Supermicro X8DT3/X8DT3, BIOS 2.1     03/17/2012
[39351.773988] task: ffff88060e563f40 ti: ffff8800b8f64000 task.ti: ffff8800b8f64000
[39351.774015] RIP: 0010:[<ffffffff811e14d4>]  [<ffffffff811e14d4>] kmem_cache_alloc+0x74/0x1e0
[39351.774049] RSP: 0018:ffff8800b8f67c50  EFLAGS: 00010282
[39351.774069] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000e46a7bf
[39351.774095] RDX: 000000000e46a7be RSI: 0000000000004230 RDI: ffff88017fc03a00
[39351.774122] RBP: ffff8800b8f67c80 R08: 000000000001b920 R09: ffffffffc0553319
[39351.774148] R10: 0000000034b27301 R11: ffffea0014d2c980 R12: 0000000000000008
[39351.774174] R13: 0000000000004230 R14: ffff88017fc03a00 R15: ffff88017fc03a00
[39351.774200] FS:  0000000000000000(0000) GS:ffff880627580000(0000) knlGS:0000000000000000
[39351.774230] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[39351.774251] CR2: 0000000000000008 CR3: 0000000334814000 CR4: 00000000000227e0
[39351.774278] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[39351.774313] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[39351.774340] Call Trace:
[39351.774368]  [<ffffffffc0553319>] ? spl_kmem_cache_alloc+0x99/0x150 [spl]
[39351.774398]  [<ffffffffc0553319>] spl_kmem_cache_alloc+0x99/0x150 [spl]
[39351.774473]  [<ffffffffc07f8b21>] arc_hdr_realloc+0x31/0x260 [zfs]
[39351.774532]  [<ffffffffc07fdb76>] arc_evict_state+0x546/0x880 [zfs]
[39351.774556]  [<ffffffffc07fdf57>] arc_adjust_impl.constprop.33+0x37/0x50 [zfs]
[39351.774572]  [<ffffffffc07fe1ab>] arc_adjust+0x23b/0x4a0 [zfs]
[39351.774588]  [<ffffffffc07ff17d>] arc_reclaim_thread+0xad/0x290 [zfs]
[39351.774603]  [<ffffffffc07ff0d0>] ? arc_shrink+0xc0/0xc0 [zfs]
[39351.774610]  [<ffffffffc0553fa1>] thread_generic_wrapper+0x71/0x80 [spl]
[39351.774615]  [<ffffffffc0553f30>] ? __thread_exit+0x20/0x20 [spl]
[39351.774619]  [<ffffffff810b252f>] kthread+0xcf/0xe0
[39351.774620]  [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[39351.774623]  [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[39351.774624]  [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[39351.774635] Code: fc e2 7e 49 8b 50 08 4d 8b 20 49 8b 40 10 4d 85 e4 0f 84 20 01 00 00 48 85 c0 0f 84 17 01 00 00 49 63 46 20 48 8d 4a 01 4d 8b 06 <49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 ba 49 63 
[39351.774636] RIP  [<ffffffff811e14d4>] kmem_cache_alloc+0x74/0x1e0
[39351.774637]  RSP <ffff8800b8f67c50>
[39351.774637] CR2: 0000000000000008

After setting the spl_kmem_cache_slab_limit option to 0, the system was stable and the benchmark ran for at least 48 hours.

However, after some more time the system became unresponsive. I'm not sure whether it is related to this issue or not:

Stack trace of possibly unrelated issue

```
Jan 14 10:57:56.741418  kernel: ------------[ cut here ]------------
Jan 14 10:57:56.778146  kernel: WARNING: CPU: 2 PID: 1046 at lib/list_debug.c:36 __list_add+0x8a/0xc0
Jan 14 10:57:56.780472  kernel: list_add double add: new=ffff880149d1e1c8, prev=ffff880149d1e020, next=ffff880149d1e1c8.
Jan 14 10:57:56.780543  kernel: Modules linked in: xt_nat veth vhost_net vhost macvtap macvlan xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netlink xt_addrtype br_netfilter overlay(T) target_core_user uio target_core_pscsi target_core_file target_core_iblock nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi
Jan 14 10:57:56.788423  kernel:  ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core osst zfs(POE) zunicode(POE) zavl(POE) icp(POE) intel_powerclamp coretemp kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support gpio_ich zcommon(POE) znvpair(POE) ses enclosure spl(OE) st ch ioatdma pcspkr i2c_i801 sg lpc_ich i7core_edac edac_core shpchp acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mlx4_en mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx4_core ahci libahci drm igb mpt2sas libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw ptp pps_core
Jan 14 10:57:56.791461  kernel:  nvme raid_class dca nvme_core i2c_algo_bit scsi_transport_sas devlink i2c_core
Jan 14 10:57:56.791614  kernel: CPU: 2 PID: 1046 Comm: z_wr_iss Tainted: P           OE  ------------ T 3.10.0-693.11.6.el7.x86_64 #1
Jan 14 10:57:56.794159  kernel: Hardware name: Supermicro X8DT3/X8DT3, BIOS 2.1     03/17/2012
Jan 14 10:57:56.796269  kernel: Call Trace:
Jan 14 10:57:56.800152  kernel:  [] dump_stack+0x19/0x1b
Jan 14 10:57:56.800214  kernel:  [] __warn+0xd8/0x100
Jan 14 10:57:56.800275  kernel:  [] warn_slowpath_fmt+0x5f/0x80
Jan 14 10:57:56.803681  kernel:  [] ? sched_clock+0x9/0x10
Jan 14 10:57:56.805661  kernel:  [] __list_add+0x8a/0xc0
Jan 14 10:57:56.807628  kernel:  [] __spl_cache_flush+0xb5/0x150 [spl]
Jan 14 10:57:56.810723  kernel:  [] spl_cache_flush+0x36/0x50 [spl]
Jan 14 10:57:56.810784  kernel:  [] spl_kmem_cache_free+0x1c5/0x1e0 [spl]
Jan 14 10:57:56.810833  kernel:  [] arc_hdr_destroy+0x76/0x1c0 [zfs]
Jan 14 10:57:56.810885  kernel:  [] arc_freed+0x69/0xc0 [zfs]
Jan 14 10:57:56.810937  kernel:  [] zio_free_sync+0x45/0x140 [zfs]
Jan 14 10:57:56.812111  kernel:  [] ? dbuf_rele+0x36/0x40 [zfs]
Jan 14 10:57:56.812171  kernel:  [] ? dbuf_destroy+0x280/0x370 [zfs]
Jan 14 10:57:56.816312  kernel:  [] zio_free+0xab/0x110 [zfs]
Jan 14 10:57:56.818306  kernel:  [] dsl_free+0x11/0x20 [zfs]
Jan 14 10:57:56.818361  kernel:  [] dsl_dataset_block_kill+0x267/0x4c0 [zfs]
Jan 14 10:57:56.818429  kernel:  [] dbuf_write_done+0x15a/0x1a0 [zfs]
Jan 14 10:57:56.818491  kernel:  [] arc_write_done+0xa1/0x400 [zfs]
Jan 14 10:57:56.821424  kernel:  [] zio_done+0x321/0xcf0 [zfs]
Jan 14 10:57:56.824922  kernel:  [] ? mutex_unlock+0x1b/0x20
Jan 14 10:57:56.826754  kernel:  [] ? zio_ready+0x237/0x3f0 [zfs]
Jan 14 10:57:56.826813  kernel:  [] ? zio_write_compress+0x33e/0x6a0 [zfs]
Jan 14 10:57:56.826878  kernel:  [] zio_execute+0x9c/0x100 [zfs]
Jan 14 10:57:56.826935  kernel:  [] taskq_thread+0x2a7/0x4f0 [spl]
Jan 14 10:57:56.826993  kernel:  [] ? wake_up_state+0x20/0x20
Jan 14 10:57:56.828255  kernel:  [] ? zio_taskq_member.isra.7.constprop.10+0x80/0x80 [zfs]
Jan 14 10:57:56.828320  kernel:  [] ? taskq_thread_spawn+0x60/0x60 [spl]
Jan 14 10:57:56.828381  kernel:  [] kthread+0xcf/0xe0
Jan 14 10:57:56.828432  kernel:  [] ? insert_kthread_work+0x40/0x40
Jan 14 10:57:56.828491  kernel:  [] ret_from_fork+0x58/0x90
Jan 14 10:57:56.828552  kernel:  [] ? insert_kthread_work+0x40/0x40
Jan 14 10:57:56.829594  kernel: ---[ end trace 0ab1c41144b59ce1 ]---
Jan 14 11:38:22.902202  kernel: NMI watchdog: BUG: soft lockup - CPU#8 stuck for 22s! [l2arc_feed:832]
Jan 14 11:38:22.914494  kernel: Modules linked in: xt_nat veth vhost_net vhost macvtap macvlan xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netlink xt_addrtype br_netfilter overlay(T) target_core_user uio target_core_pscsi target_core_file target_core_iblock nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi
Jan 14 11:38:22.921368  kernel:  ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core osst zfs(POE) zunicode(POE) zavl(POE) icp(POE) intel_powerclamp coretemp kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support gpio_ich zcommon(POE) znvpair(POE) ses enclosure spl(OE) st ch ioatdma pcspkr i2c_i801 sg lpc_ich i7core_edac edac_core shpchp acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mlx4_en mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx4_core ahci libahci drm igb mpt2sas libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw ptp pps_core
Jan 14 11:38:22.922724  kernel:  nvme raid_class dca nvme_core i2c_algo_bit scsi_transport_sas devlink i2c_core
Jan 14 11:38:22.922774  kernel: CPU: 8 PID: 832 Comm: l2arc_feed Tainted: P        W  OE  ------------ T 3.10.0-693.11.6.el7.x86_64 #1
Jan 14 11:38:22.922826  kernel: Hardware name: Supermicro X8DT3/X8DT3, BIOS 2.1     03/17/2012
Jan 14 11:38:22.922876  kernel: task: ffff88062062af70 ti: ffff8806214b0000 task.ti: ffff8806214b0000
Jan 14 11:38:22.922922  kernel: RIP: 0010:[]  [] spl_slab_reclaim+0x17c/0x220 [spl]
Jan 14 11:38:22.922968  kernel: RSP: 0018:ffff8806214b3c48  EFLAGS: 00000246
Jan 14 11:38:22.923015  kernel: RAX: ffff880149d1e1c8 RBX: ffffffff816b9c6f RCX: ffff880149d1e1b0
Jan 14 11:38:22.923062  kernel: RDX: 0000000000000020 RSI: ffff880149d1e1b0 RDI: ffff880622f510b8
Jan 14 11:38:22.923106  kernel: RBP: ffff8806214b3ca8 R08: ffff880149d1f2c8 R09: 0000000000000000
Jan 14 11:38:22.923148  kernel: R10: ffff88060d5ca420 R11: ffff8801685a70b8 R12: ffffffff816b9bff
Jan 14 11:38:22.923213  kernel: R13: ffffffff816b9c06 R14: ffffffff816b9c0d R15: ffffffff816b9c14
Jan 14 11:38:22.923290  kernel: FS:  0000000000000000(0000) GS:ffff880627200000(0000) knlGS:0000000000000000
Jan 14 11:38:22.923367  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 14 11:38:22.925319  kernel: CR2: 0000000004452000 CR3: 00000000019fa000 CR4: 00000000000227e0
Jan 14 11:38:22.925366  kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 14 11:38:22.925432  kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 14 11:38:22.925505  kernel: Call Trace:
Jan 14 11:38:22.925566  kernel:  [] spl_kmem_cache_free+0x158/0x1e0 [spl]
Jan 14 11:38:22.925647  kernel:  [] arc_hdr_destroy+0x76/0x1c0 [zfs]
Jan 14 11:38:22.925722  kernel:  [] l2arc_evict+0x287/0x310 [zfs]
Jan 14 11:38:22.925786  kernel:  [] l2arc_feed_thread+0x3a4/0xb80 [zfs]
Jan 14 11:38:22.925840  kernel:  [] ? l2arc_evict+0x310/0x310 [zfs]
Jan 14 11:38:22.925891  kernel:  [] thread_generic_wrapper+0x71/0x80 [spl]
Jan 14 11:38:22.925942  kernel:  [] ? __thread_exit+0x20/0x20 [spl]
Jan 14 11:38:22.926022  kernel:  [] kthread+0xcf/0xe0
Jan 14 11:38:22.926103  kernel:  [] ? insert_kthread_work+0x40/0x40
Jan 14 11:38:22.926168  kernel:  [] ret_from_fork+0x58/0x90
Jan 14 11:38:22.926233  kernel:  [] ? insert_kthread_work+0x40/0x40
Jan 14 11:38:22.926291  kernel: Code: c1 02 48 d3 e0 49 89 c4 48 8b 4d c0 4c 8d 73 50 48 8b 01 4c 39 e9 48 8d 71 e8 4c 8d 78 e8 75 22 eb 3d 0f 1f 44 00 00 49 8b 47 18 <49> 8d 57 18 4c 39 ea 48 8d 48 e8 74 27 48 8b 53 50 4c 89 fe 49 
```

None of these happen if I remove the cache device from the pool. So I believe that the issue is still present in 0.7.5, and for me setting spl_kmem_cache_slab_limit to 0 makes the system more stable, but another (possibly unrelated) issue still manifests itself.

behlendorf commented 6 years ago

@twarberg @adam-dej would you mind posting the capacity of the cache device you were using and the total system memory?

twarberg commented 6 years ago

@behlendorf Sure thing.

MySQL system: 52GB mem, 375GB cache device
Postgres system: 118GB mem, 375GB cache device

behlendorf commented 6 years ago

Thanks, you might try keeping an eye on the l2_hdr_size as reported in /proc/spl/kstat/zfs/arcstats. The larger the l2arc device, the more memory it takes to manage it. It looks like your system has more than enough memory that this won't be a problem, but it is a thing to keep an eye on which might help explain this.
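A rough sketch of one way to watch that value, assuming arcstats uses the usual kstat text layout of header lines followed by "name type data" rows. The function names are hypothetical and only the l2_hdr_size field comes from the comment above.

```python
from pathlib import Path

ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text):
    """Parse kstat text into a dict, keeping rows of the form 'name type data'."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Header lines either have a different column count or non-numeric data.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def l2_hdr_mib(stats):
    """Memory used for L2ARC headers, in MiB (0 if the field is absent)."""
    return stats.get("l2_hdr_size", 0) / (1024 * 1024)


try:
    stats = parse_arcstats(ARCSTATS.read_text())
    print(f"l2_hdr_size: {l2_hdr_mib(stats):.1f} MiB")
except OSError:
    # The file only exists while the zfs module is loaded.
    print(f"{ARCSTATS} is not readable on this system")
```

Running this periodically (for example from cron) and comparing the result against total system memory would show whether header overhead grows as the cache device fills.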

twarberg commented 6 years ago

Haven't seen any issues for months. Will close for now.