Closed cdupont closed 7 years ago
Mongod keeps restarting. It appears that the volume /mnt/vol, mounted in the master VM from SIRIS, went read-only:
[203520.512206] INFO: task jbd2/vdb-8:618 blocked for more than 120 seconds.
[203520.513329] Not tainted 4.4.0-79-generic #100-Ubuntu
[203520.514302] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[203520.515506] jbd2/vdb-8 D ffff8802357afad8 0 618 2 0x00000000
[203520.515516] ffff8802357afad8 00000006343558c0 ffff880236238cc0 ffff880235700cc0
[203520.515528] ffff8802357b0000 ffff88023fd16dc0 7fffffffffffffff ffffffff8183d150
[203520.515530] ffff8802357afc30 ffff8802357afaf0 ffffffff8183c955 0000000000000000
[203520.515532] Call Trace:
[203520.515563] [<ffffffff8183d150>] ? bit_wait+0x60/0x60
[203520.515565] [<ffffffff8183c955>] schedule+0x35/0x80
[203520.515567] [<ffffffff8183faa5>] schedule_timeout+0x1b5/0x270
[203520.515581] [<ffffffff8106428e>] ? kvm_clock_get_cycles+0x1e/0x20
[203520.515583] [<ffffffff8106428e>] ? kvm_clock_get_cycles+0x1e/0x20
[203520.515590] [<ffffffff810f625c>] ? ktime_get+0x3c/0xb0
[203520.515592] [<ffffffff8183d150>] ? bit_wait+0x60/0x60
[203520.515594] [<ffffffff8183be84>] io_schedule_timeout+0xa4/0x110
[203520.515596] [<ffffffff8183d16b>] bit_wait_io+0x1b/0x70
[203520.515598] [<ffffffff8183ccfd>] __wait_on_bit+0x5d/0x90
[203520.515599] [<ffffffff8183d150>] ? bit_wait+0x60/0x60
[203520.515601] [<ffffffff8183cdb2>] out_of_line_wait_on_bit+0x82/0xb0
[203520.515610] [<ffffffff810c4370>] ? autoremove_wake_function+0x40/0x40
[203520.515621] [<ffffffff81246c52>] __wait_on_buffer+0x32/0x40
[203520.515627] [<ffffffff812ef728>] jbd2_journal_commit_transaction+0xf48/0x1870
[203520.515634] [<ffffffff810ed10e>] ? try_to_del_timer_sync+0x5e/0x90
[203520.515640] [<ffffffff812f3d4a>] kjournald2+0xca/0x250
[203520.515643] [<ffffffff810c4330>] ? wake_atomic_t_function+0x60/0x60
[203520.515645] [<ffffffff812f3c80>] ? commit_timeout+0x10/0x10
[203520.515672] [<ffffffff810a0c25>] kthread+0xe5/0x100
[203520.515676] [<ffffffff810a0b40>] ? kthread_create_on_node+0x1e0/0x1e0
[203520.515680] [<ffffffff81840e0f>] ret_from_fork+0x3f/0x70
[203520.515682] [<ffffffff810a0b40>] ? kthread_create_on_node+0x1e0/0x1e0
[203520.516963] INFO: task mongod:9574 blocked for more than 120 seconds.
[203520.517932] Not tainted 4.4.0-79-generic #100-Ubuntu
[203520.518772] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[203520.520200] mongod D ffff8801e7d03bb8 0 9574 9052 0x00000000
[203520.520210] ffff8801e7d03bb8 ffff8801e7d03ba8 ffff880236238cc0 ffff8801f2ec8000
[203520.520212] ffff8801e7d04000 ffff88023fd16dc0 7fffffffffffffff ffffffff8183d150
[203520.520214] ffff8801e7d03d18 ffff8801e7d03bd0 ffffffff8183c955 0000000000000000
[203520.520216] Call Trace:
[203520.520219] [<ffffffff8183d150>] ? bit_wait+0x60/0x60
[203520.520221] [<ffffffff8183c955>] schedule+0x35/0x80
[203520.520223] [<ffffffff8183faa5>] schedule_timeout+0x1b5/0x270
[203520.520234] [<ffffffff813cae86>] ? blk_flush_plug_list+0xd6/0x240
[203520.520237] [<ffffffff8106428e>] ? kvm_clock_get_cycles+0x1e/0x20
[203520.520238] [<ffffffff8183d150>] ? bit_wait+0x60/0x60
[203520.520240] [<ffffffff8183be84>] io_schedule_timeout+0xa4/0x110
[203520.520242] [<ffffffff8183d16b>] bit_wait_io+0x1b/0x70
[203520.520243] [<ffffffff8183ccfd>] __wait_on_bit+0x5d/0x90
[203520.520254] [<ffffffff8118e5cb>] wait_on_page_bit+0xcb/0xf0
[203520.520258] [<ffffffff810c4370>] ? autoremove_wake_function+0x40/0x40
[203520.520260] [<ffffffff8118e6e3>] __filemap_fdatawait_range+0xf3/0x160
[203520.520262] [<ffffffff81190581>] ? __filemap_fdatawrite_range+0xd1/0x100
[203520.520267] [<ffffffff8118e764>] filemap_fdatawait_range+0x14/0x30
[203520.520269] [<ffffffff811906cf>] filemap_write_and_wait_range+0x3f/0x70
[203520.520276] [<ffffffff81296461>] ext4_sync_file+0x101/0x350
[203520.520285] [<ffffffff812437cb>] vfs_fsync_range+0x4b/0xb0
[203520.520287] [<ffffffff8124388d>] do_fsync+0x3d/0x70
[203520.520291] [<ffffffff81243b43>] SyS_fdatasync+0x13/0x20
[203520.520293] [<ffffffff81840a72>] entry_SYSCALL_64_fastpath+0x16/0x71
[203520.520298] INFO: task mongod:9670 blocked for more than 120 seconds.
[203520.521269] Not tainted 4.4.0-79-generic #100-Ubuntu
[203520.522113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[203520.523259] mongod D ffff8801e7e138e8 0 9670 9052 0x00000000
[203520.523270] ffff8801e7e138e8 ffff88023423ac00 ffff880236238000 ffff880090de72c0
[203520.523271] ffff8801e7e14000 ffff88023fc96dc0 7fffffffffffffff ffffffff8183d150
[203520.523281] ffff8801e7e13a48 ffff8801e7e13900 ffffffff8183c955 0000000000000000
[203520.523283] Call Trace:
[203520.523287] [<ffffffff8183d150>] ? bit_wait+0x60/0x60
[203520.523288] [<ffffffff8183c955>] schedule+0x35/0x80
[203520.523290] [<ffffffff8183faa5>] schedule_timeout+0x1b5/0x270
[203520.523293] [<ffffffff810bd195>] ? update_sd_lb_stats+0x115/0x530
[203520.523295] [<ffffffff8106428e>] ? kvm_clock_get_cycles+0x1e/0x20
[203520.523297] [<ffffffff8183d150>] ? bit_wait+0x60/0x60
[203520.523298] [<ffffffff8183be84>] io_schedule_timeout+0xa4/0x110
[203520.523300] [<ffffffff8183d16b>] bit_wait_io+0x1b/0x70
[203520.523302] [<ffffffff8183ccfd>] __wait_on_bit+0x5d/0x90
[203520.523303] [<ffffffff8183d150>] ? bit_wait+0x60/0x60
[203520.523305] [<ffffffff8183cdb2>] out_of_line_wait_on_bit+0x82/0xb0
[203520.523307] [<ffffffff810c4370>] ? autoremove_wake_function+0x40/0x40
[203520.523309] [<ffffffff812ed205>] do_get_write_access+0x245/0x490
[203520.523312] [<ffffffff812ed4a1>] jbd2_journal_get_write_access+0x51/0x70
[203520.523318] [<ffffffff812d04ab>] __ext4_journal_get_write_access+0x3b/0x80
[203520.523320] [<ffffffff81297ff1>] __ext4_new_inode+0x531/0x13a0
[203520.523325] [<ffffffff812aa8e9>] ext4_create+0x119/0x1b0
[203520.523331] [<ffffffff8121b937>] vfs_create+0x127/0x190
[203520.523334] [<ffffffff8121eb2c>] path_openat+0x120c/0x1330
[203520.523336] [<ffffffff8121fe41>] do_filp_open+0x91/0x100
[203520.523341] [<ffffffff812268f3>] ? dput+0x153/0x220
[203520.523343] [<ffffffff8122d786>] ? __alloc_fd+0x46/0x190
[203520.523348] [<ffffffff8120e2f8>] do_sys_open+0x138/0x2a0
[203520.523351] [<ffffffff810f6579>] ? do_gettimeofday+0x29/0x90
[203520.523353] [<ffffffff8120e47e>] SyS_open+0x1e/0x20
[203520.523355] [<ffffffff81840a72>] entry_SYSCALL_64_fastpath+0x16/0x71
[203530.969871] blk_update_request: I/O error, dev vdb, sector 21311856
[203530.971329] Aborting journal on device vdb-8.
[203530.972092] EXT4-fs error (device vdb) in __ext4_new_inode:932: Journal has aborted
[203530.972151] EXT4-fs error (device vdb) in __ext4_new_inode:932: Journal has aborted
[203530.972197] EXT4-fs error (device vdb) in ext4_reserve_inode_write:5144: Journal has aborted
[203535.100989] blk_update_request: I/O error, dev vdb, sector 278968
[203535.102396] EXT4-fs warning (device vdb): ext4_end_bio:329: I/O error -5 writing to inode 32 (offset 0 size 0 starting block 34872)
[203535.102400] Buffer I/O error on device vdb, logical block 34871
[203535.103529] blk_update_request: I/O error, dev vdb, sector 295200
[203535.104679] EXT4-fs warning (device vdb): ext4_end_bio:329: I/O error -5 writing to inode 32 (offset 0 size 0 starting block 36901)
[203535.104682] Buffer I/O error on device vdb, logical block 36900
[203554.089909] EXT4-fs error (device vdb): ext4_journal_check_start:56: Detected aborted journal
[203554.095955] EXT4-fs (vdb): Remounting filesystem read-only
[203554.102506] EXT4-fs error (device vdb): ext4_journal_check_start:56: Detected aborted journal
[203554.119205] EXT4-fs error (device vdb): ext4_journal_check_start:56: Detected aborted journal
[203554.122343] EXT4-fs error (device vdb) in ext4_reserve_inode_write:5144: Journal has aborted
[203554.132027] EXT4-fs error (device vdb) in __ext4_new_inode:1121: Journal has aborted
[203554.134754] EXT4-fs error (device vdb): ext4_journal_check_start:56: Detected aborted journal
[203554.137333] EXT4-fs error (device vdb) in ext4_evict_inode:246: Journal has aborted
We 'fixed it' by unmounting and re-mounting the volume.
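Since a remount only helps after someone notices the problem, it may be useful to detect the read-only state automatically. The following is a minimal sketch, not something we run today: it assumes the standard Linux /proc/mounts layout (device, mount point, filesystem type, comma-separated options), and the function name is_mounted_read_only is made up for illustration.

```python
def is_mounted_read_only(mount_point, mounts_text):
    """Check the mount options of mount_point in /proc/mounts-style text.

    Returns True if the options contain 'ro', False if they do not,
    and None if the mount point does not appear at all.
    Hypothetical helper, not part of any existing tool.
    """
    state = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4:
            continue
        if fields[1] == mount_point:
            options = fields[3].split(",")
            # Keep scanning: a later entry shadows an earlier one
            state = "ro" in options
    return state
```

On the VM this could be called with the contents of /proc/mounts and "/mnt/vol" as the mount point; a True result would correspond to the "Remounting filesystem read-only" state seen in the kernel log. Mount points containing spaces are escaped in /proc/mounts and are not handled by this sketch.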