akiradeveloper / dm-writeboost

Log-structured Caching for Linux
GNU General Public License v2.0

kernel/syslog message flood when disks are dropped off #208

Open allentkw opened 4 years ago

allentkw commented 4 years ago

Hi,

I have been using your module and so far it has been working great. Thank you for your work on this.


An issue that I have encountered:

a. When an HDD (the cached block device) drops off (for example, the disk dies or someone pulls it out), the system is hit by a flood of kernel/syslog messages, which then fill up /var/log very quickly.

The host would then require a sysrq reboot due to /var/log being filled up with the following messages:

blk_partition_remap fail for partition 1

The setup that I have:

OS: Ubuntu 18.04.3 LTS
Kernel: 4.15.0-74-generic #84-Ubuntu SMP x86_64

lsblk (wb_sdi, wb_sdj are the writeboost devices and an LVM volume is created on top of each):

sdi                 8:128  0   1.8T  0 disk 
├─sdi1              8:129  0   1.8T  0 part 
│ └─wb_sdi        253:16   0   1.8T  0 dm   
│   └─vg_sdi-data 253:17   0   1.8T  0 lvm  
└─sdi2              8:130  0    10G  0 part 
sdj                 8:144  0   1.8T  0 disk 
├─sdj1              8:145  0   1.8T  0 part 
│ └─wb_sdk        253:18   0   1.8T  0 dm   
│   └─vg_sdk-data 253:20   0   1.8T  0 lvm  

writeboost settings:

writeback_threshold=100,sync_data_interval=3600

I am not sure whether this is the right place to raise this, or whether this behaviour is expected.

My guess is that the flood of kernel messages happens because, when a cached block device is dropped suddenly, writeboost keeps trying to flush data back to the failed device.

One solution I was thinking of is to suppress or filter out the "blk_partition_remap" messages so that /var/log at least doesn't fill up.
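
To illustrate the kind of filtering I mean, here is a minimal Python sketch. It is only an assumption of how such a filter could behave, not anything that dm-writeboost or the syslog daemon provides: the function name suppress_flood and the keep_first parameter are made up for this example. It passes the first matching line of each run through and replaces the rest with a single summary line.

```python
import re
from typing import Iterable, Iterator

# Pattern for the flooding message quoted earlier in this report.
NOISY = re.compile(r"blk_partition_remap fail for partition \d+")


def suppress_flood(lines: Iterable[str], keep_first: int = 1) -> Iterator[str]:
    """Yield log lines, keeping only the first `keep_first` noisy lines per run.

    A run is a sequence of consecutive lines matching NOISY. When a run ends,
    one summary line reports how many lines were dropped.
    (Hypothetical helper, for illustration only.)
    """
    seen = 0  # noisy lines seen in the current run
    for line in lines:
        if NOISY.search(line):
            seen += 1
            if seen <= keep_first:
                yield line
            continue
        # A non-matching line ends the current run, if there was one.
        if seen > keep_first:
            yield f"[suppressed {seen - keep_first} similar messages]"
        seen = 0
        yield line
    # Flush the summary if the input ended in the middle of a run.
    if seen > keep_first:
        yield f"[suppressed {seen - keep_first} similar messages]"
```

In practice this would more likely be done in the syslog daemon's own configuration rather than in a separate script, but the idea is the same: keep one copy of the message and drop the repeats.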

Any ideas ?

Thanks