Closed: JvdW closed this issue 8 years ago
Is there a stack dump on tty1 or in dmesg?
On 13-11-2016 16:00, Petros Koutoupis wrote:
Is there a stack dump on tty1 or in dmesg?
No, the problem is that the VM becomes completely unresponsive, CPU usage goes to 100%, and the only way to get back in is to power it down, meaning loss of all unwritten data. So the messages log looks like the attached file. Logged in this morning and did:
- rapiddisk --attach 256 # VM has 2G RAM and is idle
- rapiddisk --cache-map rd0 /dev/vda1
- mount /dev/mapper/rc-wt_vda1 /cache
- cp -pr ./geoserver /cache # copies quite a few files, but I don't know how much
- power off
- restart
Anything I can do to debug this?
Joop
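For reference, the steps above as plain commands (a sketch only; the 256 MB size and the paths are the ones quoted above, and rc-wt_vda1 is the write-through mapping name rapiddisk reports later in this thread):

rapiddisk --attach 256                  # create a 256 MB RAM disk, which appears as rd0
rapiddisk --cache-map rd0 /dev/vda1     # map a write-through cache over /dev/vda1
mount /dev/mapper/rc-wt_vda1 /cache
cp -pr ./geoserver /cache               # heavy copy; this is where the VM locks up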
Nov 14 07:47:03 rpmrepo01 puppet-agent[27875]: Finished catalog run in 9.38 seconds
Nov 14 08:17:07 rpmrepo01 puppet-agent[29283]: Finished catalog run in 11.33 seconds
Nov 14 08:44:58 rpmrepo01 kernel: rapiddisk: Attached rd0 of 268435456 bytes in size.
Nov 14 08:46:23 rpmrepo01 kernel: device-mapper: rapiddisk-cache: Allocate 512KB (8B per) mem for 65536-entry cache(capacity:256MB, associativity:512, block size:8 sectors(4KB))
Nov 14 08:46:23 rpmrepo01 kernel: ------------[ cut here ]------------
Nov 14 08:46:23 rpmrepo01 kernel: WARNING: at kernel/softirq.c:159 local_bh_enable_ip+0x7b/0xa0() (Not tainted)
Nov 14 08:46:23 rpmrepo01 kernel: Hardware name: oVirt Node
Nov 14 08:46:23 rpmrepo01 kernel: Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ext2 rapiddisk_cache(U) rapiddisk(U) dm_crypt virtio_console virtio_net i2c_piix4 i2c_core sg ext4 jbd2 mbcache virtio_blk sr_mod cdrom virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Nov 14 08:46:23 rpmrepo01 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-642.6.2.el6.x86_64 #1
Nov 14 08:46:23 rpmrepo01 kernel: Call Trace:
Nov 14 08:46:23 rpmrepo01 kernel:
On 13-11-2016 16:00, Petros Koutoupis wrote:
Is there a stack dump on tty1 or in dmesg?
So, right after I sent the last message I had an idea: what if it's related to how the disk is presented to the VM? There are two ways; the default is VirtIO, which gives you a /dev/vda, and changing that to VirtIO-SCSI gives you a /dev/sda. I just did that and no problems :-) It looks like the driver implementation of VirtIO is 'buggy'. The bad news is that this is the default, and I don't know if and what the differences are between VirtIO and VirtIO-SCSI. I will ask around on IRC and the oVirt forum about that.
Joop
Thank you very much for all of this and for your initial investigation. Note, there is nothing different being done in the RapidDisk code for SCSI devices. The code is designed to work with all block devices (loop, NVMe, etc.). I will need to recreate your setup locally over here to capture a kdump (for crash debugging). But in the meantime, can you see if there is a difference between the following:
- Using Ext4 instead of XFS
- Running fio on the /dev/vda volume without RapidDisk/RapidDisk-Cache
Also, what is the output of the VM's /proc/partitions? I want to make sure that the cache mapping size is identical to the /dev/vda size.
Thank you again.
On 14-11-2016 14:54, Petros Koutoupis wrote:
Thank you very much for all of this and for your initial investigation. Note, there is nothing different being done in the RapidDisk code for SCSI devices. The code is designed to work with all block devices (loop, NVMe, etc.). I will need to recreate your setup locally over here, to capture a kdump (for crash debugging). But in the meanwhile, can you see if there is a difference between the following:
- Using Ext4 instead of XFS
Same.

root@rpmrepo01:/ # ll /mnt/rc_vdb1/
total 1536020
drwx------. 2 root root      16384 Nov 14 15:59 lost+found
-rw-r--r--. 1 root root 1572864000 Nov 14 16:14 ssd.test.file
root@rpmrepo01:/ # fio /usr/share/doc/fio-2.0.13/examples/ssd-test
seq-read: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=4
rand-read: (g=1): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=4
seq-write: (g=2): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=4
rand-write: (g=3): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=4
fio-2.0.13
Starting 4 processes
- Running fio on the /dev/vda volume without RapidDisk/RapidDisk-Cache
No problem running fio on disks that are either vda or sda, without rapiddisk caching.
Also, what is the output of the VM's /proc/partitions? I want to make sure that the cache mapping size is identical to the /dev/vda size.
major minor  #blocks  name
 252     0   47185920 vda
 252     1     204800 vda1
 252     2   46980096 vda2
   8     0    5242880 sda
   8     1    5241856 sda1
 253     0    6144000 dm-0
 253     1   40833024 dm-1
 253     2    5241856 dm-2
 252    16    2097152 vdb
 252    17    2096128 vdb1
 253     3    2096128 dm-3
root@rpmrepo01:~ # fdisk -luc /dev/vdb
Disk /dev/vdb: 2147 MB, 2147483648 bytes
1 heads, 16 sectors/track, 262144 cylinders, total 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc5e829de

   Device Boot      Start         End      Blocks   Id  System
/dev/vdb1            2048     4194303     2096128   83  Linux
Added a test disk of 2 GB to the VM using VirtIO, so --> /dev/vdb
fdisk -uc /dev/vdb   (create 1 primary partition spanning sector 2048 to the end)
mkfs.ext4 /dev/vdb1
rapiddisk --attach 64
rapiddisk --cache-map rd1 /dev/vdb1
mount /dev/mapper/rc-wt_vdb1 /mnt/rc-vdb1
Run fio and it will make the VM unresponsive and crash it, without leaving a trace after restarting it :-(
Now I changed the disk from VirtIO to VirtIO-SCSI --> /dev/sdb
rapiddisk --attach 64
rapiddisk --cache-map rd0 /dev/sdb1
mount /dev/mapper/rc-wt_sdb1 /mnt/rc-vdb1
Run fio and no problems.
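As a side note on the /proc/partitions question above, the backing partition and the cache mapping can also be compared directly (a sketch using the standard blockdev and dmsetup tools; rc-wt_vdb1 is the mapping name from the VirtIO test above):

blockdev --getsz /dev/vdb1        # length of the backing partition in 512-byte sectors
dmsetup table rc-wt_vdb1          # the second field is the mapping length in sectors; the two values should match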
Part of the output from virsh -r dumpxml rpmrepo01 so that you can see what's passed to libvirt:
<disk type='file' device='disk' snapshot='no'>
<driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
<source file='/rhev/data-center/mnt/nfs01.stor.nieuwland.nl:_export_ovirtproductie/fbaffdc6-9d4d-454f-975b-9516faf927c6/images/0450f5f3-44fa-46ea-8bd6-9b820d0208bf/0addfcf7-8867-45c2-bd07-aa35afe64b85'>
<seclabel model='selinux' relabel='no'/>
</source>
<target dev='vda' bus='virtio'/>
<serial>0450f5f3-44fa-46ea-8bd6-9b820d0208bf</serial>
<boot order='1'/>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
<disk type='file' device='disk' snapshot='no'>
<driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
<source file='/rhev/data-center/mnt/nfs02.nieuwland.nl:_nfs_lv__ovirt__infradata/7f97cf40-7733-40f0-bfaf-5f017f36acac/images/3d3e5285-230a-4843-bf55-c3cd85f7297a/5ecc4d0d-d78d-4414-b71a-c715910fa58d'>
<seclabel model='selinux' relabel='no'/>
</source>
<target dev='sda' bus='scsi'/>
<serial>3d3e5285-230a-4843-bf55-c3cd85f7297a</serial>
<alias name='scsi0-0-0-0'/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<controller type='scsi' index='0' model='virtio-scsi'>
<alias name='scsi0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</controller>
<controller type='usb' index='0'>
<alias name='usb0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
</controller>
<controller type='ide' index='0'>
<alias name='ide0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<controller type='virtio-serial' index='0'>
<alias name='virtio-serial0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</controller>
The arguments passed to qemu by oVirt:

qemu 4653 1 20 16:18 ? 00:03:10 /usr/libexec/qemu-kvm -name rpmrepo01 -S -M rhel6.5.0 -cpu Westmere -enable-kvm -m 2048 -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -uuid 69477ca3-1d45-4c5a-9d6e-0eeb4ff52286 -smbios type=1,manufacturer=oVirt,product=oVirt Node,version=6-5.el6.centos.11.2,serial=34333336-3730-5A43-3232-3038304E3452,uuid=69477ca3-1d45-4c5a-9d6e-0eeb4ff52286 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rpmrepo01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2016-11-14T15:18:28,driftfix=slew -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/rhev/data-center/mnt/nfs01.stor.nieuwland.nl:_export_ovirtproductie/fbaffdc6-9d4d-454f-975b-9516faf927c6/images/0450f5f3-44fa-46ea-8bd6-9b820d0208bf/0addfcf7-8867-45c2-bd07-aa35afe64b85,if=none,id=drive-virtio-disk0,format=raw,serial=0450f5f3-44fa-46ea-8bd6-9b820d0208bf,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/rhev/data-center/mnt/nfs01.stor.nieuwland.nl:_export_ovirtproductie/fbaffdc6-9d4d-454f-975b-9516faf927c6/images/ba96e053-bac8-4b31-af40-8d9105dfc053/4aa4682c-82da-4fe1-a44e-0205b05ae78e,if=none,id=drive-virtio-disk1,format=raw,serial=ba96e053-bac8-4b31-af40-8d9105dfc053,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/rhev/data-center/mnt/nfs02.nieuwland.nl:_nfs_lv__ovirt__infradata/7f97cf40-7733-40f0-bfaf-5f017f36acac/images/3d3e5285-230a-4843-bf55-c3cd85f7297a/5ecc4d0d-d78d-4414-b71a-c715910fa58d,if=none,id=drive-scsi0-0-0-0,format=raw,serial=3d3e5285-230a-4843-bf55-c3cd85f7297a,cache=none,werror=stop,rerror=stop,aio=threads -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 -netdev tap,fd=34,id=hostnet0,vhost=on,vhostfd=39 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:10:10:49,bus=pci.0,addr=0x3 -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/69477ca3-1d45-4c5a-9d6e-0eeb4ff52286.com.redhat.rhevm.vdsm,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/69477ca3-1d45-4c5a-9d6e-0eeb4ff52286.org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel2,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0 -spice port=5911,tls-port=5914,addr=0,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on -k en-us -vga qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=33554432
Joop
Thank you for all of this. Again, very soon I will stand up a VM and attempt to reproduce this. I have one additional request: can you rerun the test with a dm-linear mapping on top of the vda device? http://tldp.org/HOWTO/LVM-HOWTO/createlv.html
Based on the stack dump you provided above, it would seem that the part showcasing the problem is in the asynchronous callback handler of rapiddisk-cache. Now, with dm-linear, all I/O operations are synchronous and I want to rule out general device mapper operation.
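For reference, a minimal sketch of a plain dm-linear mapping over the test partition, using dmsetup directly rather than the LVM steps in the linked HOWTO (the /dev/vdb1 device, mount point, and fio job file are the ones used elsewhere in this thread; lin0 is just a placeholder name):

SECTORS=$(blockdev --getsz /dev/vdb1)                          # partition length in 512-byte sectors
echo "0 ${SECTORS} linear /dev/vdb1 0" | dmsetup create lin0   # single linear target spanning the partition
mkfs.ext4 /dev/mapper/lin0
mount /dev/mapper/lin0 /mnt/rc_vdb1
fio /usr/share/doc/fio-2.0.13/examples/ssd-test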
On 14-11-2016 22:27, Petros Koutoupis wrote:
Thank you for all of this. Again, very soon I will stand up a VM and attempt to reproduce this. I have one additional request: can you rerun the test with a dm-linear mapping on top of the vda device? http://tldp.org/HOWTO/LVM-HOWTO/createlv.html
Based on the stack dump you provided above, it would seem that the part showcasing the problem is in the asynchronous callback handler of rapiddisk-cache. Now, with dm-linear, all I/O operations are synchronous and I want to rule out general device mapper operation.
Done as asked, but the same problem: after starting the test the VM becomes unresponsive, after about 20-30 seconds the session disconnects, and then I see 100% CPU usage in oVirt and can only hard power off the VM.
Log of what I did is attached.
Joop
root@rpmrepo01:/ # umount /mnt/rc_vdb1
root@rpmrepo01:/ # fdisk -l

Disk /dev/vda: 48.3 GB, 48318382080 bytes
16 heads, 63 sectors/track, 93622 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0001795e

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1   *           3         409      204800   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/vda2             409       93623    46980096   83  Linux
Partition 2 does not end on cylinder boundary.

Disk /dev/sda: 5368 MB, 5368709120 bytes
9 heads, 40 sectors/track, 29127 cylinders
Units = cylinders of 360 * 512 = 184320 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xd4cd3997

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               6       29128     5241856   83  Linux

Disk /dev/mapper/vg_rpmrepo01-lv_root: 6291 MB, 6291456000 bytes
255 heads, 63 sectors/track, 764 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/mapper/vg_rpmrepo01-lv_var: 41.8 GB, 41813016576 bytes
255 heads, 63 sectors/track, 5083 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/vdb: 4294 MB, 4294967296 bytes
16 heads, 63 sectors/track, 8322 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
root@rpmrepo01:/ # fdisk -uc /dev/vdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x1be62784.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First sector (2048-8388607, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-8388607, default 8388607):
Using default value 8388607

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
root@rpmrepo01:/ # pvcreate /dev/vdb1
  Physical volume "/dev/vdb1" successfully created
root@rpmrepo01:/ # vgcreate rapiddisk /dev/vdb1
  Volume group "rapiddisk" successfully created
root@rpmrepo01:/ # lvcreate -L3000 -n rapidlv rapiddisk
  Logical volume "rapidlv" created.
root@rpmrepo01:/ # lsblk
NAME                              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0                                11:0    1 1024M  0 rom
vda                               252:0    0   45G  0 disk
├─vda1                            252:1    0  200M  0 part /boot
└─vda2                            252:2    0 44.8G  0 part
  ├─vg_rpmrepo01-lv_root (dm-0)   253:0    0  5.9G  0 lvm  /
  └─vg_rpmrepo01-lv_var (dm-1)    253:1    0   39G  0 lvm  /var
vdb                               252:16   0    4G  0 disk
└─vdb1                            252:17   0    4G  0 part
  └─rapiddisk-rapidlv (dm-2)      253:2    0    3G  0 lvm
sda                                 8:0    0    5G  0 disk
└─sda1                              8:1    0    5G  0 part
rd0                               251:0    0   64M  0 disk
root@rpmrepo01:/ # mkfs.ext4 /dev/mapper/rapiddisk-rapidlv
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
192000 inodes, 768000 blocks
38400 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=788529152
24 block groups
32768 blocks per group, 32768 fragments per group
8000 inodes per group
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912

Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 38 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
root@rpmrepo01:/ # rapiddisk --list
rapiddisk 4.4
Copyright 2011-2016 Petros Koutoupis
List of RapidDisk device(s):
RapidDisk Device 1: rd0 Size (KB): 65536
List of RapidDisk-Cache mapping(s):
None
root@rpmrepo01:/ # rapiddisk --cache-map rd0 /dev/mapper/rapiddisk-rapidlv rapiddisk 4.4 Copyright 2011-2016 Petros Koutoupis
Command to map rc-wt_rapiddisk-rapidlv with rd0 and /dev/mapper/rapiddisk-rapidlv has been sent. Verify with "--list"
root@rpmrepo01:/ # rapiddisk --list rapiddisk 4.4 Copyright 2011-2016 Petros Koutoupis
List of RapidDisk device(s):
RapidDisk Device 1: rd0 Size (KB): 65536
List of RapidDisk-Cache mapping(s):
RapidDisk-Cache Target 1: rc-wt_rapiddisk-rapidlv Cache: rd0 Target: dm-2 (WRITE THROUGH)
root@rpmrepo01:/ # mount /dev/mapper/rc-wt_rapiddisk-rapidlv /mnt/rc_vdb1/
root@rpmrepo01:/ # df -h
Filesystem                           Size  Used Avail Use% Mounted on
/dev/mapper/vg_rpmrepo01-lv_root     5.7G  5.3G  132M  98% /
tmpfs                                938M     0  938M   0% /dev/shm
/dev/vda1                            194M  155M   30M  85% /boot
/dev/mapper/vg_rpmrepo01-lv_var       39G   31G  6.2G  83% /var
/dev/mapper/rc-wt_rapiddisk-rapidlv  2.9G  4.5M  2.7G   1% /mnt/rc_vdb1
root@rpmrepo01:/ # fio /usr/share/doc/fio-2.0.13/examples/ssd-test
seq-read: (g=0): rw=read, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=4
rand-read: (g=1): rw=randread, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=4
seq-write: (g=2): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=4
rand-write: (g=3): rw=randwrite, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio, iodepth=4
fio-2.0.13
Starting 4 processes
seq-read: Laying out IO file(s) (1 file(s) / 1500MB)
Session stopped
Network error: Software caused connection abort
I have the same problem using Ubuntu 16.04 with the 4.8 and 4.8.4 kernels: the computer locks up when running the rapiddisk --cache-map command. The 4.7.5 kernel works fine on the same Ubuntu 16.04 machine.
ttpaus
Thank you. Everyone's feedback is greatly appreciated. Although in my request, I was hoping to see the results of a linear LVM volume without rapiddisk-cache. Just mount the LVM volume and run fio against it.
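A minimal sketch of that control test, assuming the rapiddisk/rapidlv logical volume created earlier in this thread and no rapiddisk-cache mapping at all:

mkfs.ext4 /dev/mapper/rapiddisk-rapidlv
mount /dev/mapper/rapiddisk-rapidlv /mnt/rc_vdb1
fio /usr/share/doc/fio-2.0.13/examples/ssd-test    # same example job file used above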
On 15-11-2016 15:47, Petros Koutoupis wrote:
Thank you. Everyone's feedback is greatly appreciated. Although in my request, I was hoping to see the results of an Linear LVM volume without rapiddisk-cache. Just mount the LVM volume and run fio to it.
Sorry, but that just works. / is mounted on an LVM volume, since it's a standard CentOS 6 installation.
Joop
ttpaus,
Please pull from the latest source code. I believe I addressed the 4.8 kernel bug.
Joop,
I am looking into this now.... I should have something soon.
It still crashes the 4.8.5 kernel. Tested using Ubuntu 16.04 with kernel 4.8.5 (tested on the host, no VM). Works perfectly on kernel 4.7.5.
ttpaus
ttpaus,
Can you provide the output of the following commands:
$ sudo modinfo rapiddisk
$ sudo modinfo rapiddisk-cache
I found the error. The 4.8.5 kernel refused to upgrade the rapiddisk module, so it was still 4.4. I tried reinstalling multiple times. I don't have time to find out why, so I upgraded to the latest kernel, 4.8.8, rebuilt and reinstalled rapiddisk, and it worked perfectly. Thanks for the help.
ttpaus
Joop,
I have got some good news: I reproduced it and am not root causing it.
On 24-11-2016 0:29, Petros Koutoupis wrote:
Joop,
I have got some good news: I reproduced it and am not root causing it.
I thought it wouldn't be rapiddisk, since /dev/sda works and /dev/vda didn't, but now you need to convince upstream that there is a bug :-)
I'll await what happens next and in the meantime will use /dev/sda when possible.
Thanks for the update,
Joop
Oops. I meant to say "now root causing it" and not "not root causing it."
Anyway, it seems that virtio has issues with the type of spinlocks I am using. I need to figure out why.
Joop,
I pushed a branch that is using a different type of spinlock. In my tests, it works. Can you please validate that this is the case for you as well? If you are not familiar with the process, clone the master repo (with git) and switch to the branch: defect/virtio_spinlocks.
$ git checkout defect/virtio_spinlocks
You can validate the branch that you are on with the following command:
$ git branch
* defect/virtio_spinlocks
master
Then run a "make" and insmod the rapiddisk-cache.ko module in the module/ directory. No lock up in my KVM guest (with virtio block device) and fio is running the exact same example script you were using (ssd-test.fio).
I am not saying that this is the fix but it will give me a better idea of why the other method is so problematic.
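A sketch of that validation sequence, assuming the repository is already cloned and that the build is run from the module/ directory mentioned above:

cd rapiddisk
git checkout defect/virtio_spinlocks
git branch                        # confirm the active branch is defect/virtio_spinlocks
cd module
make
rmmod rapiddisk-cache             # unload the previously loaded module, if present
insmod rapiddisk-cache.ko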
On 24-11-2016 15:13, Petros Koutoupis wrote:
Joop,
I pushed a branch that is using a different type of spinlock. In my tests, it works. Can you please validate that this is the case for you as well. If you are not familiar with the process, clone the master repo (with git) and switch to the branch: defect/virtio_spinlocks.
$ git checkout defect/virtio_spinlocks
Then run a "make" and insmod the rapiddisk-cache.ko module in the module/ directory. No lock up in my KVM guest (with virtio block device) and fio is running the exact same example script you were using (ssd-test.fio).
I am not saying that this is the fix but it will give me a better idea of why the other method is so problematic.
Will try to do that today or tomorrow if no disasters show up.
Joop
Just ran the fio test to completion using virtio :-)
Joop,
That is great news. Thank you very much for the feedback. I should have something official pushed shortly. When I do, I will close this ticket.
Officially merged into Release 5.0
commit d31d2e6822ae9d73dee15167b64af1584e2c08f0
Author: Petros Koutoupis <petros@petroskoutoupis.com>
Date:   Thu Nov 24 07:58:40 2016 -0600

    Issue 13: Crashes while using it in a VM
It's not stated explicitly, but I think it should work when used inside a VM. I tried that multiple times, but all attempts end with a lockup of the VM. Tried:
Q: should this work, or is there a problem with the different setups that I have tried?
Regards,
Joop