scylladb / seastar

High performance server-side application framework
http://seastar.io
Apache License 2.0

iotune crashes on RHEL8 on GCE instance with an NVMe drive #679

Open vladzcloudius opened 5 years ago

vladzcloudius commented 5 years ago

Scylla version: 666.development-0.20190909.301246f6c
VM: GCE instance with a locally attached NVMe drive.

$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU @ 2.30GHz
Stepping:            0
CPU MHz:             2300.000
BogoMIPS:            4600.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            46080K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat md_clear arch_capabilities

$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda       8:0    0  128G  0 disk 
└─sda1    8:1    0  128G  0 part /
nvme0n1 259:0    0  375G  0 disk /var/lib/scylla

Do you want IOTune to study your disks IO profile and adapt Scylla to it? (*WARNING* Saying NO here means the node will not boot in production mode unless you configure the I/O Subsystem manually!)
Yes - let iotune study my disk(s). Note that this action will take a few minutes. No - skip this step.
[YES/no]
tuning /sys/devices/pci0000:00/0000:00:04.0/nvme/nvme0/nvme0n1
tuning: /sys/devices/pci0000:00/0000:00:04.0/nvme/nvme0/nvme0n1/queue/nomerges 2
tuning /sys/devices/pci0000:00/0000:00:04.0/nvme/nvme0/nvme0n1
INFO  2019-09-11 16:15:37,661 [shard 0] iotune - /var/lib/scylla/commitlog passed sanity checks
Starting Evaluation. This may take a while...
Measuring sequential write bandwidth: 390 MB/s
Measuring sequential read bandwidth: 664 MB/s
Measuring random write IOPS: 100068 IOPS
Measuring random read IOPS: 180063 IOPS
iotune: /home/vladz/scylla/seastar/src/core/reactor.cc:2631: virtual seastar::append_challenged_posix_file_impl::~append_challenged_posix_file_impl(): Assertion `_closing_state == state::closed' failed.
Aborting on shard 0.
Backtrace:
  0x000000000063c582
  0x000000000055740b
  0x0000000000557705
  0x00000000005577a0
  0x00007f1bb3c2de7f
  /opt/scylladb/libreloc/libc.so.6+0x0000000000037e74
  /opt/scylladb/libreloc/libc.so.6+0x0000000000022894
  /opt/scylladb/libreloc/libc.so.6+0x0000000000022768
  /opt/scylladb/libreloc/libc.so.6+0x0000000000030565
  0x000000000058bc0f
  0x00000000004e6c85
  0x00000000004fa75d
  0x00000000004fa931
  0x00000000006889fe
  0x00000000006883a0
  0x00000000004ea5f1
  0x00000000005537b1
  0x00000000005539bf
  0x000000000060dbed
  0x000000000052f6b1
  0x00000000005302ce
  0x00000000004d6d98
  /opt/scylladb/libreloc/libc.so.6+0x0000000000023f32
  0x00000000004d8bad
ERROR:root:['/var/lib/scylla/data', '/var/lib/scylla/commitlog'] did not pass validation tests, it may not be on XFS and/or has limited disk space.
This is a non-supported setup, and performance is expected to be very bad.
For better performance, placing your data on XFS-formatted directories is required.
To override this error, enable developer mode as follow:
sudo /opt/scylladb/scripts/scylla_dev_mode_setup --developer-mode 1
IO configuration setup failed. Press any key to continue...

mykaul commented 6 months ago

I believe @pwrobelse fixed this recently.

pwrobelse commented 6 months ago

> I believe @pwrobelse fixed this recently.

The log seems to indicate that the problem is related to closing the file, not to io-depth underflow:

iotune: /home/vladz/scylla/seastar/src/core/reactor.cc:2631: virtual seastar::append_challenged_posix_file_impl::~append_challenged_posix_file_impl(): Assertion `_closing_state == state::closed' failed.

My PR fixed only per-shard io-depth underflow.

As far as I remember, @xemul fixed an issue related to calling close on a file that was not opened. @xemul: is this problem similar to the one that PR#1623 fixed?

xemul commented 6 months ago

> @xemul: is the following problem similar to the one, that https://github.com/scylladb/seastar/pull/1623 fixed?

No, that's different. But append-challenged-posix-file has changed a lot since 2019, so I think it's better to close this issue and reopen it if it happens again.
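
For readers unfamiliar with the failure mode: the assertion in the log above (`_closing_state == state::closed`) fires in a destructor, which suggests the file object was destroyed on some path without having been closed first. The following is only a minimal sketch of that kind of invariant, not Seastar's code. The names `tracked_file` and `with_tracked_file` are hypothetical, the close is synchronous here (Seastar's is asynchronous), and nothing below claims to reproduce the actual bug in iotune.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a file object whose destructor insists that
// close() was called first. This is NOT Seastar's implementation.
class tracked_file {
public:
    enum class state { open, closing, closed };

    explicit tracked_file(std::string name) : _name(std::move(name)) {}

    tracked_file(const tracked_file&) = delete;
    tracked_file& operator=(const tracked_file&) = delete;

    // Must be called on every code path before the object is destroyed.
    void close() {
        _closing_state = state::closing;
        // A real implementation would wait for in-flight I/O to finish here.
        _closing_state = state::closed;
    }

    bool is_closed() const { return _closing_state == state::closed; }
    const std::string& name() const { return _name; }

    ~tracked_file() {
        // Same shape as the check in the log: destroying an unclosed
        // file aborts the process (in builds where assert is enabled).
        assert(_closing_state == state::closed);
    }

private:
    std::string _name;
    state _closing_state = state::open;
};

// Hypothetical helper: runs func on a file and closes the file on both the
// normal and the exceptional path, so the destructor check always holds.
// In this sketch func must return a non-void value.
template <typename Func>
auto with_tracked_file(const std::string& name, Func&& func) {
    tracked_file f(name);
    try {
        auto result = func(f);
        f.close();
        return result;
    } catch (...) {
        f.close();  // an error path that skips this would trip the assertion
        throw;
    }
}
```

The point of the helper is that the error path closes the file too. A path that returns or throws early without calling `close()` is the kind of place where a destructor check like this one would abort.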