Open seekstar opened 2 years ago
Hi hi,
Is that fixed now? If not, could you use gdb or pstack to find out where it gets stuck?
"What's strange is that the database directory seems to be deleted:" This is expected: the data is moved to the raw NVMe SSD through SPDK and removed from the native file system. In your output you can find "start migration... Moved 101.46 MB", which means data is being moved from the native FS to the raw NVMe SSD with SPDK.
"/home/searchstar/git/others/SpanDB/file/delete_scheduler.cc:74: delete /tmp/spandb/000009.log failed" This error can be ignored for now.
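As a rough sketch of the gdb suggestion above, the snippet below attaches gdb to a running process in batch mode and prints a backtrace for every thread, which usually shows which loop a thread spinning at 100% CPU is in. The helper names `gdb_backtrace_cmd` and `dump_backtraces` are hypothetical and exist only for illustration; it assumes gdb is installed and that you have permission to attach to the process.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command line that attaches to pid, dumps all thread backtraces, and exits."""
    # -p attaches to a running process, -batch exits after the commands run,
    # -ex supplies the gdb command to execute.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_backtraces(pid):
    """Run gdb against pid and return its textual output (hypothetical helper)."""
    # Attaching normally requires the same user as the target process, or root.
    result = subprocess.run(gdb_backtrace_cmd(pid), capture_output=True, text=True)
    return result.stdout
```

Calling `dump_backtraces(pid)` with the PID of the stuck process a few times in a row and comparing the outputs can help tell a busy loop from a thread that is merely slow.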
I'm also troubled by this problem. My initial guess is that it is caused by RequestScheduler::WorkerThread not processing batch_write_queue, but I don't understand how the queue weights are calculated.
I also encountered the same problem with the read_queue during my testing.
Hi,
Sorry for the late response. In case I missed something, I have pasted the full output here:
https://gist.github.com/seekstar/2b7171b6904c22a24799261498583982
I have no idea about this error:
nvme.c:1028:spdk_nvme_transport_id_parse: ERROR: Unknown transport ID key '01'
Here is the output of lspci:
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/980PRO
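One possible reading of this error, offered as an assumption and not a confirmed diagnosis: SPDK transport IDs are written as whitespace-separated key:value pairs, for example `trtype:PCIe traddr:0000:01:00.0`. If the device were passed as the bare lspci address `01:00.0`, the text before the first colon (`01`) would be treated as a key. The sketch below is a simplified model of that kind of parsing, not SPDK's actual code; `parse_transport_id` and `KNOWN_KEYS` are illustrative names, and the key set is only a subset.

```python
# Simplified model of "key:value" transport ID parsing (NOT SPDK's implementation).
# KNOWN_KEYS is an illustrative subset of the keys SPDK accepts.
KNOWN_KEYS = {"trtype", "adrfam", "traddr", "trsvcid", "subnqn"}


def parse_transport_id(text):
    """Return (fields, unknown_keys) for a whitespace-separated key:value string."""
    fields, unknown = {}, []
    for token in text.split():
        # Split at the first ':' only, so a value such as a PCI address
        # may itself contain ':' characters.
        key, sep, value = token.partition(":")
        if not sep:
            key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"key without ':' or '=' separator: {token!r}")
        if key.lower() in KNOWN_KEYS:
            fields[key.lower()] = value
        else:
            unknown.append(key)
    return fields, unknown
```

Under this model, `parse_transport_id("01:00.0")` reports `01` as an unknown key, while `parse_transport_id("trtype:PCIe traddr:0000:01:00.0")` yields a `traddr` of `0000:01:00.0`. Whether this matches the argument actually passed to the tester is not visible from the text above.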
The configurations are as follows.
Environment
SPDK
I installed SPDK v20.01.2 following the guide of SpanDB, and reserved 4GB huge pages:
Output:
Workload
The default workloada is too large, so I created a smaller variant of workloada and saved it as workloada_1e5_2e4.spec:
I generated the DB with the command:
Output:
TopFS cache size
By default, the ycsb tester requires 90GB of cache, which exceeds the physical memory of my machine, so I modified ycsb/src/test.cc:
Then I ran make to generate a new test binary.
Stuck at warmup
I ran the test as root:
Output:
Then it gets stuck here, with 100% CPU usage.
What's strange is that the database directory seems to be deleted:
For comparison:
There is also an error message in the output of test above:
I don't know whether it matters.