YingHan-Chen opened this issue 4 years ago
Detail of Compaction

How to choose: we currently use the Size-tiered compaction strategy.
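For reference, Size-tiered is ScyllaDB's default compaction strategy, and it can also be set explicitly per table in CQL (the table name below is hypothetical):

```
-- Hypothetical table name; 'SizeTieredCompactionStrategy' is the
-- standard class name for this strategy in CQL.
ALTER TABLE transactions
  WITH compaction = {'class': 'SizeTieredCompactionStrategy'};
```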
io-conf-configuration-for-hdd-storage
When using Scylla with HDD storage, it is recommended to use RAID0 across all of your available disks and to manually update the max-io-request parameter in the io.conf configuration file. This parameter sets the number of concurrent requests sent to the storage; its value should be 3x (3 times) the number of your disks. For example, if you have 3 disks, you would set max-io-request=9.
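An illustrative sketch of what that change might look like; check the io.conf shipped with your installation for the exact variable and flag names, as they vary between Scylla versions:

```shell
# /etc/scylla.d/io.conf -- illustrative sketch, not verbatim.
# 3 HDDs in RAID0 -> 3 disks x 3 = 9 concurrent requests.
SEASTAR_IO="--max-io-requests=9"
```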
Speed of scp
from node8 to node1 to NAS
1154472.dmp_16 100% 2634MB 10.8MB/s 04:04
1154472.dmp_85 100% 2634MB 10.8MB/s 04:03
1154472.dmp_105 100% 2634MB 10.7MB/s 04:05
1154472.dmp_118 100% 2634MB 10.7MB/s 04:05
1154472.dmp_97 100% 2634MB 10.6MB/s 04:08
1154472.dmp_86 100% 2634MB 7.3MB/s 06:01
1154472.dmp_55 100% 2634MB 6.9MB/s 06:21
1154472.dmp_63 100% 2634MB 7.2MB/s 06:05
1154472.dmp_58 100% 2634MB 7.2MB/s 06:07
1154472.dmp_56 100% 2634MB 7.2MB/s 06:07
...
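A quick arithmetic check that the reported rates are consistent with the file size and transfer times (throughput = size / elapsed time):

```python
# Sanity-check the per-file scp throughput figures above:
# throughput (MB/s) = file size (MB) / transfer time (s).
def throughput_mb_s(size_mb, minutes, seconds):
    return size_mb / (minutes * 60 + seconds)

# 2634 MB in 4:04 -> about 10.8 MB/s (the fast transfers)
print(round(throughput_mb_s(2634, 4, 4), 1))   # 10.8
# 2634 MB in 6:07 -> about 7.2 MB/s (the slower tail)
print(round(throughput_mb_s(2634, 6, 7), 1))   # 7.2
```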
I tried running seven importers in parallel, each importing a different piece.
The requests served by ScyllaDB increased as follows.
We are running two processes to get transaction data:
One importer process with 32 worker threads and one thread that reads from dump files and adds tasks to the task pool.
One listener process with 8 worker threads and one thread that listens for ZMQ sn events and adds tasks to the task pool.
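The reader-plus-worker-pool layout described above can be sketched roughly as follows (all names here are ours, and the workers just count tasks instead of issuing real ScyllaDB inserts):

```python
import queue
import threading

NUM_WORKERS = 32  # matches the importer's 32 worker threads

def reader(task_pool, lines):
    """Single reader thread: enqueue one task per transaction."""
    for line in lines:
        task_pool.put(line)
    for _ in range(NUM_WORKERS):   # one shutdown sentinel per worker
        task_pool.put(None)

def worker(task_pool, results):
    """Worker thread: drain the pool until its sentinel arrives."""
    while True:
        task = task_pool.get()
        if task is None:
            break
        results.append(task)  # a real worker would insert into ScyllaDB here

task_pool = queue.Queue()
results = []
workers = [threading.Thread(target=worker, args=(task_pool, results))
           for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()
reader(task_pool, [f"tx{i}" for i in range(1000)])
for w in workers:
    w.join()
print(len(results))  # 1000
```

The listener process would have the same shape with 8 workers and a ZMQ subscriber in place of the dump-file reader.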
The first transaction in the latest dump file that IF released (1154472.dmp, 376 GB) is from June 28, 2019 19:10:25.
In our own implementation, ScyllaDB's disk usage after importing is similar to that of the dump files; the extra space is used by the commit log.
Divide dump files
Divide large dump files into pieces of 1 million transactions each. A 1-million-transaction dump file takes 2.6 GB of space.
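A hedged sketch of that splitting step, assuming one transaction per line in the dump file (the function names are ours, not from any existing tool):

```python
CHUNK = 1_000_000  # 1 million transactions per division

def _flush(path, idx, lines):
    """Write one division to `<path>_<idx>` and return its name."""
    name = f"{path}_{idx}"
    with open(name, "w") as out:
        out.writelines(lines)
    return name

def split_dump(path, chunk=CHUNK):
    """Split a dump file into pieces of `chunk` transactions each."""
    parts, buf = [], []
    with open(path) as src:
        for line in src:
            buf.append(line)
            if len(buf) == chunk:
                parts.append(_flush(path, len(parts), buf))
                buf = []
    if buf:  # final, possibly short, division
        parts.append(_flush(path, len(parts), buf))
    return parts
```

This produces names like 1154472.dmp_0, 1154472.dmp_1, ..., matching the suffixed file names in the scp log above.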