DLTcollab / tangle-accelerator

Accelerate IOTA transactions by caching API requests and redirecting to faster alternatives
MIT License

DB: Import historical data for Permanode into ScyllaDB #486

Open YingHan-Chen opened 4 years ago

YingHan-Chen commented 4 years ago

The latest dump file that IF released is 1154472.dmp (376G). Its first transaction is from June 28, 2019 19:10:25.

In our own implementation, ScyllaDB uses about as much disk space after importing as the dump files themselves. Additional space is needed for the commit log.

Divide dump files

Divide large dump files into pieces of 1 million transactions each. A dump file of 1 million transactions takes 2.6G of space.
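A minimal sketch of the split, assuming the dump file stores one transaction per line (the thread does not state the format). The function name `split_dump` is hypothetical. The `<file>_<n>` naming mirrors the piece names seen elsewhere in this thread, such as 1154472.dmp_16; starting the numbering at 0 is an assumption.

```python
def split_dump(path, lines_per_piece=1_000_000):
    """Split `path` into pieces named `<path>_<n>`, each holding at most
    `lines_per_piece` lines. Assumes one transaction per line.
    Returns the list of piece paths in order."""
    piece_paths = []
    dst = None
    try:
        with open(path, "rb") as src:
            # Stream line by line so the whole dump never sits in memory.
            for line_no, line in enumerate(src):
                if line_no % lines_per_piece == 0:
                    if dst is not None:
                        dst.close()
                    piece_path = f"{path}_{len(piece_paths)}"
                    piece_paths.append(piece_path)
                    dst = open(piece_path, "wb")
                dst.write(line)
    finally:
        if dst is not None:
            dst.close()
    return piece_paths
```

Calling `split_dump("1154472.dmp")` would write the pieces next to the original file and leave the original untouched.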

YingHan-Chen commented 4 years ago

Detail of Compaction

How to choose

We currently use the size-tiered compaction strategy.

YingHan-Chen commented 4 years ago

io-conf-configuration-for-hdd-storage

When using Scylla with HDD storage, it is recommended to use RAID0 on all of your available disks, and manually update the io.conf configuration file max-io-request parameter. This parameter sets the number of concurrent requests sent to the storage. The value for this parameter should be 3X (3 times) the number of your disks. For example, if you have 3 disks, you would set max-io-request=9.
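The rule quoted above is simple arithmetic; a small sketch makes it easy to check for a given machine. The function name `max_io_request` is hypothetical, and the printed `max-io-request=...` form only mirrors the example in the quote. The exact syntax expected in io.conf is not shown in this thread and should be checked against the Scylla documentation.

```python
def max_io_request(num_disks, requests_per_disk=3):
    """Return the recommended value: 3 concurrent requests per disk."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return requests_per_disk * num_disks


# Matches the example in the quoted text: 3 disks give a value of 9.
print(f"max-io-request={max_io_request(3)}")
```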

YingHan-Chen commented 4 years ago

Speed of scp from node8 to node1 to NAS

1154472.dmp_16   100% 2634MB  10.8MB/s   04:04
1154472.dmp_85   100% 2634MB  10.8MB/s   04:03
1154472.dmp_105  100% 2634MB  10.7MB/s   04:05
1154472.dmp_118  100% 2634MB  10.7MB/s   04:05
1154472.dmp_97   100% 2634MB  10.6MB/s   04:08
1154472.dmp_86   100% 2634MB   7.3MB/s   06:01
1154472.dmp_55   100% 2634MB   6.9MB/s   06:21
1154472.dmp_63   100% 2634MB   7.2MB/s   06:05
1154472.dmp_58   100% 2634MB   7.2MB/s   06:07
1154472.dmp_56   100% 2634MB   7.2MB/s   06:07
...
YingHan-Chen commented 4 years ago

I tried running seven importers in parallel. Each importer imports a different set of pieces.
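One way to give each importer its own pieces is a round-robin assignment, sketched below. This is an assumption for illustration; the thread does not say how the pieces were actually distributed, and `assign_pieces` is a hypothetical helper.

```python
def assign_pieces(pieces, num_importers=7):
    """Distribute piece names across importers round-robin so that
    no piece is handled by more than one importer."""
    buckets = [[] for _ in range(num_importers)]
    for i, piece in enumerate(pieces):
        buckets[i % num_importers].append(piece)
    return buckets


# Example: 16 pieces spread over 7 importers.
pieces = [f"1154472.dmp_{n}" for n in range(16)]
for importer_id, bucket in enumerate(assign_pieces(pieces)):
    print(importer_id, bucket)
```

Round-robin keeps the bucket sizes within one piece of each other, which is a reasonable fit here because every piece is roughly the same size.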

The number of requests served by ScyllaDB increased as follows.

[image] [image] [image]

YingHan-Chen commented 4 years ago

We are running two processes to get transaction data.

One importer process with 32 worker threads, plus one thread that reads from the dump files and adds tasks to the task pool.

One listener process with 8 worker threads, plus one thread that listens for ZMQ sn events and adds tasks to the task pool.
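Both processes follow the same shape: one feeder thread fills a task pool and a fixed number of worker threads drain it. The sketch below illustrates that pattern only and is not the project's implementation. `run_pool`, `source`, and `handle` are hypothetical names; `source` stands in for either the dump file reader or the ZMQ listener, and `handle` for whatever stores a transaction.

```python
import queue
import threading


def run_pool(source, handle, num_workers):
    """Feed items from `source` into a bounded task queue from one thread
    and process them with `num_workers` worker threads."""
    tasks = queue.Queue(maxsize=1024)  # bounded, so the feeder cannot run far ahead
    stop = object()                    # sentinel that tells a worker to exit

    def feeder():
        for item in source:
            tasks.put(item)
        for _ in range(num_workers):   # one sentinel per worker
            tasks.put(stop)

    def worker():
        while True:
            item = tasks.get()
            if item is stop:
                break
            handle(item)

    threads = [threading.Thread(target=feeder)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With this shape, the importer would correspond to `run_pool(lines_of_dump_file, store, 32)` and the listener to `run_pool(zmq_events, store, 8)`, where all three arguments are placeholders. The sketch omits error handling: a `handle` that raises would end its worker thread.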