dockerzhang / incubator-inlong

Apache InLong - a one-stop data streaming platform
https://inlong.apache.org/
Apache License 2.0

[INLONG-123] Batch flush data to disk #123

Closed dockerzhang closed 3 years ago

dockerzhang commented 3 years ago

4. More effective memory-to-disk operation: At present, the flush operation writes messages from memory to disk one by one. This can be changed to write to disk in batches, one memory block at a time, thereby improving storage efficiency.

 

------------------------------------------------------

This problem was pointed out by an MQ expert: in the current version, TubeMQ does not handle flushing data from memory to disk well enough. Messages are flushed one at a time; the related problems are shown in the following figure:

 

After searching the documentation and analyzing this problem, a better practice appears to be writing data to disk in chunks sized at 4 times the number of bytes. The best specific size, and how much it improves flushing, depends on the operating environment.

So I want to optimize this to flush data to disk in batches of a specified, configurable size to improve disk write efficiency. Combined with the modification in TUBEMQ-120, it should achieve better results.
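
To make the proposal concrete, here is a minimal Java sketch of size-based batch flushing. It is not TubeMQ's actual storage code: the class name `BatchFlushWriter`, the `batchSizeBytes` parameter, and the `diskWrites` counter are all hypothetical and exist only for illustration. Messages accumulate in an in-memory buffer and are written to the file in one call once the buffer is full.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of batch flushing; not the real TubeMQ implementation. */
final class BatchFlushWriter implements Closeable {
    private final FileChannel channel;
    private final ByteBuffer buffer; // in-memory block that accumulates messages
    private int diskWrites = 0;      // counts batch writes, for illustration only

    BatchFlushWriter(Path file, int batchSizeBytes) throws IOException {
        // batchSizeBytes stands in for the configurable batch size (assumed name).
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
        this.buffer = ByteBuffer.allocate(batchSizeBytes);
    }

    /** Buffers a message; writes to disk only when the batch is full. */
    void append(byte[] message) throws IOException {
        if (message.length > buffer.remaining()) {
            flush(); // not enough room: write out the current batch first
        }
        if (message.length > buffer.capacity()) {
            // A message larger than a whole batch bypasses the buffer.
            writeFully(ByteBuffer.wrap(message));
            return;
        }
        buffer.put(message);
        if (!buffer.hasRemaining()) {
            flush(); // the batch is exactly full
        }
    }

    /** Writes all buffered bytes to the file as one batch. */
    void flush() throws IOException {
        if (buffer.position() == 0) {
            return; // nothing buffered
        }
        buffer.flip();
        writeFully(buffer);
        buffer.clear();
    }

    private void writeFully(ByteBuffer src) throws IOException {
        while (src.hasRemaining()) {
            channel.write(src);
        }
        diskWrites++;
    }

    int diskWrites() {
        return diskWrites;
    }

    @Override
    public void close() throws IOException {
        try {
            flush();              // write out any remaining partial batch
            channel.force(false); // ask the OS to persist file content
        } finally {
            channel.close();
        }
    }
}
```

With a 16-byte batch size, appending ten 4-byte messages and closing the writer results in three batch writes instead of ten single-message writes. A suitable batch size in practice would need to be measured in the target environment, as noted above.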

JIRA link - [INLONG-123] created by gosonzhang