ClickHouse / libhdfs3

HDFS file read access for ClickHouse
Apache License 2.0

Speed up the crc32c algorithm when reading blocks from a datanode #22

Closed taiyang-li closed 2 years ago

taiyang-li commented 2 years ago

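This adds the IntelAsmCrc32c algorithm (https://github.com/htot/crc32c), which is at least 2x faster than the existing implementations. In the perf_checksum run below it is roughly 2.85x faster than HWCrc32c and roughly 46x faster than SWCrc32c, taking the second column as elapsed time: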

$ ./build_gcc/contrib/libhdfs3-cmake/perf_checksum
HWCrc32c        1.04966   2039474432
IntelAsmCrc32c  0.367757  2039474432
SWCrc32c        16.9265   2039474432

taiyang-li commented 2 years ago

Before the improvement, the CRC32 checksum routine HWCrc32c::update takes the most time in RemoteBlockReader::readNextPacket, which is executed frequently when reading blocks from a datanode.
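
For readers unfamiliar with the checksum being profiled, here is a minimal sketch of a portable, table-driven CRC32C (Castagnoli polynomial). It is not the library's SWCrc32c, HWCrc32c, or IntelAsmCrc32c code; the function names `makeCrc32cTable` and `crc32cSoftware` are hypothetical and exist only for illustration. It shows why a software version does work on every byte of every packet, which is the cost that hardware-assisted variants try to cut.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of libhdfs3): builds the 256-entry lookup
// table for the reflected Castagnoli polynomial 0x82F63B78.
inline std::array<uint32_t, 256> makeCrc32cTable() {
    std::array<uint32_t, 256> table{};
    for (uint32_t i = 0; i < 256; ++i) {
        uint32_t crc = i;
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right one bit; XOR in the polynomial when the low bit was set.
            crc = (crc & 1u) ? (crc >> 1) ^ 0x82F63B78u : (crc >> 1);
        }
        table[i] = crc;
    }
    return table;
}

// Hypothetical helper (not part of libhdfs3): byte-at-a-time CRC32C over a
// buffer, with the usual initial value and final XOR of 0xFFFFFFFF.
inline uint32_t crc32cSoftware(const void* data, std::size_t len) {
    static const std::array<uint32_t, 256> table = makeCrc32cTable();
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        // One table lookup per input byte.
        crc = table[(crc ^ bytes[i]) & 0xFFu] ^ (crc >> 8);
    }
    return crc ^ 0xFFFFFFFFu;
}
```

Calling `crc32cSoftware("123456789", 9)` should return 0xE3069283, the standard CRC-32C check value, which is a convenient sanity test when swapping in a faster implementation.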
