bergware / dynamix


File Integrity: speed up bash by reducing expensive calls #43

Closed Falcosc closed 3 years ago

Falcosc commented 3 years ago

- async hashing only on console, for files that take longer than 1 sec
- call file stat only once

Benchmarks:

Branch Master with BLAKE2:
added 178809 files. Duration: 02:20:39. Average speed: 57.4 MB/s
exported 178809 files, skipped 0 files. Duration: 00:01:16
checked 178809 files, skipped 0 files. Found: 0 mismatches, 0 corruptions. Duration: 02:10:46. Average speed: 61.8 MB/s
imported 178809 files, skipped 0 files. Duration: 00:16:06
cleared 178809 files, skipped 0 files. Duration: 00:13:11

Pull request with BLAKE2:
added 178809 files. Duration: 01:34:01. Average speed: 85.9 MB/s
exported 178809 files, skipped 0 files. Duration: 00:01:15
checked 178809 files, skipped 0 files. Found: 0 mismatches, 0 corruptions. Duration: 01:14:01. Average speed: 109 MB/s

The most important change was replacing grep with bash string operations. Executing file stat only once per file matters mainly if you have many small files, as in my benchmark.
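To illustrate why these two changes help, here is a minimal sketch, not the plugin's actual code. The function names `path_contains` and `file_size_mtime` are hypothetical, and the second one assumes GNU coreutils `stat` (the `-c FORMAT` option). The idea is that a bash pattern match runs inside the shell, whereas piping into grep starts a new process for every file, and that one stat call can return several fields at once.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not taken from the plugin.

# Return 0 if the path contains any of the given substrings.
# [[ ... == *"$needle"* ]] is evaluated by bash itself, so no process is
# started, unlike: echo "$path" | grep -qF "$needle"
path_contains() {
  local path=$1 needle
  shift
  for needle in "$@"; do
    [[ $path == *"$needle"* ]] && return 0
  done
  return 1
}

# Print "<size> <mtime>" using a single stat call instead of one call per field.
# Assumes GNU coreutils stat (-c FORMAT).
file_size_mtime() {
  stat -c '%s %Y' -- "$1"
}

# Example: decide whether to skip a path without calling grep.
if path_contains "/mnt/disk1/share/.Recycle.Bin/a.txt" ".Recycle.Bin"; then
  echo "skip"
fi
```

A caller could then split both values with `read -r size mtime < <(file_size_mtime "$file")`. With 178809 files, one saved process per file adds up quickly, which fits the observation that small files benefit most.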

Falcosc commented 3 years ago

Skipping the syscall conditionally during verify improves throughput by another 14% (100 MB/s to 114 MB/s).

without syscall skip:
verified 178809 files, skipped 0 files. Found: 0 mismatches, 0 corruptions. Duration: 01:20:03. Average speed: 100 MB/s

unconditional skip:
verified 178809 files, skipped 0 files. Found: 0 mismatches, 0 corruptions. Duration: 01:10:12. Average speed: 115 MB/s

conditional skip:
verified 178809 files, skipped 0 files. Found: 0 mismatches, 0 corruptions. Duration: 01:10:43. Average speed: 114 MB/s
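The gain can also be rechecked from the reported durations. This is a small sketch with hypothetical helper names (`to_seconds`, `gain_percent`), assuming every run processed the same data. It uses integer arithmetic, so results are truncated to whole percent.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for re-checking the figures; not part of the plugin.

# Convert HH:MM:SS to seconds. The 10# prefix forces base 10 so that
# values such as "08" are not parsed as invalid octal numbers.
to_seconds() {
  local h m s
  IFS=: read -r h m s <<< "$1"
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Throughput gain in whole percent for the same amount of data:
# (before - after) / after, using integer division.
gain_percent() {
  local before after
  before=$(to_seconds "$1")
  after=$(to_seconds "$2")
  echo $(( (before - after) * 100 / after ))
}

gain_percent 01:20:03 01:10:43   # without skip vs. conditional skip, prints 13
```

The durations give about 13%, while the rounded MB/s values (100 and 114) give 14%. The difference comes from the speeds being reported as whole numbers.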