ipinfo / mmdbctl

mmdbctl is an MMDB file management CLI supporting various operations on MMDB database files.
Apache License 2.0

import: buffer input file #11

Closed maxmouchet closed 1 year ago

maxmouchet commented 1 year ago

I noticed that a significant amount of time is spent in read syscalls inside the CSV reader.

This PR buffers reads from the input file to reduce the number of syscalls. I set the buffer size empirically, starting at 4 kB and increasing it until I stopped seeing improvements.

cat data.csv | head -n 1000000 | ./mmdbctl import --csv --ip 6 --alias-6to4 --no-network --disallow-reserved > /dev/null
# Before: 25% of time spent in csv.Read
# After: 10% of time spent in csv.Read
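
For readers without the diff at hand, here is a minimal sketch of the idea, not the actual change in this PR. It wraps the input in a `bufio.Reader` with a larger buffer before handing it to `encoding/csv`, and counts how many `Read` calls reach the underlying source as a stand-in for read syscalls. The names `bufSize`, `countingReader`, `sampleCSV`, `countRecords`, and `readCalls` are hypothetical, and the 64 KiB size is an assumption, since the description does not state the value that was chosen.

```go
package main

import (
	"bufio"
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// bufSize is an assumed value for illustration; the PR does not say which
// size it settled on.
const bufSize = 64 * 1024

// countingReader counts Read calls on the wrapped reader. With a real file,
// each of these calls would typically be one read syscall.
type countingReader struct {
	r     io.Reader
	calls int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.calls++
	return c.r.Read(p)
}

// sampleCSV builds n identical two-column rows (made-up data).
func sampleCSV(n int) string {
	return strings.Repeat("2001:db8::/32,example\n", n)
}

// countRecords reads every CSV record from in and returns how many it saw.
func countRecords(in io.Reader) (int, error) {
	r := csv.NewReader(in)
	n := 0
	for {
		_, err := r.Read()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
		n++
	}
}

// readCalls parses data and reports the record count plus the number of Read
// calls that reached the source. size <= 0 adds no extra buffering, so only
// the default-sized buffer inside encoding/csv is used.
func readCalls(data string, size int) (records, calls int, err error) {
	src := &countingReader{r: strings.NewReader(data)}
	var in io.Reader = src
	if size > 0 {
		// csv.NewReader reuses a bufio.Reader that is already large enough,
		// so this larger buffer replaces its default one.
		in = bufio.NewReaderSize(src, size)
	}
	records, err = countRecords(in)
	return records, src.calls, err
}

func main() {
	data := sampleCSV(200000)
	for _, size := range []int{0, bufSize} {
		records, calls, err := readCalls(data, size)
		if err != nil {
			panic(err)
		}
		fmt.Printf("buffer=%d records=%d underlying reads=%d\n", size, records, calls)
	}
}
```

In a CLI the same wrapping would apply to the opened input file or `os.Stdin` instead of a `strings.Reader`; the counting wrapper is only there to make the effect visible without a profiler.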

CPU profiles: original, buffered