Open travisdowns opened 7 years ago
My intention was to use memcpy only for the first file, to show the CPU/memory speed that we are dealing with. On my laptop it goes up to 8719 MB/s with silesia.tar.
I see, I guess it makes sense. One caveat is that memcpy speed is very dependent on the size of the underlying buffer: for large buffers (i.e., larger than L3) it will generally converge to something close to the underlying DRAM read + write bandwidth, while for smaller files it could be an order of magnitude higher (e.g., 100 GB/s) if the buffer fits in L1, L2, etc.
So you can get weird results: if the first file is big, you might get a RAM-bound memcpy figure like 11 GB/s on my box, but then for smaller files a super-fast compressor (let's say one that just uses memcpy internally to copy the whole buffer at zero compression) could report a much larger value than the memcpy baseline, which doesn't make much sense.
Currently I guess it isn't much of an issue because most compression algos are CPU-bound at a speed lower than memory bandwidth and perform generally the same regardless of whether the working set fits in cache or not, so it's not too visible...
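A rough way to see the buffer-size effect, as a sketch rather than anything taken from the benchmark itself: time repeated copies of buffers of different sizes. The helper name `copy_bandwidth_mb_s` and the chosen sizes are made up for illustration, and it uses `ctypes.memmove` as a stand-in for memcpy (ctypes does not expose memcpy directly). The per-call overhead of going through Python inflates the cost of small copies, so the numbers for cache-sized buffers will understate what a native loop would show.

```python
import ctypes
import time


def copy_bandwidth_mb_s(size, min_seconds=0.05):
    """Return an approximate copy speed in MB/s for a `size`-byte buffer.

    Hypothetical helper for illustration only; it is not part of any
    benchmark tool discussed in this thread.
    """
    src = ctypes.create_string_buffer(size)
    dst = ctypes.create_string_buffer(size)
    ctypes.memset(src, 1, size)      # touch the source pages
    ctypes.memmove(dst, src, size)   # warm-up copy, touches the destination pages

    copied = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < min_seconds:
        # Copy in small batches so the timer is not read after every call.
        for _ in range(16):
            ctypes.memmove(dst, src, size)
        copied += 16 * size
        elapsed = time.perf_counter() - start
    return copied / elapsed / 1e6


if __name__ == "__main__":
    # Sizes picked to land roughly in L1/L2, L3, and DRAM on typical
    # hardware; the actual cache sizes are an assumption.
    for size in (32 * 1024, 256 * 1024, 4 * 1024 * 1024, 64 * 1024 * 1024):
        print(f"{size // 1024:>8} KiB: {copy_bandwidth_mb_s(size):10.0f} MB/s")
```

On most machines the reported figure should drop as the buffer outgrows each cache level, which is why a single memcpy number measured on one large file is not a meaningful ceiling for small files.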
I guess I would kind of expect the memcpy "codec" to just act like another codec, and obey the same parameters and behaviors, rather than being special-cased like it is today in the code (probably this would also reduce the code complexity).
When using
To run against a directory, memcpy only runs once:

Presumably the intent is for memcpy to run against all the files, like the other algos.
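
To make the "memcpy as just another codec" idea concrete, here is a minimal sketch, not the tool's actual code. The `CODECS` table and the `bench` function are hypothetical names, and zlib stands in for a real compressor. The point is only that memcpy is an ordinary table entry, so it gets a row for every file with no special casing.

```python
import time
import zlib

# Hypothetical codec table: name -> (compress, decompress).
# "memcpy" is a plain copy at zero compression, registered like any other codec.
CODECS = {
    "memcpy": (bytearray, bytearray),  # bytearray(data) makes one copy
    "zlib-1": (lambda data: zlib.compress(data, 1), zlib.decompress),
}


def bench(files, codecs=CODECS):
    """Run every codec against every file.

    `files` maps a file name to its contents as bytes. Returns one result
    row (a dict) per (file, codec) pair.
    """
    rows = []
    for fname, data in files.items():
        for cname, (compress, decompress) in codecs.items():
            t0 = time.perf_counter()
            packed = compress(data)
            t1 = time.perf_counter()
            unpacked = decompress(packed)
            t2 = time.perf_counter()
            assert unpacked == data  # round-trip check, same for all codecs

            rows.append({
                "file": fname,
                "codec": cname,
                "ratio": len(packed) / len(data),
                # Guard against a zero interval on very small inputs.
                "comp_mb_s": len(data) / max(t1 - t0, 1e-9) / 1e6,
                "decomp_mb_s": len(data) / max(t2 - t1, 1e-9) / 1e6,
            })
    return rows


if __name__ == "__main__":
    sample = {"small.bin": b"abc" * 1000, "large.bin": b"xyz" * 1_000_000}
    for row in bench(sample):
        print(f'{row["file"]:>10} {row["codec"]:>7} '
              f'ratio={row["ratio"]:.3f} comp={row["comp_mb_s"]:.0f} MB/s')
```

With this shape, each file gets its own memcpy baseline measured at that file's size, which also addresses the cache-versus-DRAM mismatch discussed earlier.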