This PR speeds up read.stream and read by skipping the fs.stat call when size is passed via opts. Currently, the only reason for the stat call is to get the size (and to throw the size mismatch error if it differs from the expected size). This is unnecessary for 3 reasons:
In the case of read.stream, the stream already compares the sizes at the end and throws an error if there's a mismatch.
In the case of read, we can compare the sizes after reading the cache contents.
In both cases we are already doing an integrity check, which would fail anyway if there's a size difference, since the hashes would be different.
In this PR, the stat call is only made if the user does not pass a size property via opts. This makes sense because, without knowing the size, the stream has to make an unnecessary fs.read call at the end before closing, and that call has a significant cost (much, much greater than the cost of doing fs.stat).
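The idea can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `readVerified`, the `expectedSha512` parameter, and the `EBADSIZE`/`EINTEGRITY` error codes as used here are hypothetical, and the sketch is synchronous for brevity. It only shows that the stat call is needed solely when no size is supplied, and that the size and hash checks after the read still catch a mismatch.

```javascript
const fs = require('fs')
const crypto = require('crypto')

// Hypothetical helper for illustration only; synchronous to keep the sketch short.
function readVerified (file, expectedSha512, opts = {}) {
  // Only pay for fs.stat when the caller did not supply a size.
  const size = typeof opts.size === 'number'
    ? opts.size
    : fs.statSync(file).size

  const data = fs.readFileSync(file)

  // The size comparison happens after the read, so no up-front stat is needed.
  if (data.length !== size) {
    throw Object.assign(
      new Error(`size mismatch: expected ${size}, got ${data.length}`),
      { code: 'EBADSIZE' }
    )
  }

  // The integrity check would also fail for content of the wrong size.
  const actual = crypto.createHash('sha512').update(data).digest('hex')
  if (actual !== expectedSha512) {
    throw Object.assign(new Error('integrity check failed'), { code: 'EINTEGRITY' })
  }

  return data
}
```

With a known size, a call such as `readVerified(file, digest, { size: 5 })` never touches fs.stat; omitting `size` falls back to the stat call.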
On my machine, the benchmarks with this change look like this:
That's a solid 16% improvement in the case of read.stream and a 36% improvement in the case of read.
References