Closed rhodgkins closed 7 months ago
@rhodgkins thanks for bringing up these issues. 1. Yes, the close should be awaited 2. You are correct about the return value it is currently mismatched. 3. I think your suggestion regarding a flag makes a lot of sense to prevent this from overrunning memory on extremely large files. Also, having a flag would prevent a breaking change which we try to avoid as much as possible as it requires a major version bump. 4. With regards to CRC32C let me take another look and circle back on this one.
We always appreciate contributions so if you want to raise a fix that would be great. If not, I can tackle these before we go into freeze for the holidays.
Thanks @ddelgrosso1 - first pass is #2373, I've done an educated guess at the CRC32C check...
I was going to use
TransferManager#downloadFileInChunks
to download a 20GB+ file and thought I'd investigate how it works (as the docs aren't that great around it - it's not the most clear where the file ends up).But I think there's multiple issues around the way the response is returned, the CRC32C validation and also memory usage.
https://github.com/googleapis/nodejs-storage/blob/7a96ce6f764076a14f0961623e2dec2ce8893dd7/src/transfer-manager.ts#L680
First the type of the
result
isn't correct, it'll returnBuffer[]
and not[Buffer]
(which is whatDownloadResponse
is).https://github.com/googleapis/nodejs-storage/blob/7a96ce6f764076a14f0961623e2dec2ce8893dd7/src/transfer-manager.ts#L682
Second the result of the
CRC32C
validation isn't used - shouldn't it be compared against thefileInfo
metadata's CRC32 value?Finally, related to the first issue, the function returns the complete contents of the file into memory (as an array of
Buffer
s`)? Won't this just cause OOM issues with any large GB sized file?Happy to do a PR to address the issues if the above assumptions are correct. In relation to the complete contents of the file being returned, could have a flag on
DownloadFileInChunksOptions
to signal that function should just resolve tovoid
? Or a lot simpler (but breaking), just always resolve tovoid
and let the user load the contents of the downloaded file into memory if that's what they want...Note, I've checked the tests out, and they're pretty basic only checking that
CRC32C.fromFile
is called, and not that the actual result is used in validation.EDIT:
https://github.com/googleapis/nodejs-storage/blob/7a96ce6f764076a14f0961623e2dec2ce8893dd7/src/transfer-manager.ts#L686
I've also noticed that the
FileHandle#close()
isn'tawait
ed and should probably be done after all the chunks are downloaded (ie. afterawait Promise.all(promises)
) and before the CRC32C is computed. Is this assumption correct and also needs to be changed?