googleapis / nodejs-storage

Node.js client for Google Cloud Storage: unified object storage for developers and enterprises, from live data serving to data analytics/ML to data archiving.
https://cloud.google.com/storage/
Apache License 2.0
888 stars 368 forks source link

File.save() with a Uint8Array fails #2478

Closed aiden-jeffrey closed 4 weeks ago

aiden-jeffrey commented 4 weeks ago

When saving a UintArray8, we see the error The "chunk" argument must be of type string or an instance of Buffer or Uint8Array.. This bug is only present in 7.0.0 onwards. I have to cast it to a buffer for the save to succeed, which wasn't necessary on 6.x.x.

Environment details

Steps to reproduce

  1. Bucket.file("some-file.bin").save(new Uint8Array([1,2,3,4,5]))
  2. See error:
    TypeError [ERR_INVALID_ARG_TYPE]: The "chunk" argument must be of type string or an instance of Buffer or Uint8Array. 
    Received type number (20)
    at new NodeError (node:internal/errors:400:5)
    at _write (node:internal/streams/writable:315:13)
    at Writable.write (node:internal/streams/writable:337:10)
    at pump (node:internal/streams/pipeline:130:21)
    at processTicksAndRejections (node:internal/process/task_queues:95:5) {
    code: 'ERR_INVALID_ARG_TYPE'
    }
    error Command failed with exit code 1.
ddelgrosso1 commented 4 weeks ago

@aiden-jeffrey did you also update node versions as part of the move from 6.x to 7.x? Looking at the signature for file.save in 6.9.3 it was defined as only accepting string | Buffer. We added PipelineSource in 7.x as another option.

If you updated Node versions perhaps there used to be an automatic conversion that was changed.

aiden-jeffrey commented 4 weeks ago

No. I didn't change node version. We upgraded to 7.11.1 (following direction here https://github.com/googleapis/nodejs-storage/issues/2476), and saw the bug. Then I bisected to work out which was the first release to exhibit - 7.0.0.

sidsethupathi commented 4 weeks ago

If it isn't clear, Aiden and I work together.

I also don't think 7.0.0 is the first bad commit. I checked out this repo, bisected the commits, and landed on https://github.com/googleapis/nodejs-storage/commit/c0d9d58b56a9a3485b6c0e5eb92411bb094f7bcb to be the commit that starts failing when passing in a Uint8Array, always using Node v18.13.0 during the bisect.

EDIT: More specifically, it looks like it's this conditional. Applying this diff makes the case mentioned in the issue work:

diff --git a/src/file.ts b/src/file.ts
index 6eed2ca..dc260a0 100644
--- a/src/file.ts
+++ b/src/file.ts
@@ -3679,7 +3679,7 @@ class File extends ServiceObject<File, FileMetadata> {
             reject(err);
           };

-          if (typeof data === 'string' || Buffer.isBuffer(data)) {
+          if (typeof data === 'string' || Buffer.isBuffer(data) || data instanceof Uint8Array) {
             writable
               .on('error', handleError)
               .on('finish', () => resolve())
ddelgrosso1 commented 4 weeks ago

@sidsethupathi beat me to it. I was just looking at those lines. It looks like Uint8Array worked previously because despite the TypeScript signature saying string | Buffer we never actually verified that data was actually one of those. Let me take a look at extending the signature and checks to accept further types.

sidsethupathi commented 4 weeks ago

The interesting thing is that we are using TypeScript so we'll look into why TS didn't yell at us for trying to pass in a Uint8Array into the File.save method.

ddelgrosso1 commented 4 weeks ago

When I tried in TypeScript just now, it wouldn't accept a UInt8Array.

typescript: 5.3.3 @types/node: 20.11.20 @google-cloud/storage@6.9.3

error TS2345: Argument of type 'Uint8Array' is not assignable to parameter of type 'SaveData'.
  Type 'Uint8Array' is missing the following properties from type 'Buffer': write, toJSON, equals, compare, and 66 more.

   const resp = await b.file('test').save(new Uint8Array([1,2,3,4,5]));
sidsethupathi commented 4 weeks ago

Yes, we have some abstract classes on top of our GCS invocations so I think something got messed up there.

In short, we have an abstract class

abstract class Writer {
  abstract saveBinary(data: Buffer | Uint8Array);
}

and then a GCS specific implementation

class GCSWriter extends Writer {
  saveBinary(data: Buffer) {
    // call File.save(data) here
  }
}

I think since Buffer extends Uint8Array, TS allows GCSWriter.saveBinary(data: Buffer) to satisfy Writer.saveBinary(data: Buffer | Uint8Array) because all Buffers will be Uint8Arrays, but then that borks the actual implementation because File.save doesn't want to take a Uint8Array.

EDIT:

Where this all falls apart is that that we never call GCSWriter.saveBinary directly, it all happens from within the abstract Writer class

abstract class Writer {
  abstract saveBinary(data: Buffer | Uint8Array);

  transform(foo: Foo) {
    const data: Uint8Array = convertFooToUint8Array(foo);
    this.saveBinary(data);
  }
}

const gcsWriter = new GCSWriter();
gcsWriter.transform(foo);
ddelgrosso1 commented 4 weeks ago

@sidsethupathi thank you for your investigation. I've opened a PR to add support for Uint8Array.

aiden-jeffrey commented 4 weeks ago

@sidsethupathi thank you for your investigation. I've opened a PR to add support for Uint8Array.

Thank you! I was sure I'd checked the type definition in SaveData, but apparently I didn't.