googleapis / nodejs-storage

Node.js client for Google Cloud Storage: unified object storage for developers and enterprises, from live data serving to data analytics/ML to data archiving.
https://cloud.google.com/storage/
Apache License 2.0

Error 502 (Server Error)!!1 #1780

Closed rodrigoguedes closed 2 years ago

rodrigoguedes commented 2 years ago

Hi everyone, I'm getting intermittent 502 errors when I try to upload large files (larger than 10 GB) using a stream.

Environment details

OS: Ubuntu 16.04
Node.js version: 14.17.5
@google-cloud/storage version: 5.15.0

Steps to reproduce

    // Imports assumed for this snippet:
    const {spawn} = require("child_process");
    const crypto = require("crypto");
    const stream = require("stream");
    const {PassThrough} = stream;
    const util = require("util");
    const {GoogleAuth} = require("google-auth-library");
    const {Storage} = require("@google-cloud/storage");

    // Stream a pigz-compressed tarball of /tmp/test.
    const {stderr, stdout} = spawn(
      "tar",
      ["-c", "--use-compress-program=pigz", "."],
      {
        cwd: "/tmp/test"
      }
    );

    const googleAuth = new GoogleAuth({
      credentials: $MY_GCP_KEY || {},
      scopes: [
        "https://www.googleapis.com/auth/cloud-platform",
        "https://www.googleapis.com/auth/sqlservice.admin"
      ]
    });

    const auth = await googleAuth.getClient();

    const storage = new Storage({
      credentials: {
        client_email: auth.email,
        private_key: auth.key
      },
      projectId: $MY_CLOUD_PROJECT,
      retryOptions: {
        autoRetry: true,
        maxRetries: 3,
        totalTimeout: 3600000
      }
    });

    let bytesWritten = 0;

    // Count the (pre-encryption) bytes flowing through the pipeline.
    const passThroughStream = new PassThrough();

    passThroughStream.on("data", chunk => {
      bytesWritten += Buffer.byteLength(chunk);
    });

    let streams = [stdout.pipe(new PassThrough()), passThroughStream];

    // Client-side encryption
    streams.push(crypto.createCipheriv("aes-256-cbc", Buffer.alloc(32, "test123"), Buffer.alloc(16, "test")));

    const bucket = storage.bucket($MY_BUCKET);

    const gcpFile = bucket.file("folder/my_file.tgz", {});

    const metadata = {
      "client-side-encryption": true
    };

    const gcpWriteStream = gcpFile.createWriteStream({
      metadata: {
        metadata
      }
    });

    streams.push(gcpWriteStream);

    const pipeline = util.promisify(stream.pipeline);

    await pipeline(streams);

    await gcpFile.setMetadata({
      metadata: {
        "file-size": bytesWritten
      }
    });

First error (this can happen at any time, after 30 minutes or after 90 minutes):

502. That’s an error.

The server encountered a temporary error and could not complete your request.

Please try again in 30 seconds. That’s all we know.

Then, after 6 minutes:

408. That’s an error.

Your client has taken too long to issue its request. That’s all we know.
danielbankhead commented 2 years ago

Hey @rodrigoguedes, thanks for your inquiry. Looking at your snippet, I have a few suggestions:

danielbankhead commented 2 years ago

Closing for now due to inactivity. We can reopen for future debugging if necessary.

rodrigo-obj commented 2 years ago

After changing parameters and updating to the latest version, I'm getting this error:

Caused by:  RangeError: The offset is lower than the number of bytes written
    at Upload.startUploading (/myapp/node_modules/@google-cloud/storage/build/src/gcs-resumable-upload/index.js:392:32)
    at Upload.continueUploading (/myapp/node_modules/@google-cloud/storage/build/src/gcs-resumable-upload/index.js:373:14)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
danielbankhead commented 2 years ago

Which parameters were changed?

That particular RangeError occurs when the connection is interrupted and the server is behind the number of bytes the client has written. To confirm: this module is being used to upload to GCS and not a custom endpoint?

If this particular error occurs often for you, assuming large (multi-GB+) uploads to GCS, it can be remedied by using chunkSize, as it prepares and caches a buffer of the provided size for uploading, re-uploading the cached chunk on retries and connection errors. See:
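
A minimal sketch of what setting chunkSize on the write stream could look like (the bucket name, file name, and chunk value below are placeholders, not values from this thread):

    const {Storage} = require("@google-cloud/storage");

    const storage = new Storage();
    const file = storage.bucket("my-bucket").file("folder/my_file.tgz");

    // With chunkSize set, the client buffers each chunk locally before sending,
    // so an interrupted chunk can be re-sent from the cached buffer on a retry.
    const writeStream = file.createWriteStream({
      chunkSize: 64 * 256 * 1024 // 16 MiB placeholder; should be a multiple of 256 KiB
    });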

rodrigo-obj commented 2 years ago

Hi @danielbankhead, thanks for answering. The changes to the Storage object were:

retryOptions.maxRetries was increased to 10
retryOptions.totalTimeout was kept at 3600000 ("I know it's a big value, we only use it for testing :)")

To confirm, this module is being used to upload to GCS and not a custom endpoint?

Yes, I am sending directly to GCS, but I'm using stream.pipeline to sequence my streams (generate the tar and then encrypt it). Maybe one of the streams is taking too long and exceeding the maxRetryDelay limit, or maybe my stream (const {stderr, stdout} = spawn("tar"...) is not able to support resumable uploads. Does that make sense?

If this particular error occurs often for you, assuming large (multi-GB+) uploads to GCS, it can be remedied by using chunkSize, as it prepares and caches a buffer of the provided size for uploading, re-uploading the cached chunk on retries and connection errors. See:

I'll take a look at it, thanks.

danielbankhead commented 2 years ago

Thanks for the additional information. With that, I’m confident the chunkSize option will remedy your issue, as it handles retried uploads in your use case in a more robust manner.

I’ll leave this reopened for a week for verification.

rodrigo-obj commented 2 years ago

@danielbankhead is there a default value for chunkSize, or is it normally the same size as the file?

danielbankhead commented 2 years ago

We recommend using at least 8 MiB for the chunk size.

Additional details: https://cloud.google.com/storage/docs/performing-resumable-uploads#chunked-upload
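
For example, with the bucket object from the earlier snippets (assumed here), 8 MiB expressed as a multiple of 256 KiB would look like:

    // 8 MiB = 32 * 256 KiB
    const chunkSize = 32 * 256 * 1024;

    const myWriteStream = bucket.file("folder/my_file.tgz").createWriteStream({chunkSize});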

rodrigo-obj commented 2 years ago

Unfortunately, using chunkSize I ran into other problems, and the transfer speed drops drastically:

FetchError: request to https://storage.googleapis.com/upload/storage/v1/b/my_project/o?name=folder%2Fmy_file.tgz&uploadType=resumable&upload_id=aaaaaaaaaaaaaaaa failed, reason: socket hang up
    at ClientRequest.<anonymous> (/my_app/node_modules/node-fetch/lib/index.js:1491:11)
    at ClientRequest.emit (events.js:412:35)
    at TLSSocket.socketOnEnd (_http_client.js:499:9)
    at TLSSocket.emit (events.js:412:35)
    at endReadableNT (internal/streams/readable.js:1317:12)
    at processTicksAndRejections (internal/process/task_queues.js:82:21)

Scenario: two files, A and B, are uploaded in parallel to the same bucket. File sizes: file A - 541 MB, file B - 10.3 GB.

    const myFile = bucket.file("folder/my_file.tgz", {});

    const myWriteStream = myFile.createWriteStream({
      chunkSize: 500 * 256 * 1024 // 125 MiB (500 x 256 KiB)
    });
danielbankhead commented 2 years ago

Thanks for the additional data - are you experiencing any other networking issues from your application outside of file uploads to GCS? The particular FetchError reason socket hang up occurs when the client's connection to the server has been severed.

Given the retry configuration, the connection attempts likely surpassed the configured retry threshold - this is evident because, with chunkSize, the client queries the server for which bytes are missing and then re-uploads from a buffer that is guaranteed to be available.
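
As a sketch of the kind of tuning that could help here (values are placeholders, not a recommendation from this thread), the retry window can be widened via retryOptions when constructing the client:

    const {Storage} = require("@google-cloud/storage");

    const storage = new Storage({
      retryOptions: {
        autoRetry: true,
        maxRetries: 10,          // allow more attempts before giving up
        retryDelayMultiplier: 2, // exponential backoff between attempts
        maxRetryDelay: 64,       // cap on the backoff delay, in seconds
        totalTimeout: 600        // overall retry deadline, in seconds
      }
    });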

danielbankhead commented 2 years ago

Additionally, the original 408 error is consistent with a connection issue: https://cloud.google.com/storage/docs/json_api/v1/status-codes#408_Request_Timeout

danielbankhead commented 2 years ago

Closing for now due to inactivity.