googleapis / google-api-nodejs-client

Google's officially supported Node.js client library for accessing Google APIs. Support for authorization and authentication with OAuth 2.0, API Keys and JWT (Service Tokens) is included.
https://googleapis.dev/nodejs/googleapis/latest/
Apache License 2.0

Google Drive: Incorrect Forbidden 403 #2968

Open mjbartho opened 2 years ago

mjbartho commented 2 years ago

We are migrating our video catalog to a new processing and hosting solution, and as part of that we need to transfer files that are currently stored in Google Drive. After about 60 videos have transferred (our videos are typically between 100 MB and 300 MB), transfers start failing with a 403 status code and statusText "Forbidden".

We are using the drive.files.get API to stream each file into S3:

  drive.files
    .get(
      {
        fileId: driveId,
        alt: 'media',
        auth: driveApiKey,
      },
      { responseType: 'stream' },
    )
    .then(response =>
      s3
        .upload({
          Bucket: sourceBucket,
          Key: `${jobId}/${fileName}`,
          Body: response.data,
        })
        .promise(),
    )

I retried the file transfers to debug the issue, but the errors persisted. Once the clock passed 6:00 PM today, I happened to retry a few more and they succeeded.

This hints at some hourly egress limit. I was unable to find any documentation for such a limit, and it is not covered by the cases described in

https://developers.google.com/drive/api/guides/handle-errors

This is not strictly a request rate limiting issue so perhaps a 429 would not be appropriate, but a more explicit statusText would at least allow for more reasonable debugging.
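Because the request uses responseType 'stream', the body of the failed response arrives as an unread stream (the trace in this report shows data: PassThrough with a 'text/html' content type), so the logged error carries only the statusText. A minimal sketch of reading that body for debugging might look like the following. The helper names readStreamBody and describeDriveError are hypothetical, not part of googleapis or gaxios, and whether the HTML body actually explains the 403 is not something this report confirms.

```javascript
const { Readable } = require('stream');

// Collect a readable stream into a UTF-8 string, capped at maxBytes so an
// unexpectedly large body cannot exhaust memory.
async function readStreamBody(stream, maxBytes = 64 * 1024) {
  const chunks = [];
  let total = 0;
  for await (const chunk of stream) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    const remaining = maxBytes - total;
    if (buf.length >= remaining) {
      // Keep only what fits under the cap, then stop reading.
      chunks.push(buf.subarray(0, remaining));
      break;
    }
    chunks.push(buf);
    total += buf.length;
  }
  return Buffer.concat(chunks).toString('utf8');
}

// Hypothetical helper: summarize a failed request for logging.
// Assumes the error shape shown in this report (err.response.status,
// err.response.headers, and err.response.data as a stream).
async function describeDriveError(err) {
  const res = err && err.response;
  if (!res) {
    return { status: undefined, contentType: undefined, body: String(err) };
  }
  const isStream = res.data && typeof res.data.pipe === 'function';
  const body = isStream ? await readStreamBody(res.data) : JSON.stringify(res.data);
  return {
    status: res.status,
    contentType: res.headers && res.headers['content-type'],
    body,
  };
}
```

In the snippet from this report, that would mean adding something like .catch(async err => console.error(await describeDriveError(err))) after the .then(...) call.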

Environment details

Steps to reproduce

  1. Programmatically transfer a number of video files (seemingly around 10 GB in total) to another location (S3 in my case) within one hour.
  2. Observe 403 error with message "Forbidden" like

    GaxiosError: [object Object]
        at Gaxios.<anonymous> (/opt/nodejs/node_modules/gaxios/build/src/gaxios.js:73:27)
        at Generator.next (<anonymous>)
        at fulfilled (/opt/nodejs/node_modules/gaxios/build/src/gaxios.js:16:58)
        at processTicksAndRejections (internal/process/task_queues.js:95:5) {
      response: {
        config: {
          url: 'https://www.googleapis.com/drive/v3/files/<redacted>?alt=media&key=<redacted>',
          method: 'GET',
          responseType: 'stream',
          paramsSerializer: [Function (anonymous)],
          headers: [Object],
          params: [Object: null prototype],
          validateStatus: [Function (anonymous)]
        },
        data: PassThrough {
          _readableState: [ReadableState],
          _events: [Object: null prototype],
          _eventsCount: 5,
          _maxListeners: undefined,
          _writableState: [WritableState],
          allowHalfOpen: true,
        },
        headers: {
          'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"',
          connection: 'close',
          'content-length': '1103',
          'content-type': 'text/html; charset=UTF-8',
          date: 'Wed, 04 May 2022 00:56:53 GMT'
        },
        status: 403,
        statusText: 'Forbidden'
      },
      config: {
        url: 'https://www.googleapis.com/drive/v3/files/<redacted>?alt=media&key=<redacted>',
        method: 'GET',
        responseType: 'stream',
        paramsSerializer: [Function (anonymous)],
        headers: {
          'Accept-Encoding': 'gzip',
          'User-Agent': 'google-api-nodejs-client/0.7.2 (gzip)'
        },
        params: [Object: null prototype] {
          alt: 'media',
          key: '<redacted>'
        },
        validateStatus: [Function (anonymous)]
      },
      code: '403'
    }

YoussefDemnati commented 1 year ago

Have you solved this? I'm stuck on it and nothing works.

mjbartho commented 1 year ago

No, I didn't solve this. I was processing each of these videos in a managed workflow execution (AWS Step Functions) and set up a wait step that waits an hour and then tries again. The workflow repeats this loop for up to one week before failing. I would have preferred to trigger the wait step on an explicit signal (a 429 or a more accurate statusText), but I ended up casting a wide net and entering this wait cycle in response to any 403 error.
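
For anyone not using a workflow engine, the same wait-and-retry idea can be sketched in plain Node.js. This is a rough illustration and not the actual Step Functions setup: retryOnForbidden, isForbidden, and the option names are hypothetical, and the one-hour wait and one-week budget simply mirror the numbers above. Like the workaround, it treats every 403 as retryable, so a genuine permissions error would also be retried until the budget runs out.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;
const ONE_WEEK_MS = 7 * 24 * ONE_HOUR_MS;

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// The error in this report carries the status on err.response.status and
// also as the string '403' on err.code, so check both.
function isForbidden(err) {
  const status =
    (err && err.response && err.response.status) || Number(err && err.code);
  return status === 403;
}

// Call transfer() until it succeeds. On a 403, wait waitMs and try again,
// giving up once the next wait would exceed maxTotalMs of accumulated waiting.
// transfer must start a fresh download on every call, because a stream that
// has already failed cannot be reused.
async function retryOnForbidden(transfer, opts = {}) {
  const {
    waitMs = ONE_HOUR_MS,
    maxTotalMs = ONE_WEEK_MS,
    sleep = defaultSleep, // injectable so tests do not really wait
  } = opts;
  let waited = 0;
  for (;;) {
    try {
      return await transfer();
    } catch (err) {
      // Rethrow anything that is not a 403, or a 403 after the budget is spent.
      if (!isForbidden(err) || waited + waitMs > maxTotalMs) throw err;
      await sleep(waitMs);
      waited += waitMs;
    }
  }
}
```

Usage would be along the lines of retryOnForbidden(() => transferOneVideo(driveId)), where transferOneVideo stands in for the drive.files.get plus s3.upload chain from the original report.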