googleapis / google-api-nodejs-client

Google's officially supported Node.js client library for accessing Google APIs. Support for authorization and authentication with OAuth 2.0, API Keys and JWT (Service Tokens) is included.
https://googleapis.dev/nodejs/googleapis/latest/
Apache License 2.0

Implement connection queueing to avoid nodejs connection limits #513

Closed yannbertrand closed 5 years ago

yannbertrand commented 8 years ago

I sometimes get this error after doing some requests:

{ [Error: getaddrinfo ENOTFOUND www.googleapis.com www.googleapis.com:443]
  code: 'ENOTFOUND',
  errno: 'ENOTFOUND',
  syscall: 'getaddrinfo',
  hostname: 'www.googleapis.com',
  host: 'www.googleapis.com',
  port: 443 }

Here is the last request I'm recording:

{ url: 'https://www.googleapis.com/youtube/v3/channels',
  method: 'GET',
  json: true,
  qs: { part: 'id, contentDetails', id: 'UChu4DxadETMAMlgwBREcNvQ' },
  useQuerystring: true,
  headers: { Authorization: 'Bearer mytoken' } }
yannbertrand commented 8 years ago

Some Ideas

At first I had an error with the channel https://www.youtube.com/channel/UChu4DxadETMAMlgwBREcNvQ. I thought it might be a special-character issue caused by the channel title 便利ライフハック 【venlee lifehack】.

But then I had the same issue with https://www.youtube.com/channel/UCPK5G4jeoVEbUp5crKJl6CQ (ZedaphPlays).

I've also tried repeating one request in a loop (> 1000 requests) and it seems to crash totally randomly...

jmdobry commented 8 years ago

Hi, can you please post some sample code that we can use to reproduce the issue that you're seeing?

Thanks.

yannbertrand commented 8 years ago

You may reproduce my error with this code:

  async.auto({

    getSubscriptions: function (next) {
      var subscriptions = [];
      var nextPageToken = true;
      var i = 1;

      async.whilst(
        () => nextPageToken,
        function (nextPage) {
          google.youtube('v3').subscriptions.list({
            part: 'id, snippet',
            mine: true,
            maxResults: 50,
            order: 'alphabetical',
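            // nextPageToken starts as `true` to enter the loop; only send real (string) tokens
            pageToken: typeof nextPageToken === 'string' ? nextPageToken : null,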
            auth: oauth2Client
          }, function (err, aSubscriptionsPage) {
            if (err) {
              console.log('Error when trying to find a subscription page');
              return nextPage(); // In fact retrying the same page
            }

            nextPageToken = aSubscriptionsPage.nextPageToken || false;
            subscriptions = subscriptions.concat(aSubscriptionsPage.items);
            nextPage(null, subscriptions);
          });
        },
        function (err, allSubscriptions) {
            if (err) return next(err);
            next(null, allSubscriptions);
        }
      );
    },

    getChannelDetails: ['getSubscriptions', function (next, results) {
      var channelsDetails = [];

      async.each(results['getSubscriptions'], function (subscription, nextSubscription) {
        google.youtube('v3').channels.list({
          part: 'id, contentDetails',
          id: subscription.snippet.resourceId.channelId,
          auth: oauth2Client
        }, function (err, channelDetails) {
          if (err) {
            console.log('Error when trying to get channel ' + subscription.snippet.resourceId.channelId, err);
            return nextSubscription();
          }

          if (channelDetails)
            channelsDetails.push(channelDetails);

          nextSubscription();
        });
      }, function (err) {
        next(err, channelsDetails);
      });
    }],

    getLastUploadedVideos: ['getChannelDetails', function (next, results) {
      var lastUploadedVideos = [];

      async.each(results['getChannelDetails'], function (channel, nextChannel) {
        if (channel.pageInfo.totalResults <= 0) return nextChannel();

        google.youtube('v3').playlistItems.list({
          part: 'id, snippet, contentDetails',
          playlistId: channel.items[0].contentDetails.relatedPlaylists.uploads,
          auth: oauth2Client
        }, function (err, lastUploadedVideosFromChannel) {
          if (err) {
            console.log('Error when trying to find playlist ' + channel.items[0].contentDetails.relatedPlaylists.uploads + ' from channel ' + channel.items[0].id, err);
            return nextChannel();
          }

          lastUploadedVideosFromChannel = lastUploadedVideosFromChannel.items.map(uploadedVideoFromChannel => {
            return {
              id: uploadedVideoFromChannel.contentDetails.videoId,
              thumbnail: uploadedVideoFromChannel.snippet.thumbnails.high.url,
              title: uploadedVideoFromChannel.snippet.title,
              channel: uploadedVideoFromChannel.snippet.channelTitle,
              publishedAt: new Date(uploadedVideoFromChannel.snippet.publishedAt)
            };
          });

          lastUploadedVideos = lastUploadedVideos.concat(lastUploadedVideosFromChannel);
          nextChannel();
        });
      }, function (err) {
        next(err, lastUploadedVideos);
      });
    }],

    orderLastUploadedVideos: ['getLastUploadedVideos', function (next, results) {
      next(null, results['getLastUploadedVideos'].sort((firstVideo, secondVideo) => {
        return secondVideo.publishedAt.getTime() - firstVideo.publishedAt.getTime();
      }));
    }]

  }, function (err, results) {
    if (err) return cb(err);

    cb(null, results['orderLastUploadedVideos']);
  });
jmdobry commented 8 years ago

getaddrinfo ENOTFOUND sounds like a connectivity problem, or an intermittent failure to resolve DNS. I'll have to give your code a try.

yannbertrand commented 8 years ago

I think so too. I'm doing all the requests in parallel, could it be the problem?

The code is meant to retrieve the 5 latest videos of the user subscriptions.

On 04 Mar 2016, at 08:39, Jason Dobry notifications@github.com wrote:

getaddrinfo ENOTFOUND sounds like a connectivity problem, or an intermittent failure to resolve DNS. I'll have to give your code a try.

blaudev commented 7 years ago

Sounds like you have exceeded the quota

wellczech commented 7 years ago

I face the same issue, but I don't think it's related to google-apis-nodejs-client.

It happens when I make many requests in parallel. The error should be caught and the request repeated. It's not related to exceeded API quotas; those return a different error.

Most likely it is an OS limitation. On Mac/Linux, try raising the open-file limit with "ulimit -n": https://github.com/nodejs/node-v0.x-archive/issues/5545

alianrock commented 7 years ago

I got the same issue when I make 200 requests at the same time, and every request in turn makes 10 requests to an API in parallel to get some data. It doesn't seem to be an API error. Here is the error:

"message":"getaddrinfo ENOTFOUND api.github.com api.github.com:443","name":"Error","stack":"getaddrinfo ENOTFOUND api.github.com api.github.com:443\
ace-n commented 7 years ago

@alianrock is that an error with this library, or something else? I'm not sure this issue involves the Github API...

wellczech commented 7 years ago

This is not a library issue. On response, I check err.code. I replay the request on 'ENOTFOUND', 'ETIMEDOUT', 'ECONNRESET', 500, 403 and 429, preferably with exponential backoff.
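
A minimal sketch of that replay logic, assuming the request is wrapped in a promise-returning function and the failure carries a `code` property as described above. The names `withRetry`, `retries`, `baseDelayMs` and `wait` are made up for illustration and are not part of this library:

```javascript
// Error codes worth replaying, taken from the list in the comment above.
const RETRYABLE = new Set(['ENOTFOUND', 'ETIMEDOUT', 'ECONNRESET', 500, 403, 429]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls fn() and replays it on retryable errors,
// doubling the delay after every failed attempt (exponential backoff).
async function withRetry(fn, { retries = 5, baseDelayMs = 200, wait = sleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const code = err && err.code;
      // Give up on unknown errors or once the retry budget is spent.
      if (attempt >= retries || !RETRYABLE.has(code)) throw err;
      await wait(baseDelayMs * 2 ** attempt);
    }
  }
}
```

With the defaults the delays are 200 ms, 400 ms, 800 ms and so on. Adding random jitter to each delay is a common refinement when many requests fail at the same moment.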

iddogino commented 7 years ago

Similar issue here. I tried very aggressive throttling (up to 10 requests per second), which gets it to work for a little bit, but it still crashes eventually...

wellczech commented 7 years ago

@iddogino when you hit this error, just replay the request. It has nothing to do with the library. I don't know exactly when it happens, but it might be DNS related: most likely the OS does not return a response in time. It seems tied to the DNS record TTL and the refresh of the local DNS cache, because in my experience the error appears in bursts. At 70 requests/second, several requests fail at one moment, but resending them and continuing as usual works. Afterwards everything is fine for a long time, until several requests fail again. I'll run some tests to see whether it really corresponds to DNS cache refreshes. Either way, this is not a problem with this library; it's either node.js or the OS. It seems normal, and nothing beats proper error checking and reacting accordingly.

Does resending request fix the issue for you?

beenotung commented 6 years ago

It is possible that you're on an unstable network. I faced the same issue on school wifi, and my system automatically retries connections, so this factor is hard to notice.

JustinBeckwith commented 6 years ago

@google/google-node-team @jonparrott this one is tricky and I'm interested in your input. I've made this mistake before, where you get a list of things back, and make (n) outgoing API calls all at the same time. With 5 elements in the array, it works! With 5000.... you end up with errors like the folks have above.

I wouldn't say this is a bug with the library per se, but it is something we could help users with. I can see a few options:

Anyone have experience with something like this? Thoughts?

ofrobots commented 6 years ago

+1 on using http.Agent.

JustinBeckwith commented 6 years ago

So. Now the hard question. What's a "reasonable default"?

ofrobots commented 6 years ago

as long as it can be configured... 13?

JustinBeckwith commented 6 years ago

Only concern here is that this could be a breaking change. Alternative approach - I use an http Agent, set the maxSockets to infinity, and let users provide an override. Document it.

Eh?

ofrobots commented 6 years ago

Isn't that the default? Or are we disabling the global agent somehow? AFAIK, by default http requests use http.globalAgent which defaults to Infinity for maxSockets.
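
For reference, a small sketch of what a capped agent looks like with Node's standard `https` module. The limit of 10 and the names `limitedAgent` and `get` are arbitrary choices for illustration; how such an agent would be wired into this library is not shown in this thread:

```javascript
const https = require('node:https');

// The global agent places no cap on concurrent sockets per host by default.
console.log(https.globalAgent.maxSockets); // Infinity

// A separate agent that opens at most 10 sockets per host.
// Requests beyond that wait in the agent's internal queue.
const limitedAgent = new https.Agent({ keepAlive: true, maxSockets: 10 });

// Hypothetical helper: GET a URL through the capped agent and return the body.
function get(url) {
  return new Promise((resolve, reject) => {
    https
      .get(url, { agent: limitedAgent }, (res) => {
        const chunks = [];
        res.on('data', (chunk) => chunks.push(chunk));
        res.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
      })
      .on('error', reject);
  });
}
```

Because the cap is per host, firing thousands of `get()` calls at one API host would keep at most 10 connections open at a time under these assumptions.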

stefek99 commented 6 years ago

After reading all the comments - it's not hitting the quota - it's too many parallel requests?


I've been making a lot of requests and it's not surprising I'm being limited.

I'm just confused the error is not the same as in the docs: https://developers.google.com/gmail/api/v1/reference/quota

Question on StackOverflow

https://stackoverflow.com/questions/50049486/node-js-error-getaddrinfo-enotfound-errnoexception-getaddrinforeqwrap-when-maki

Will try implementing request queue as suggested by @JustinBeckwith: https://github.com/google/google-api-nodejs-client/issues/513#issuecomment-381474846

beenotung commented 6 years ago

I ended up using a 'queue' to avoid concurrent requests, with automatic retry up to three times. (I'm using fetch on maps.googleapis.com directly, not this node.js library.)

export class IntervalPool {
    currentTask: Promise<any> = Promise.resolve();

    constructor(public interval: number) {
    }

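    queue<A>(f: () => A | Promise<A>): Promise<A> {
        // Wait `interval` ms after the previous task settles, then run f.
        const run = () => new Promise<A>((resolve, reject) => {
            setTimeout(() => {
                try {
                    resolve(f());
                } catch (e) {
                    reject(e);
                }
            }, this.interval);
        });
        // Chain on both outcomes so one failed task does not reject every later one.
        const result = this.currentTask.then(run, run);
        this.currentTask = result.catch(() => undefined);
        return result;
    }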
}

const fetchPool = new IntervalPool(1000 / 49);
let n_fetching = 0;
let n_fetched = 0;
const report = () => console.error({fetch: n_fetched + '/' + n_fetching});
const fetchCounted = url => {
    n_fetching++;
    report();
    return fetch(url)
        .then(res => {
            n_fetched++;
            report();
            return res;
        })
        .catch(e => {
            n_fetched++;
            report();
            return Promise.reject(e);
        });
};
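// `later` was not shown in the original comment; assumed here to run f after `ms` milliseconds.
const later = <T>(f: () => T | Promise<T>, ms: number): Promise<T> =>
    new Promise<T>(resolve => setTimeout(() => resolve(f()), ms));
const fetchRetry = (url, num, e?) => {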
    if (num >= 0) {
        return fetchPool.queue(() => fetchCounted(url))
        // return fetchCounted(url)
            .catch(e => {
                // console.error({retry: {url, num, e}});
                return later(() => fetchRetry(url, num - 1, e), 1000 / 49);
            });
    } else {
        return Promise.reject(e);
    }
};

function realBusiness (name, key) {
    // encodeURIComponent also escapes '&', '=' and '#', which encodeURI leaves intact
    let url = `https://maps.googleapis.com/maps/api/geocode/json?address=${encodeURIComponent(name)}&key=${key}`;
    { ... }
    return fetchRetry(url, 3);
}
brownbl1 commented 6 years ago

Is there an established workaround that works consistently? I'm running into this as well and am trying the retry approach, but I'm concerned that it won't scale well. I'm doing large file uploads, and only allowing two at a time. But once the upload progress hits 100%, it still takes a while to finish processing on the server, so I let the queue continue and upload the next video. Maybe the right way is a combination of retry and limiting maxSockets?

ianldgs commented 5 years ago

I know it's been a long time since this issue was created, but I faced a problem just like this when making lots of raw requests. It looks like a limitation in nodejs, or it could be some protection on the DNS server to avoid DDoS attacks. What I ended up doing was a DNS cache. First resolve the DNS of the domain you are requesting and cache it (there is probably a way to check the DNS record's TTL). Then make the request to the IP instead of the domain. Some websites won't allow direct IP access, so you have to make sure the Host header is set to the domain you want to request.
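
A rough sketch of such a cache, assuming Node's `dns.promises.resolve4` with `{ ttl: true }`, which returns records shaped like `{ address, ttl }`. The name `createDnsCache` and its injectable `resolve` and `now` parameters are made up for illustration:

```javascript
const dns = require('node:dns');

// Hypothetical helper: caches one A record per hostname until its TTL expires.
// `resolve` and `now` can be swapped out so the cache works without a network.
function createDnsCache(
  resolve = (hostname) => dns.promises.resolve4(hostname, { ttl: true }),
  now = Date.now
) {
  const cache = new Map(); // hostname -> { promise, expiresAt }

  return function lookup(hostname) {
    const hit = cache.get(hostname);
    if (hit && hit.expiresAt > now()) return hit.promise;

    // Store the pending promise so a burst of lookups shares one DNS query.
    const entry = { promise: null, expiresAt: Infinity };
    entry.promise = resolve(hostname).then(
      (records) => {
        const { address, ttl } = records[0];
        entry.expiresAt = now() + ttl * 1000; // ttl is reported in seconds
        return address;
      },
      (err) => {
        cache.delete(hostname); // do not cache failures
        throw err;
      }
    );
    cache.set(hostname, entry);
    return entry.promise;
  };
}
```

The cached address can then be used as the request target with the Host header set to the original domain. For HTTPS the TLS `servername` option would likely also need to be set to that domain so the certificate check still matches.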

JustinBeckwith commented 5 years ago

Greetings folks! I'd love to do something here, but with hundreds of API endpoints, it's hard to come up with a unified queuing strategy that's going to work for everyone. There's no magic concurrency that's going to be right for everyone, and changing the current behavior would be a breaking change for many.

Instead - I am going to recommend that y'all take a look at p-queue: https://github.com/sindresorhus/p-queue

We use this internally for a lot of our modules, and we've been really happy with it. Using this with googleapis should be pretty easy :)

Let us know what you think!
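
To illustrate the idea behind a concurrency-limited queue, here is a dependency-free sketch. It is not p-queue's implementation or API; `createLimiter` is a made-up name that stands in for what a queue with a concurrency setting does:

```javascript
// Hypothetical helper: runs at most `concurrency` tasks at a time.
// Each task is a function returning a promise (or a plain value).
function createLimiter(concurrency) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= concurrency || waiting.length === 0) return;
    active++;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        // Free the slot and start the next queued task, if any.
        active--;
        next();
      });
  };

  // Queue a task; the returned promise settles with the task's result.
  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
}

// Usage sketch: wrap each outgoing call so only 5 run at once.
// `fetchChannel` is a placeholder for any promise-returning API call.
// const limit = createLimiter(5);
// const results = await Promise.all(ids.map((id) => limit(() => fetchChannel(id))));
```

The right concurrency value depends on the API and the environment, so it is best kept configurable.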