ablestar / ablestar-cli

A command tool for importing and exporting data from your Shopify store
GNU Affero General Public License v3.0
44 stars 1 forks source link

Export not sleeping on hitting rate limits #6

Open IceEyz opened 1 year ago

IceEyz commented 1 year ago

Thanks for the opportunity, nice of you to develop and share this.

Running a product export with metafields causes the rate limits to be hit. The sleep function should delay further requests, but that's not happening at all, the hammering actually continues, albeit for different productids or variantids. Somehow the current sleep function is not actually awaited to complete before attempting any other REST API call.

Environment is docker node:alpine, same on node:18-alpine.

I can debug further, just not today :).

d-beck commented 1 year ago

Thanks @IceEyz for noticing this, we're taking a look and we'll keep you posted

IceEyz commented 1 year ago

So I took another look, and the current implementation is not blocking the event loop, so other API queries will still get performed. To me it seems you'd need to throttle/delay all REST API requests to all endpoints together, before they get sent out. I tried blocking the event loop by introducing delays in the axios response interceptor, and that helps, but blocking is ugly.

I think you'd need to architect a function that collects all required REST API requests in a single queue, and performs them sequentially, and in a throttled fashion.

tewshi commented 1 year ago

Thank you for your comment @IceEyz . If we use Promise.mapSeries of bluebird, we can handle the REST API requests sequentially. But it will not be so good on performance.

IceEyz commented 1 year ago

TBH, I'm new here in the Shopify API scene, but given the imposed rate limits, what's there to gain by parallelizing requests? You can't go faster than the rate limit right?

d-beck commented 1 year ago

You're right, you can't go faster than the rate limits but in some cases you'd need to parallelize the requests in order to make full use of that limit.

Some API calls can take longer than 500ms to complete, so if you run them sequentially you'll be doing less than the 2 requests/sec rate limit for standard Shopify store.

A store's rate limit can also be higher if it's on a more expensive plan. Right now the standard limit for the REST API is 2 requests/sec but stores on the Shopify Advanced plan have 4 requests/sec and on Shopify Plus it's 20 requests/sec

d-beck commented 1 year ago

Hi @IceEyz, to help prioritize this, is hitting the rate limit causing the export to fail for you or are you still getting the export you need, albeit with a lot of repeated requests?

IceEyz commented 1 year ago

Since all required requests will keep on retrying simultaneously, it'll produce those errors for about N-2 outstanding requests every second (also logging the API creds in clear). I'm not sure what Shopify thinks of that hammering; I'd hate to get my IP blocked.

In any case, I won't be using this api client in the next few weeks due to summer time, and my fixes to stall the event loop reduce the amount of 429's significantly.

So I think it would make sense to improve the functionality by using a single REST API requests queue with a dispatch rate that matches the Shopify rate limit. Not just for me, but for anyone using the client.

IceEyz commented 1 year ago

Actually, I think you'd want to match the leaky bucket on the API with a clientside leaky bucket for requests, like for example these packages can provide;

https://www.npmjs.com/package/@spacejunk/ftl https://www.npmjs.com/package/leaky-bucket https://www.npmjs.com/package/porro

d-beck commented 1 year ago

Makes sense, thanks for the additional info.

Right now Shopify won't block IPs that hit the rate limit too often, they'll just keep getting HTTP 429 replies, but I see how it'd be more efficient to manage the requests centrally in ablestar-cli.

One edge case re:implementing a client-side leaky bucket, there's at least one API call (creating orders on a development store) that has a separate, slower, rate limit. This tool don't make those API calls now but generally we won't be able to assume we'll never receive an HTTP 429 reply if we implement a client-side leaky bucket.

Either way, we can take a look later on, enjoy your summer!