ericblade / mws-advanced

Javascript (node) Amazon Merchant Web Services (MWS) code with mws-simple as its underpinnings
Apache License 2.0

Errors after updating MWS Advanced #111

Closed pwrnrd closed 4 years ago

pwrnrd commented 5 years ago

After updating MWS Advanced and MWS Simple, I ran into errors:

Error 1

TypeError: Cannot read property 'Order' of undefined at Order (./list-orders.js:40:52)

Error 2

Error: Cannot read property 'ReportRequestId' of undefined

My code requests a report using an MWSAdvanced class instance.

import { MWSAdvanced } from "mws-advanced";
const mws = new MWSAdvanced(mwsKeys);
const requestedReport = await mws.requestReport(reportOptions);
console.log(requestedReport.ReportRequestId); // <-- this causes the error

I updated MWS Advanced from the issue #101 branch, and I think I updated MWS Simple from version 3.0.0.

Is there something I missed when updating? E.g. name changes?

ericblade commented 5 years ago

You should have mws-simple 4. It returns (err, { results, headers }) instead of (err, results, headers), and it also returns a Promise if no callback was given.
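
For illustration, a sketch of the difference (client and requestOptions are placeholder names, and the assumption that the Promise resolves with the same { results, headers } shape is mine):

// mws-simple 3.x callback shape:
client.request(requestOptions, (err, results, headers) => { /* ... */ });

// mws-simple 4.x callback shape:
client.request(requestOptions, (err, { results, headers }) => { /* ... */ });

// mws-simple 4.x with no callback, assuming the Promise resolves to { results, headers }:
const { results, headers } = await client.request(requestOptions);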

Since you're using requestReport(), which goes through callEndpoint and friends, I wouldn't expect to see an error there, unless mws-simple didn't update?

ericblade commented 5 years ago

You've got me worried that I messed something up, but I'm not seeing anything after reviewing. I've been working on report viewing functionality, so I'm really sure that requestReport isn't broken.

pwrnrd commented 5 years ago

Okay! Thank you for your reply! I rolled back the update that I did. I'll retry updating the packages.

pwrnrd commented 5 years ago

I think I made a mistake when updating the packages. After rolling back the update and re-updating, the package seems to work, although I keep running into throttling issues when requesting/downloading reports. Therefore, I decided to rewrite my implementation and reduce the number of MWS instances that I create.

ericblade commented 5 years ago

... but you weren't running into throttling problems before?

pwrnrd commented 5 years ago

No, I wasn't, but I had implemented my own version of requestAndDownloadReport() where I scaled the timeouts by the number of instances I expected. That wasn't a neat way of dealing with throttling.
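
Roughly along these lines (a sketch; the names and numbers are illustrative, not the actual implementation):

// stretch the wait between polls by the number of MWS instances expected to share the quota
const INSTANCE_COUNT = 4; // illustrative
const BASE_WAIT_MS = 45 * 1000; // the 45-second requestAndDownloadReport timeout
const scaledWaitMs = BASE_WAIT_MS * INSTANCE_COUNT;
await new Promise((resolve) => setTimeout(resolve, scaledWaitMs));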

ericblade commented 5 years ago

I'm sorry I broke your code. I'm hoping to get my app out this year (earlier rather than later, if possible :) ), which will probably also entail moving this library to proper semantic versioning... I just didn't want it to end up on something like version 12.2.8 while it went through a lot of interface changes. :-D Probably not a good excuse, but when people see high version numbers, they tend to expect things that are highly stable, and that's definitely not where we've been the past year.

So, overall, is this new throttle handling an improvement on your end, or is it more of a problem to overcome?

I'm mostly working in the Products API, with recent work in Reports just as a proof of concept sort of thing, so your input in areas that I'm not using is definitely appreciated.

pwrnrd commented 5 years ago

So, overall, is this new throttle handling an improvement on your end, or is it more of a problem to overcome?

It took me a lot of time to write my original implementation, so I still need some time to refactor my code base to implement this change (write new tests, etc.). Yesterday I tested the first part that I refactored, and it seems to work very well. In theory, queueing is far more efficient, so it should speed up my ETL processes significantly.

Don't worry about the change. That's a problem we all have when we open up an API to the outside world: people will always use it in ways we didn't predict. So the only thing we can do is write APIs such that changes affect end users as little as possible, and I think this major refactor allows the package to become more stable (i.e. future updates should affect the end user less than past updates did). So overall it's a good change! 😃

Btw, if you're looking at reporting, something that might be very nice to implement is this: https://docs.developer.amazonservices.com/en_US/notifications/Notifications_ReportProcessingFinishedNotification.html . Instead of polling the MWS API every so often to see whether reports are done, we could receive a push notification that tells us when the report is finished. This might work really well with requestAndDownloadReport(). I'm not familiar with this part of the API, though.
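
For reference, a sketch of what that flow could look like, following the parameter names in the MWS Subscriptions API docs (whether mws-advanced's callEndpoint can pass these flattened parameters through to the Subscriptions section is an unverified assumption, and the marketplace ID and queue URL are placeholders):

// 1. register an SQS queue as a notification destination
await mws.callEndpoint('RegisterDestination', {
    MarketplaceId: 'ATVPDKIKX0DER', // placeholder
    'Destination.DeliveryChannel': 'SQS',
    'Destination.AttributeList.member.1.Key': 'sqsQueueUrl',
    'Destination.AttributeList.member.1.Value': 'https://sqs.us-east-1.amazonaws.com/123456789012/mws-notifications', // placeholder
});

// 2. subscribe to ReportProcessingFinished notifications on that destination
await mws.callEndpoint('CreateSubscription', {
    MarketplaceId: 'ATVPDKIKX0DER',
    'Subscription.NotificationType': 'ReportProcessingFinished',
    'Subscription.IsEnabled': 'true',
    'Subscription.Destination.DeliveryChannel': 'SQS',
    'Subscription.Destination.AttributeList.member.1.Key': 'sqsQueueUrl',
    'Subscription.Destination.AttributeList.member.1.Value': 'https://sqs.us-east-1.amazonaws.com/123456789012/mws-notifications',
});

// 3. poll the SQS queue instead of polling GetReportRequestList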

pwrnrd commented 5 years ago

Ok. I rewrote 90% of my implementation. It is able to fetch some reports, but for approximately 60% of the requested reports MWS Simple returns the following error:

Error: Service Unavailable
  at Request.postRequest [as _callback] (/makeRequest.js:31:14)

Within my implementation I set up an MWS-Advanced object once. Subsequently, I create requests and fire them per type of report, e.g. the listings reports are all fired at once using the MWS-Advanced instance.

Could this be related to #109 or is it likely that the culprit is somewhere else?

pwrnrd commented 5 years ago

I've been trying to debug this error. Even if I divide the restore times by 5, or multiply the timeout in requestAndDownloadReport() by 5 (from 45 seconds to 225 seconds), I still get the same "Service Unavailable" error.

ericblade commented 5 years ago

I really haven't done any hammering on requestAndDownloadReport, and haven't really looked at it in quite a while... I do use it, but just on a single item at a time. It doesn't sound like #109; that's for things like the Products API, where Get and GetByNextToken share the same call quota. RequestReport and GetReport shouldn't be like that.

Not to discount that there could be a problem here that I'm not hitting, but I did notice while implementing the request quota handling recently that the documented quotas don't seem to be particularly accurate: on a couple of calls I was testing, I was able to go over the documented quota, or the documented restore time didn't seem to be exactly correct.

Any chance you could share what's failing? I see GitHub just started allowing free users to create private repos, if you don't want to share publicly. I'd like to dig into a use case that isn't mine and see what's breaking.

pwrnrd commented 5 years ago

What I think was failing is that I kept hitting the maximum quotas. If you request reports one at a time, those requests don't take into account previous requests sent to the same endpoint. So I think I'm hitting the max quotas. I think the following encompasses what I'm doing:

const reportRequests = createRequests();
reportRequests.forEach(async (reportRequest) => {
    const { ReportType, MarketplaceIdList } = reportRequest;
    // note: forEach does not await the async callback, so every request fires at once
    const report = await MWS.requestAndDownloadReport(ReportType, null, reportRequest);
    const result = processReport(report, MarketplaceIdList);
    return result;
});
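
(For contrast, a sketch of a serialized version that keeps only one report request in flight at a time:)

const reportRequests = createRequests();
for (const reportRequest of reportRequests) {
    const { ReportType, MarketplaceIdList } = reportRequest;
    // awaiting inside for...of serializes the requests, unlike forEach(async ...)
    const report = await MWS.requestAndDownloadReport(ReportType, null, reportRequest);
    processReport(report, MarketplaceIdList);
}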
ericblade commented 5 years ago

Hey @pwrnrd, are you still seeing things going wrong here? It has been literally a couple of months since I've seen a throttle request in the logs for my app, but my app is only used by 6 people, and we're not putting in tons of requests outside of the product lookup APIs.

pwrnrd commented 5 years ago

Throttling doesn't work because the queues are deleted when the requests have been performed. However, queues should not be deleted once they are empty; queues should be deleted when their quotas are fully restored.

ericblade commented 5 years ago

Is that specifically on report generation, or is it something that you're seeing on other calls as well? One of my users has an absolutely massive number of reports, which makes things rather difficult to work with due to the length of time it takes to generate a report list even for a single day...

pwrnrd commented 5 years ago

Well, you hit this issue especially with reports, because we need to poll Amazon for reports. But this is an issue with every queue. The queues are a good step in the right direction, but they must only be deleted once the rate limits are restored to their default values.

If we delete the queue only when the rate limits are back to their default values, requesting reports should be very easy! Imagine your client wants to request 10 reports at the same time. Currently, he will hit request limits once he starts polling MWS, because on every poll his rate limit decreases, but we delete the queue after the request has been made. Hence, when your client makes his next poll after x seconds, that new poll request thinks the rate limits are restored even though they aren't. The request is made, and you receive a rate limit error. If, on the other hand, the next request knew that the rate limits were not fully restored (by making use of the previous queue), you could safely make those calls. You could simply request 10 / 20 / ... / x reports simultaneously.
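
A sketch of that lifetime rule (the names are illustrative, not mws-advanced internals):

// one queue per rate-limited endpoint
class EndpointQueue {
    constructor(maxQuota, restorePerSecond) {
        this.available = maxQuota;
        this.maxQuota = maxQuota;
        this.restorePerSecond = restorePerSecond;
    }
    take() {
        if (this.available <= 0) throw new Error('request would be throttled');
        this.available -= 1;
        // one slot restores after the documented restore period
        setTimeout(() => { this.available += 1; }, 1000 / this.restorePerSecond);
    }
    // the queue may only be discarded once fully restored, even if it is empty
    get canBeDeleted() {
        return this.available === this.maxQuota;
    }
}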

ericblade commented 5 years ago

There are many facets to the problem of quotas, and that has made trying to build a rate limiter that is truly reliable somewhat difficult... though I still really haven't put in anything that uses the header quota information.

First, there are two different classifications of quota: the max request quota and the hourly quota. The hourly quota is only used for Products/Reports/Feeds, and is also the only one that returns headers. However, if you hit that quota, you get no header.

Secondly, and this is a really annoying part: because there are no headers returned outside of Products/Reports/Feeds, and no header at all if you're already over quota, it's not possible for an application to find out where it is in the quota if it has just started making requests... or if there's another application that might be making requests against the same quota (such as multiple instances of my app, or another user who has multiple applications that all hit the same request quotas).

So simply having an in-memory queue object still can't really solve the problem 100%; we can only potentially solve it 100% of the time if we are the only client of the API and we started with all the rate-limited requests at zero.

Anyway, what I have tried to handle so far is the max-requests-in-flight quotas, so that, in theory, if there weren't an hourly quota in the listed categories, it should be pretty smooth, again as long as you're the only client hitting that API with those user credentials. If I've got a user who has given permission to some accounting package on the 'net that pulls a mass amount of reports every couple of hours, I don't really have any way of knowing in advance whether I'm going to blow that quota in my application, until I start receiving the bad request messages.

Probably what we could do is monitor both x-mws-quota-remaining and x-mws-quota-resetsOn: if x-mws-quota-remaining reaches 0, halt that queue from running entirely; when the time reaches x-mws-quota-resetsOn, resume the queue. That should prevent it from continuing to attempt to break quota when it's just not ever going to work.
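
A sketch of that (queue.pause() and queue.resume() are hypothetical, not current mws-advanced API; Node lowercases incoming header names):

function handleQuotaHeaders(headers, queue) {
    const remaining = parseInt(headers['x-mws-quota-remaining'], 10);
    const resetsOn = new Date(headers['x-mws-quota-resetson']);
    if (remaining === 0) {
        queue.pause(); // hypothetical: stop dispatching requests
        // resume once the hourly quota window resets
        setTimeout(() => queue.resume(), Math.max(0, resetsOn.getTime() - Date.now()));
    }
}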

This does present another question, too: say someone breaks their hourly quota in 20 minutes. How do we handle notifying the consumer that "this request can't possibly go through for another 40 minutes", and should we offer a way to cancel that request?

pwrnrd commented 5 years ago

Secondly, and this is a really annoying part, is that because there's no headers returned outside of Products/Reports/Feeds, and there's no header at all if you're already over quota, it's not possible for an application to find out where it is in the quota, if it's just started making requests

If I understand you correctly, we do not get headers under two circumstances:

  1. You're requesting an API endpoint with an hourly quota and you exceeded your quota.
  2. You're not requesting an API endpoint with an hourly quota (these endpoints never return headers)

Is this correct?

... or another user who has multiple applications that they use, all hitting the same request quotas)

I thought that the quotas are based on your developer API key pair combined with the seller ID and auth token. Hence, when multiple developers with their own keys make requests on behalf of the same seller, their quotas should not interfere with one another.

So, simply having an in memory queue object still can't really solve the problem 100%, so we can only even potentially solve 100% of the time if we are the only client of the api, and we started with all the rate-limited requests at zero.

You're right: if you're creating multiple instances of the MWS Advanced object, then a local queue can't help you. Hence, setting up multiple instances of MWS Advanced is an antipattern. We could use something like the singleton pattern, as sketched below. Of course, you'd still run into trouble if you're using microservices and different services use MWS Advanced. However, I think that if you want to request the same endpoint from different services, your microservice design is probably incorrect, because each microservice should have its own responsibility (much along the lines of DDD). Therefore, in my view, having the queue manage the quotas would make sense for almost all users (though of course there might be some edge cases we would not be able to cover). What do you think?
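
A minimal sketch of that singleton, relying on Node's module cache so every importer shares one instance (mws-instance.js and mwsKeys are placeholder names):

// mws-instance.js
import { MWSAdvanced } from "mws-advanced";

let instance;
export function getMWS(mwsKeys) {
    if (!instance) {
        instance = new MWSAdvanced(mwsKeys); // created once, reused by every caller
    }
    return instance;
}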

Anyway, what I have tried to handle so far, is the max number of requests in flight quotas, so that, in theory, if there wasn't an hourly quota in the listed categories, it should be pretty smooth, again, as long as you're the only client hitting that API with those user credentials. If I've got a user that has given permission to some accounting package on the 'net that pulls a mass amount of reports every couple of hours, I don't really have any way of knowing in advance if I'm going to blow that quota in my application, until I start receiving the bad request messages.

Probably what we could do, would be to monitor both x-mws-quota-remaining and x-mws-quota-resetsOn -- if x-mws-quota-remaining reaches 0, just absolutely halt that queue from running. when time reachers x-mws-quota-resetsOn, then resume that queue. That should prevent it from continuing to attempt to break quota when it's just not ever going to work.

Do we always receive these datapoints from MWS? If so, that could work!

This does present another question, too -- say someone breaks their hourly quota in 20 minutes -- how do we handle notifying the consumer "this request can't possibly go through for another 40 minutes", and should we offer a way to cancel that request?

Definitely. By default we time out, and then we can implement something like this:

async _handleMWSRequestOrTimeout(handleMWSRequest: () => Promise<void>, timeout: number): Promise<void> {
    let timer!: NodeJS.Timeout;
    // rejects if the MWS request is still pending after `timeout` milliseconds
    const maxPendingTimeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error("The MWS request was not completed, because it was pending for too long")), timeout);
    });

    // whichever settles first wins; always clear the timer afterwards
    return Promise.race([handleMWSRequest(), maxPendingTimeout]).finally(() => clearTimeout(timer));
}

I passed timeout as a parameter here, but we could also read it from the options passed to MWS.

ericblade commented 5 years ago

If I understand you correctly, we do not get headers under two circumstances:

  1. You're requesting an API endpoint with an hourly quota and you exceeded your quota.
  2. You're not requesting an API endpoint with an hourly quota (these endpoints never return headers).

Is this correct?

Seems correct. You get quota headers when you get a successful response from an API that has an hourly quota. Otherwise, as far as I'm aware, no quota headers are received, whether you get a Throttled response or you're requesting something that has no hourly quota.

I thought that the quotas are based on your developer API key pair combined with the seller ID and auth token. Hence, when multiple developers with their own keys are making request on behalf of the same seller, their quotas should not interfere with one another.

... that's actually something that I'm not positive of, but I hope you're right. I was thinking it was specific to the seller ID, not to the dev key and seller ID. So, if that's the case, that eliminates a large part of the concern.

antipattern stuff

You're right, but it's also something that at least needs to be considered if you're using something that isn't a long-running service. Say you implement something that uses mws-advanced, and it fires off 100 requests, gets the responses, then shuts down. You do the same operation again, immediately after, and maybe now you're over quota, despite only sending the first few requests.

Even with something as relatively simple as a script to "download all reports from day X", if you run it a few times in a row you're going to run into problems, so we can't just handle it with "read the headers". Not that you're advocating that; I'm just trying to explain the details as I'm aware of them. Someone's going to use it in a way that gets them into a throttle situation where we have no information about how to continue, other than "wait the minimum (or maximum) amount of time, and try again" (or just fail).

Do we always receive these datapoints from MWS? If so, that could work! [re: x-mws-quota-remaining and resetsOn]

For the hourly quotas, yes. x-mws-quota-remaining should be given as a header holding an integer, the number of requests remaining for the hour, and x-mws-quota-resetsOn should be an ISO date string that indicates the time at which the hour is up. What I'm not entirely certain of, because I haven't really examined these headers yet, is whether that hour "slides", or whether it's anchored to the first recent use of a call. Like, if I send 1 request now, and then 45 minutes later I send 50 requests, will it count 51 requests in an hour that started back when I sent the first one, or will it count starting from when I started sending the large batch? I'm not sure if that matters, really, but I think about it.
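
One way to find out would be to log the header values over time and watch whether resetsOn moves (a sketch; recordQuotaHeaders is a hypothetical helper, not part of the library):

function recordQuotaHeaders(headers) {
    console.log(JSON.stringify({
        at: new Date().toISOString(),
        remaining: headers['x-mws-quota-remaining'],
        resetsOn: headers['x-mws-quota-resetson'],
    }));
    // if resetsOn advances with each request, the window slides;
    // if it stays fixed until it passes, the hour is anchored to the first request
}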

I remember that at some point I started doing the plumbing to send the headers somewhere the queues could read them, but I don't recall completing that. :-S