kaskadi / asin-sync

a lambda to synchronize ASINs from Amazon with product database
MIT License

Request number optimization calculation #3

Open alexlemaire opened 4 years ago

alexlemaire commented 4 years ago

Writing down the request number optimization calculation to document the chosen approach

Information: the endpoint used for retrieving the ASIN for a given EAN can accept a list of up to 5 EANs.

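Data:

  - N: number of products in the database without an ASIN (the products to update)
  - X: number of marketplaces to query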


Case 1: using updateQuery

  1. 1 request to ES (updateQuery)
  2. N.X requests to MWS (for each match returned by updateQuery, we send its EAN in one request per marketplace)

=> total requests: N.X + 1


Case 2: using search combined with bulk on Elasticsearch

  1. 1 request to ES (search)
  2. ceil(N/5).X requests to MWS
  3. 1 request to ES (bulk)

=> total requests: ceil(N/5).X + 2
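
The batching behind the ceil(N/5).X term can be sketched as follows. This is only an illustration in Python, not the code of this repo: the helper names (`chunk`, `plan_mws_requests`), the EAN values and the marketplace labels are all hypothetical, and no real MWS call is made.

```python
from typing import Iterator, List, Sequence, Tuple

# Limit stated in this issue: the endpoint accepts up to 5 EANs per request.
MAX_EANS_PER_REQUEST = 5


def chunk(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def plan_mws_requests(
    eans: Sequence[str], marketplaces: Sequence[str]
) -> List[Tuple[str, List[str]]]:
    """Return one (marketplace, EAN batch) pair per MWS request to send."""
    return [
        (marketplace, batch)
        for marketplace in marketplaces
        for batch in chunk(eans, MAX_EANS_PER_REQUEST)
    ]


# Hypothetical input: N = 12 products, X = 5 marketplaces.
eans = [f"EAN{i:03d}" for i in range(12)]
marketplaces = ["DE", "FR", "IT", "ES", "UK"]
plan = plan_mws_requests(eans, marketplaces)
print(len(plan))  # ceil(12/5) * 5 = 15 requests
```

Each entry of `plan` stands for a single MWS request, so the search and bulk requests to ES are the only extra calls in case 2.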


Comparison:

Let's assume here that X=5 (the number of European marketplaces). If we compare case 1 and case 2 (vertical scale is logarithmic to help readability):

[two images: plots comparing the total number of requests in case 1 and case 2]


Conclusion: assuming that in case 2 we break out of the code when no product has to be updated, case 1 is favorable only if exactly 1 product needs to be updated in the database. In every other case, case 2 is favorable (fewer requests).
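
The two totals can be checked numerically with a short sketch. The function names are hypothetical, and the early exit for N = 0 in case 2 is the assumption stated in the conclusion above (only the search request is sent).

```python
from math import ceil


def requests_case_1(n: int, x: int) -> int:
    """Case 1: one updateQuery request to ES plus N.X requests to MWS."""
    return n * x + 1


def requests_case_2(n: int, x: int, batch_size: int = 5) -> int:
    """Case 2: ES search, batched MWS requests, then one ES bulk request."""
    if n == 0:
        # Assumed early exit: nothing to update, so only the search was sent.
        return 1
    return ceil(n / batch_size) * x + 2


X = 5  # number of European marketplaces, as assumed in the comparison
for n in (0, 1, 2, 5, 10, 100):
    print(n, requests_case_1(n, X), requests_case_2(n, X))
# 0 1 1
# 1 6 7
# 2 11 7
# 5 26 7
# 10 51 12
# 100 501 102
```

Case 1 only wins at N = 1 (6 requests against 7); from N = 2 onwards case 2 needs fewer requests, and the gap grows roughly by a factor of 5.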

alexlemaire commented 4 years ago

Note: previously I compared the cases based on the number of products without an ASIN, but that's not really correct... Judging from the responses I get from MWS, a given EAN can have multiple ASINs in some marketplaces, meaning we are creating multiple listings for the same product.

This makes the logic of synchronizing only products without ASIN data obsolete: we may create a new listing for a product that already has an ASIN, so the synchronization should apply to the whole DB.

Instead of doing a pull-synchronization like now, we should have some kind of hook that updates the ASINs in the DB whenever we create a new listing. The data pulling would only occur during the initial synchronization. For any new product we should also trigger a lambda that does the synchronization for that given product.

alexlemaire commented 4 years ago

Probably won't use this repo as an actual lambda!

Keeping the code here for now as I'm still working on it. But most likely this is gonna be a standalone script that does a brute-force synchronization of the database with Amazon to fetch the ASINs for each product in the database.

We should then have:

alexlemaire commented 4 years ago

After a bit of back and forth on this topic:

The initial idea of having a single lambda that does daily syncing would be the easiest approach. The cost for this solution is fine (see below). The only drawback is that, with the throttling of the API, we quite quickly reach the maximum timeout of lambdas... But that could be worked around by splitting the logic into 1 lambda per region.

Cost calculation:

Data:

Calculation:

  - Total compute = 30*900 = 27000 seconds
  - Total memory-compute = 27000 GB-seconds (we use 1024 MB, i.e. 1 GB, of allowed memory)
  - Note: free tier memory-compute = 400000 GB-seconds, so we may not even have to pay
  - Compute cost = 27000*0.0000166667 = 0.4500009 dollar/month (at 0.0000166667 dollar per GB-second, if we exceed the free tier with all lambdas)
  - Free tier requests = 1M/month
  - Requests = 30/month
  - Request cost = 30*0.0000002 = 0.000006 dollar/month (at 0.20 dollar per 1M requests, if we exceed the free tier with all lambdas)

Conclusion: the cost is negligible
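
For reference, the arithmetic can be reproduced with a small sketch. The invocation count, duration and memory size come from the calculation above; the two unit prices are assumptions (the commonly published Lambda rates of 0.0000166667 dollar per GB-second and 0.20 dollar per 1M requests) and should be checked against the current pricing of the region in use. The function name is hypothetical.

```python
from typing import Tuple

INVOCATIONS_PER_MONTH = 30  # one daily run
MAX_DURATION_S = 900        # worst case: every run hits the lambda timeout
MEMORY_GB = 1.0             # 1024 MB of allowed memory

# Assumed unit prices, not taken from this thread.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_REQUEST = 0.20 / 1_000_000


def monthly_cost(
    invocations: int,
    duration_s: float,
    memory_gb: float,
    price_per_gb_second: float = PRICE_PER_GB_SECOND,
    price_per_request: float = PRICE_PER_REQUEST,
) -> Tuple[float, float, float]:
    """Return (GB-seconds, compute cost, request cost), ignoring the free tier."""
    gb_seconds = invocations * duration_s * memory_gb
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = invocations * price_per_request
    return gb_seconds, compute_cost, request_cost


gb_seconds, compute_cost, request_cost = monthly_cost(
    INVOCATIONS_PER_MONTH, MAX_DURATION_S, MEMORY_GB
)
print(f"{gb_seconds:.0f} GB-seconds")
print(f"compute: {compute_cost:.7f} dollar/month")
print(f"requests: {request_cost:.7f} dollar/month")
```

Splitting the work into 1 lambda per region would multiply the invocation count by the number of regions, which can be tried by changing `INVOCATIONS_PER_MONTH`.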