Open alexlemaire opened 4 years ago
Note: previously I compared the cases for the number of products without `ASIN`, but that's not really correct... From the responses I get from MWS, it seems that we can have multiple `ASIN`s for a given `EAN` in some marketplaces, meaning we are making multiple listings for the same product.
This makes the logic of synchronizing only products without `ASIN` data outdated, because we may create a new listing of an existing product, so the synchronization should apply to the whole DB.
Instead of doing a pull-synchronization like now, we should have some kind of hooks to update `ASIN`s in DB when we do a new listing. The data pulling would only occur when doing the initial synchronization. For any new product we should also trigger a lambda that would do the synchronization for that given product.
Probably won't use this repo as an actual lambda!
Keeping the code here for now as I'm still working on it. But most likely this is gonna be a standalone script to do a brute synchronization of the database with Amazon to fetch the `ASIN`s for each product in the database.
We should then have:
- `ASIN` for a given `EAN` into our DB
- `ASIN` assigned to this `EAN`
After a bit of back and forth on this topic:
- `elasticsearch` is not that easy. There is no direct event related to `elasticsearch`. It should (maybe) work via AWS EventBridge, but after some trials I did not manage to trigger anything
- `ASIN` syncing process: MWS API won't return `EAN`s for the listed products. This means that for a new listing we have no way to map back to our products in DB (because `ASIN`s aren't in sync yet)

After reconsideration, the initial idea of having a unique lambda that does daily syncing would be the easiest approach. The cost for this solution is fine (see below). The only drawback is that, with the throttling of the API, we quite quickly reach the maximum timeout of lambdas... But that could be worked around by splitting the logic into 1 lambda per region.
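A minimal sketch of the "1 lambda per region" idea, assuming each invocation is told which region to handle and stops before the timeout. The region grouping, the `fetch_asins` callback and the safety margin are all hypothetical names chosen for illustration; inside a real Python Lambda, `remaining_ms` could be the context's `get_remaining_time_in_millis` method.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical grouping; the issue only says "1 lambda per region".
MARKETPLACES_BY_REGION: Dict[str, List[str]] = {
    "eu": ["DE", "FR", "IT", "ES", "UK"],
    "na": ["US", "CA", "MX"],
}

# Assumed margin: stop early instead of being killed at the Lambda timeout.
SAFETY_MARGIN_MS = 30_000


def sync_region(
    region: str,
    eans: Sequence[str],
    fetch_asins: Callable[[str, str], List[str]],
    remaining_ms: Callable[[], int],
) -> Tuple[Dict[Tuple[str, str], List[str]], List[str]]:
    """Sync one region and return (synced ASINs, EANs left unprocessed)."""
    pending = list(eans)
    synced: Dict[Tuple[str, str], List[str]] = {}
    # Check the time budget before starting each product.
    while pending and remaining_ms() > SAFETY_MARGIN_MS:
        ean = pending.pop(0)
        for marketplace in MARKETPLACES_BY_REGION[region]:
            synced[(ean, marketplace)] = fetch_asins(ean, marketplace)
    return synced, pending
```

Returning the unprocessed `EAN`s lets a later run (or the next daily run) pick up where this one stopped.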
Cost calculation:
Data:
Calculation:
- Total compute = 30 * 900 = 27000 seconds
- Total memory-compute = 27000 GB-seconds (we use the 1024 MB <=> 1 GB allowed memory)
- Note: free tier memory-compute = 400000 GB-seconds, so we may not even have to pay
- Compute cost = 27000 * 0.0000166667 = 0.4500009 dollar/month (if we exceed the free tier with all lambdas)
- Free tier requests = 1M/month
- Requests = 30/month
- Request cost = 30 * 0.0000002 = 0.000006 dollar/month (if we exceed the free tier with all lambdas)

Conclusion: the cost is negligible
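The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the two unit prices are assumptions (commonly published AWS Lambda list prices, which vary by region and over time), and the function name is made up for illustration.

```python
def lambda_monthly_cost(
    invocations: int,
    seconds_per_invocation: float,
    memory_gb: float,
    price_per_gb_second: float,
    price_per_request: float,
) -> dict:
    """Worst-case monthly cost, ignoring the free tier."""
    gb_seconds = invocations * seconds_per_invocation * memory_gb
    return {
        "gb_seconds": gb_seconds,
        "compute_cost": gb_seconds * price_per_gb_second,
        "request_cost": invocations * price_per_request,
    }


# 30 daily runs, each hitting the 900 s timeout, with 1 GB of memory.
# Prices below are assumed list prices, not taken from the issue.
cost = lambda_monthly_cost(
    invocations=30,
    seconds_per_invocation=900,
    memory_gb=1.0,
    price_per_gb_second=0.0000166667,
    price_per_request=0.20 / 1_000_000,
)
print(cost)
```

With 27000 GB-seconds against a 400000 GB-second free tier, the paid amount is likely to be zero in practice.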
Writing down the request number optimization calculation to document the chosen approach
Information: the endpoint used for retrieving the `ASIN` for a given `EAN` can accept a list of up to 5 `EAN`s.

Data:
- `N` products without `ASIN` in the database
- `X` marketplaces to send requests to on `MWS`
Case 1: using `updateQuery`
- 1 request to ElasticSearch (`updateQuery`)
- `N.X` requests to `MWS` (for each match via `updateQuery` we inject the `ean` in a request to each marketplace)

=> total requests: `N.X + 1`
Case 2: using `search` combined with `bulk` on ElasticSearch
- 1 request to ElasticSearch (`search`)
- `ceil(N/5).X` requests to `MWS`
- 1 request to ElasticSearch (`bulk`)

=> total requests: `ceil(N/5).X + 2`
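To illustrate where the `ceil(N/5).X` term comes from, here is a small sketch of the batching step. The helper names are hypothetical and no real `MWS` call is made; it only groups `EAN`s into lists of at most 5 and counts one request per batch per marketplace.

```python
from typing import Iterator, List, Sequence

MAX_EANS_PER_REQUEST = 5  # limit stated above for the endpoint


def batch_eans(
    eans: Sequence[str], size: int = MAX_EANS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` EANs."""
    for start in range(0, len(eans), size):
        yield list(eans[start:start + size])


def count_mws_requests(eans: Sequence[str], marketplaces: Sequence[str]) -> int:
    """One request per batch and per marketplace."""
    batches = sum(1 for _ in batch_eans(eans))
    return batches * len(marketplaces)
```

For example, 12 `EAN`s give 3 batches, so 5 marketplaces need 15 requests instead of 60.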
Comparison:
Let's assume here that `X=5` (the number of European marketplaces). If we compare case 1 and case 2 (vertical scale is logarithmic to help readability):

Conclusion: assuming we would break out of the code if no product has to be updated in case 2, case 1 is favorable if only 1 product needs to be updated in the database. For any other case, case 2 is favorable (fewer requests).
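The two totals can be checked numerically. This sketch encodes the formulas `N.X + 1` and `ceil(N/5).X + 2` directly; the early exit for `N = 0` in case 2 follows the assumption stated in the conclusion, and the function names are illustrative only.

```python
from math import ceil


def requests_case_1(n: int, x: int) -> int:
    """updateQuery: 1 ElasticSearch request plus N.X MWS requests."""
    return n * x + 1


def requests_case_2(n: int, x: int) -> int:
    """search + bulk: ceil(N/5).X MWS requests plus 2 ElasticSearch requests."""
    if n == 0:
        return 1  # only the search runs, then we break out
    return ceil(n / 5) * x + 2


if __name__ == "__main__":
    X = 5  # European marketplaces, as assumed above
    for n in (0, 1, 2, 5, 6, 100):
        print(n, requests_case_1(n, X), requests_case_2(n, X))
```

With `X=5`, case 1 is cheaper only at `N = 1` (6 requests versus 7); from `N = 2` onward case 2 needs fewer requests.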