Closed hahnicity closed 9 years ago
I've got Travis setup on Pyrr. https://github.com/adamlwgriffiths/Pyrr/blob/master/.travis.yml Can use this config as a base.
What does MaxQPS break? It should prevent breakages due to Amazon throttling. You will end up with HTTP requests returning errors if you exceed it.
I found that if you use the Amazon API to get data, Amazon also apply limits to HTTP requests. I believe they monitor IPs. If you don't use the API, you can hammer their HTTP pages for a loooong time before you get any throttling.
It says MaxQPS cannot be handed down to the AmazonAPI object
153 class AmazonScraper(object):
154 def __init__(self, access_key, secret_key, associate_tag, *args, **kwargs):
155 self.api = AmazonAPI(access_key, secret_key, associate_tag, *args, **kwargs)
Maybe I'm using an incorrect version of the amazon api pkg?
python-amazon-simple-product-api==1.5.0
Also I tracked down the errors to a problem with specific products on the web. When the product comes back from amazon it says they have no ratings. This is fine, I assume some products will not have ratings in general. But then when we are doing product.to_dict()
we are trying to coerce product.ratings
into the dict accidentally because we have modified the __getattr__
method to call self.product
instead of self
. Do we want this functionality. I have a few ways of working around it, none are really pretty. The real fix removing the special __getattr__
handling, but does that break anything in a terrible way?
12 class Product(object):
13 def __init__(self, product):
14 self.product = product
15 self._soup = None
16
17 def __getattr__(self, name):
18 # allow direct access to the product object
19 return getattr(self.product, name)
I'm also having some difficulty with an asin validation for another amazon review. This is the review: http://www.amazon.com/review/R1P0CKXI82A1LT
And this is the main product review page: http://www.amazon.com/product-reviews/B0051QVF7A/ref=cm_cr_pr_top_sort_recent?&sortBy=bySubmissionDateDescending
The asins are mixed up for review and reviews. We're able to get the correct asin for the reviews of
B0051QVF7A
But the single review seems to differ by saying
B007TKK5SG
I think the single review was migrated from the other product into a new products page. Now that we know that amazon does this I'm going to just remove the single review asin test because its not reliable.
Yeah I've come across this. Where reviews are for related products (hard cover, paper back, ebook, etc), Amazon will provide the same reviews. The review ASIN can point to a different parent product media format than what you originally searched for. It's a pain in the ass.
The different formats also do this too (product formats being paper back, hard cover, kindle, etc), quite often one format will have a different list of alternatives than others. It's quite difficult to deal with all the peculiarities.
| It says MaxQPS cannot be handed down to the AmazonAPI object Does this cause an error?
The product api and bottlenose both have a MaxQPS property. https://github.com/yoavaviram/python-amazon-simple-product-api/blob/master/amazon/api.py https://github.com/lionheart/bottlenose/blob/master/bottlenose/api.py
The __getattr__
is to provide seamless wrapping of the product api lib.
Any property / method that doesn't exist in the wrapped class, will be redirected to the product api class.
To avoid getting an existing property from the product api class, add the name to the blacklist in to_dict.
def to_dict(self):
<snip>
# add the python properties
d.update({
k:getattr(self.product, k)
for k in dir(self.product)
if dict_acceptable(self.product, k, blacklist=['browse_nodes', 'api']) # put 'ratings' here
})
Otherwise, each access of the above could be wrapped in a function with a try/except block. This would ensure that no strange products break the class. I think doing it this way is better, we can then leave the blacklist for backend functions only, as it is now.
OK so I used your suggestion with the ratings blacklist and things are passing.
As for MaxQPS python-amazon-simple-product-api is 1.5.0 on Pypi but it is 2.0 on github :-1: The MaxQPS kwarg is not allowed on 1.5.0 but is on 2.0. So you'd have to build the pkg from github to actually get the MaxQPS setting. https://github.com/yoavaviram/python-amazon-simple-product-api/issues/55
I can probably add the github dependency in a requirements.txt file but im not sure how nice that plays with setuptools. I'll do some digging and get some fixes in.
I guess I was using a pip install from git and forgot.
You can provide pip with enough information to handle clashes and versioning.
pip install git+https://github.com/username/repo.git@MyTag#egg=ProjectName
As for the actual setup.py ....not so sure on that one =/
Maybe just comment out the MaxQPS parameters in the tests for now until 2.0 is released.
Do whatever you feel is best / works out of the box.
Edit: Just realised you had already done the requirements.txt. So please ignore that.
I'm going to try to take on fixing up the tests and getting them to work properly. Right now there are basic issues just getting the tests to run and we need to turn off the MaxQPS property in order to get them to go. Also there are three test failures and several errors that can be fixed. Hell maybe I'll even figure out how to throw travis on here; but that can be a separate ticket.