Deepakchawla / Mobile-Phone-Dataset-GSMArena

Python script for creating Mobile Phones Dataset on GSMArena website.
MIT License
59 stars 46 forks

List Index out of range #9

Closed salmaaymansaad closed 3 years ago

salmaaymansaad commented 3 years ago

Hi All,

I am new to Python, so any support would be appreciated. When I first ran this code, I hit the 'pwd' issue because I am running on Windows, not Linux. I followed the solution proposed in another ticket and replaced 'pwd' with 'cd'.

Then I got another error, "list index out of range", as shown below:

in save_specification_to_file(self)
    119 # This function save the devices specification to csv file.
    120 def save_specification_to_file(self):
--> 121     phone_brand = self.crawl_phone_brands()
    122     self.create_folder()
    123     files_list = self.check_file_exists()

in crawl_phone_brands(self)
     43     phones_brands = []
     44     soup = self.crawl_html_page('makers.php3')
---> 45     table = soup.find_all('table')[0]
     46     table_a = table.find_all('a')
     47     for a in table_a:

IndexError: list index out of range

Is that because of too many requests? Any ideas? Can anyone help solve this?

salmaaymansaad commented 3 years ago

And now my IP is blocked from the GSMArena website. How can I solve that and get my IP unblocked again?

Sahil-Ajmera commented 3 years ago

I believe this happens because your machine sends more requests than GSMArena's rate limit allows. The code as written sends a large number of requests on every run: https://github.com/Deepakchawla/Mobile-Phone-Dataset-GSMArena/blob/master/gsmarena_scraping.py#L30

For the blocked IP, the comments on this question might help: https://stackoverflow.com/questions/30439092/http-error-429-restricted-python-web-scraping

About the 429 status code: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429

The original author should know more if there is a way to get around that.
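
In the meantime, here is a minimal sketch of two things that might make the failure easier to live with, assuming the block really does show up as HTTP 429. It is not the script's actual code: `fetch_with_backoff`, `fetch_via_requests` and `first_table` are hypothetical helper names, and the delay values are arbitrary guesses, not GSMArena's real limits. The first helper waits and retries when the server answers 429, and the second replaces the bare `[0]` index with a readable error when the page has no table (which is what seems to happen once you are blocked).

```python
import time
from typing import Callable, Optional, Tuple

# (status code, response body, Retry-After header value or None)
Response = Tuple[int, str, Optional[str]]


def fetch_with_backoff(
    fetch: Callable[[str], Response],
    url: str,
    max_attempts: int = 5,       # assumption: arbitrary retry budget
    base_delay: float = 2.0,     # assumption: arbitrary starting delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url), waiting and retrying while the server answers 429."""
    for attempt in range(max_attempts):
        status, body, retry_after = fetch(url)
        if status != 429:
            return body
        # Use Retry-After when it is a plain number of seconds;
        # otherwise fall back to exponential backoff (2s, 4s, 8s, ...).
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * (2 ** attempt)
        sleep(delay)
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts: {url}")


def fetch_via_requests(url: str) -> Response:
    """Adapter for the third-party requests library (imported lazily)."""
    import requests

    resp = requests.get(url, timeout=30)
    return resp.status_code, resp.text, resp.headers.get("Retry-After")


def first_table(soup):
    """Return the first <table>, or fail with a clear message instead of IndexError."""
    tables = soup.find_all("table")
    if not tables:
        raise RuntimeError(
            "No <table> found; the response may be a block or error page"
        )
    return tables[0]
```

With something like this, the line from the traceback could become `table = first_table(soup)`, and the page download could go through `fetch_with_backoff(fetch_via_requests, url)`. Note that this only slows the scraper down; it will not lift a block that is already in place on your IP.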