NikolaiT / GoogleScraper

A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous networking support.
https://scrapeulous.com/
Apache License 2.0
2.6k stars 734 forks

Issue with proxy file #208

Open lorenzoromani1983 opened 6 years ago

lorenzoromani1983 commented 6 years ago

Hi. I am not able to run the proxy file. I have it formatted this way:

Socks4 182.48.90.81:1080
Socks4 36.37.225.50:33012

It is a simple .txt file with many rows. I get this error message:

Invalid proxy file. Should have the following format: {}'.format(parse_proxy_file.__doc__))
Exception: Invalid proxy file. Should have the following format: Parses a proxy file

please, help :)
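For what it's worth, the error above comes from GoogleScraper's proxy-file parser rejecting a line it cannot read, and the file pasted here appears to have two proxies flattened onto one line. A minimal sketch of a line checker for the "one proxy per line, `protocol host:port`" convention (the exact protocol names and the optional `user:pass` field are assumptions here, not taken from GoogleScraper's actual parser):

```python
import re

# One proxy per line: "<protocol> <host>:<port>" with optional "user:pass".
# Accepted protocol names are an assumption for illustration.
PROXY_LINE = re.compile(
    r'^(?P<proto>socks4|socks5|http)\s+'
    r'(?P<host>[\w.\-]+):(?P<port>\d{1,5})'
    r'(?:\s+(?P<creds>\S+:\S+))?$',
    re.IGNORECASE,
)

def check_proxy_lines(text):
    """Split a proxy file's text into (valid, invalid) lines."""
    valid, invalid = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        (valid if PROXY_LINE.match(line) else invalid).append(line)
    return valid, invalid
```

With a check like this, `Socks4 182.48.90.81:1080 Socks4 36.37.225.50:33012` on a single line is flagged as invalid, while the same two proxies on separate lines both pass.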

fassn commented 6 years ago

This project hasn't been updated in more than a year. Prefer this project, developed from this one: https://github.com/fassn/SerpScrap (it is just a fork; I didn't start the project)

lorenzoromani1983 commented 6 years ago

Thanks, but that, as far as I understand, needs to be used within code. This is a "stand-alone" tool, which is easier for me (not a coder, not an expert at all). Did you manage to make the proxies work in HTTP mode? I need to get a JSON/CSV of 700/800 keywords from Google.

fassn commented 6 years ago

Yes, I use proxies on the SerpScrap project without problems.

lorenzoromani1983 commented 6 years ago

Thanks, it looks neat. However, I ran into two main problems:

1) I can't save to CSV: how should I proceed? This is the error I get:

Traceback (most recent call last):
  File "C:\Users\Lorenzo\Anaconda\lib\site-packages\serpscrap\csv_writer.py", line 10, in write
    with open(file_name, 'w', encoding='utf-8', newline='') as f:
PermissionError: [Errno 13] Permission denied: 'c:/.csv'
None

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Lorenzo\Desktop\scraper.txt", line 17, in <module>
    results = scrap.as_csv('c:/')
  File "C:\Users\Lorenzo\Anaconda\lib\site-packages\serpscrap\serpscrap.py", line 134, in as_csv
    writer.write(file_path + '.csv', self.results)
  File "C:\Users\Lorenzo\Anaconda\lib\site-packages\serpscrap\csv_writer.py", line 17, in write
    raise Exception
Exception

2) When I try to scrape many keywords with proxies, I get this error, and end up banned by Google:

2018-04-28 14:00:13,048 - scrapcore.scraper.selenium - WARNING - 'NoneType' object has no attribute 'group'

I am using proxies from free online proxy lists. Maybe they are already blacklisted by Google?
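A note on problem 1) above: the traceback shows that `as_csv('c:/')` appends `'.csv'` to whatever prefix it receives, so the library tried to create `c:/.csv` at the drive root, where normal users cannot write. A minimal sketch of the same convention with a writable prefix (`write_results_csv` is a hypothetical stand-in for illustration, not SerpScrap's actual `csv_writer`):

```python
import csv
import os
import tempfile

def write_results_csv(path_prefix, results):
    """Mimic the as_csv('prefix') convention from the traceback:
    '.csv' is appended to whatever prefix is passed in."""
    file_name = path_prefix + '.csv'
    with open(file_name, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=sorted(results[0]))
        writer.writeheader()
        writer.writerows(results)
    return file_name

# A bare drive root like 'c:/' becomes 'c:/.csv', which the OS refuses
# to create; a full prefix inside a writable folder works.
out = write_results_csv(
    os.path.join(tempfile.gettempdir(), 'serp_results'),
    [{'keyword': 'example', 'url': 'https://example.com'}],
)
```

So passing something like `c:/Users/Lorenzo/Desktop/results` instead of `c:/` should avoid the `PermissionError`.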

ecoron commented 6 years ago

hi,