ecoron / SerpScrap

SEO Python scraper to extract data from major search engine result pages. Extract data like URL, title, snippet, rich snippet and the type from search results for given keywords. Detect ads or make automated screenshots. You can also fetch the text content of URLs provided in search results or supplied by you. It's useful for SEO and business-related research tasks.
https://github.com/ecoron/SerpScrap
MIT License
257 stars, 61 forks

Issue with SerpScrap package on Linux box #40

Closed: majila closed this issue 6 years ago

majila commented 6 years ago

Hi SerpScrap Team, I have installed the SerpScrap package on my Linux system and it throws the error below, while it works fine on Windows.

Please let me know if I am missing a dependency.

  File "/home/ec2-user/Project_name/data_spider.py", line 25, in getData
    scrap.init(config=config.get(), keywords=keywords)
  File "/usr/local/lib64/python3.7/site-packages/serpscrap/serpscrap.py", line 98, in init
    firstrun.download()
  File "/usr/local/lib64/python3.7/site-packages/serpscrap/chrome_install.py", line 55, in download
    os.chmod('install_chrome.sh', 755 | stat.S_IEXEC)
FileNotFoundError: [Errno 2] No such file or directory: 'install_chrome.sh'

Thanks & Regards, Bhupendra

fjen commented 6 years ago

Here's a workaround. Download Chromedriver [1] and put it in your project (/project/chromedriver/chromedriver).

[1] http://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip
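After unpacking the archive, it can help to confirm that the driver is where the project expects it and that it is executable. The following is a minimal sketch, not part of SerpScrap; the helper name `ensure_executable` is hypothetical, and the path you pass in depends on where you placed the file.

```python
import os
import stat


def ensure_executable(path):
    """Return True if `path` is an existing file that is (now) executable.

    Hypothetical helper for illustration: it adds the execute bits when
    they are missing and returns False when the file does not exist.
    """
    if not os.path.isfile(path):
        return False
    if not os.access(path, os.X_OK):
        mode = os.stat(path).st_mode
        # Add execute permission for user, group and others.
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return os.access(path, os.X_OK)
```

For the layout mentioned above, a call such as `ensure_executable("chromedriver/chromedriver")` from the project root would report whether the driver is in place.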

majila commented 6 years ago

Thanks for your quick reply.

That issue is fixed, but now I am getting the error below:

2018-09-03 10:24:28,336 - root - INFO - Going to scrape 10 keywords with 1 proxies by using 1 threads.
2018-09-03 10:24:28,409 - scrapcore.scraping - INFO - [+] SelScrape[localhost][search-type:normal][https://www.google.com/search?] using search engine "google". Num keywords=1, num pages for keyword=[1]

2018-09-03 10:24:29,422 - scrapcore.scraper.selenium - ERROR - Message: unknown error: cannot find Chrome binary (Driver info: chromedriver=2.41.578700 (2f1ed5f9343c13f73144538f15c00b370eda6706),platform=Linux 4.14.62-70.117.amzn2.x86_64 x86_64)

Exception in thread [google]SelScrape:
Traceback (most recent call last):
  File "/usr/lib64/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/usr/local/lib64/python3.7/site-packages/scrapcore/scraper/selenium.py", line 743, in run
    if not self._get_webdriver():
  File "/usr/local/lib64/python3.7/site-packages/scrapcore/scraper/selenium.py", line 241, in _get_webdriver
    return self._get_Chrome()
  File "/usr/local/lib64/python3.7/site-packages/scrapcore/scraper/selenium.py", line 294, in _get_Chrome
    chrome_options=chrome_ops
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 75, in __init__
    desired_capabilities=desired_capabilities)
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
    self.start_session(capabilities, browser_profile)
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 251, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 320, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary
  (Driver info: chromedriver=2.41.578700 (2f1ed5f9343c13f73144538f15c00b370eda6706),platform=Linux 4.14.62-70.117.amzn2.x86_64 x86_64)

Thanks & Regards, Bhupendra

fjen commented 6 years ago

cannot find Chrome binary

Install Chrome. The chromedriver binary only drives the browser; it does not include it.
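A quick way to check whether a Chrome or Chromium executable is visible on the machine is to look it up on `PATH`. This is only a rough sketch: the candidate names below are assumptions that vary by distribution, the helper `find_chrome` is hypothetical, and chromedriver uses its own search locations, so a hit here is a hint rather than a guarantee.

```python
import shutil

# Assumed executable names; adjust for your distribution.
CANDIDATE_NAMES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
)


def find_chrome(names=CANDIDATE_NAMES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_chrome()
    print(found if found else "No Chrome/Chromium executable found on PATH")
```

If this prints no path, installing Chrome through the system package manager is the likely next step.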

majila commented 6 years ago

Thanks, it's working now. Thanks for your support.