Open leadscloud opened 9 years ago
That's because I changed your code. But there is another error to solve.
For Yahoo search, if you open Yahoo in an incognito window, search for any keyword, and then run document.getElementsByName("p") in the console,
it returns three elements:
<input type="text" id="vsyc-yschsp" name="p" class="qstext" autocomplete="off" value="">
<input type="text" class="sbq" id="yschsp" name="p" value="sand making machine" autocomplete="off" tabindex="1" autocorrect="off" autocapitalize="off" aria-haspopup="true" style="-webkit-tap-highlight-color: transparent;">
<input type="text" class="sbq" id="yschsp-bot" name="p" value="sand making machine" autocomplete="off">
The first element is not visible, so GoogleScraper always raises an error.
Below is my code:
def _wait_until_search_input_field_appears(self, max_wait=5):
    """Waits until the search input field can be located for the current search engine

    Args:
        max_wait: How long to wait maximally before returning False.

    Returns: False if the search input field could not be located within the time
        or the handle to the search input field.
    """
    def find_visible_search_input(driver):
        inputs = driver.find_elements(*self._get_search_input_field())
        for field in inputs:
            if field.is_displayed():
                return field
        return False

    try:
        search_input = WebDriverWait(self.webdriver, max_wait).until(find_visible_search_input)
        return search_input
    except TimeoutException as e:
        logger.error("TimeoutException waiting for search input field: {0}".format(e))
        return False
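To show the selection logic from the code above in isolation, here is a minimal sketch that runs without a browser. `FakeElement` and `first_visible` are hypothetical names used only for illustration; the stub models nothing but `is_displayed()`, and it assumes the second and third Yahoo inputs are the visible ones.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only is_displayed() is modelled."""

    def __init__(self, element_id, visible):
        self.element_id = element_id
        self._visible = visible

    def is_displayed(self):
        return self._visible


def first_visible(elements):
    """Return the first element whose is_displayed() is true, else False."""
    for element in elements:
        if element.is_displayed():
            return element
    return False


# The three Yahoo inputs named "p"; the first one is hidden (assumed visibility for the rest).
yahoo_inputs = [
    FakeElement("vsyc-yschsp", visible=False),
    FakeElement("yschsp", visible=True),
    FakeElement("yschsp-bot", visible=True),
]

chosen = first_visible(yahoo_inputs)
print(chosen.element_id)
```

Returning False when nothing is visible matters here: WebDriverWait.until keeps polling while the callable returns a falsy value, so the wait only ends once a visible input shows up or the timeout expires.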
I have some issues when going to the next page in Google:
nikolai@nikolai:~/Projects/private/GoogleScraper$ ./run.py -m selenium -s google -q hello -p 5
2015-01-11 15:49:55,772 - GoogleScraper - INFO - 0 cache files found in .scrapecache/
2015-01-11 15:49:55,772 - GoogleScraper - INFO - 0/1 keywords have been cached and are ready to get parsed. 1 remain to get scraped.
2015-01-11 15:49:55,822 - GoogleScraper - INFO - Going to scrape 1 keywords with 1 proxies by using 1 threads.
2015-01-11 15:49:55,825 - GoogleScraper - INFO - [+] SelScrape[localhost][search-type:normal][https://www.google.com/search?] using search engine "google". Num keywords =1, num pages for keyword=5
2015-01-11 15:50:04,929 - GoogleScraper - WARNING - Cannot locate next page element: Message: unknown error: Element is not clickable at point (338, 294). Other element would receive the click: <div id="flyr" class="flyr-o" style="width: 833px; height: 1502px; top: 106px;"></div>
(Session info: chrome=39.0.2171.65)
(Driver info: chromedriver=2.12.301324 (de8ab311bc9374d0ade71f7c167bad61848c7c48),platform=Linux 3.13.0-37-generic x86_64)
For Yahoo and Bing it works. I've taken your code.
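The warning in the log says the click on the next-page element would land on the div with id "flyr" instead. One possible workaround, assuming that overlay is only there briefly, is to retry the click a few times before giving up. The sketch below is not GoogleScraper code: `click_with_retry` and `CoveredLink` are hypothetical names, and the stub raises RuntimeError so the example runs without a browser. With Selenium one would presumably pass `selenium.common.exceptions.WebDriverException` as `retry_on`.

```python
import time


def click_with_retry(element, attempts=5, delay=0.5, retry_on=(Exception,)):
    """Call element.click(), retrying while it raises one of retry_on.

    Returns True once a click succeeds, False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            element.click()
            return True
        except retry_on:
            # Sleep only if another attempt is still coming.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False


class CoveredLink:
    """Hypothetical stand-in: click() fails while an overlay still covers the link."""

    def __init__(self, covered_for):
        self.covered_for = covered_for  # number of clicks that will be intercepted
        self.clicks = 0

    def click(self):
        if self.covered_for > 0:
            self.covered_for -= 1
            raise RuntimeError("Element is not clickable at point")
        self.clicks += 1


link = CoveredLink(covered_for=2)
print(click_with_retry(link, delay=0.01, retry_on=(RuntimeError,)))
```

If the overlay never goes away, retrying will not help, and waiting explicitly for the overlay to become invisible would be the more direct fix.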
Line 486 is self.search() and line 418 is self.search_input.clear().