Open mortenwurd opened 4 years ago
hi @mortenwurd , yes :)
# assumes the Connector class is already imported or defined in this session
import pandas as pd

connector = Connector('log_file_refresh.csv',
                      connector_type="selenium",
                      path2selenium=r"C:\Users\Joune\Desktop\chromedriver_win32\chromedriver.exe")
url_trustpilot = 'https://www.trustpilot.com/'
browser = connector.browser
connector.get(url_trustpilot, 'first_call')
# refresh the page and store the metadata in the log file
for i in range(1, 6):
    connector.get(browser.current_url, f'refresh_{i}')
pd.read_csv('log_file_refresh.csv', sep=";")
yields output:
Hi @jsr-p
Thanks! :)
Is it possible to get the Connector to run the browser headless? Normally we just type the following code snippet:
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--incognito')
options.add_argument('--headless')
driver = webdriver.Chrome(ChromeDriverManager().install(), options=options)
Morten
hi @mortenwurd , yes, here is a screenshot of the modified Connector class:
And here is the code ready to be copied and pasted:
if connector_type=='selenium':
    assert path2selenium!='', "You need to specify the path to your geckodriver if you want to use Selenium"
    from selenium import webdriver
    ## HINT: download the latest geckodriver here: https://github.com/mozilla/geckodriver/releases
    assert os.path.isfile(path2selenium), 'You need to insert a valid path2selenium, the path to your geckodriver. You can download the latest geckodriver here: https://github.com/mozilla/geckodriver/releases'
    ##################
    #headless options#
    ##################
    options = webdriver.ChromeOptions()
    options.add_argument('--ignore-certificate-errors')
    options.add_argument('--incognito')
    options.add_argument('--headless')
    # pass the options when creating the browser object
    self.browser = webdriver.Chrome(executable_path=path2selenium,
                                    options=options)  # starts Chrome, so path2selenium must point to chromedriver here, not geckodriver
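If you want to switch headless mode on and off instead of hard-coding it, something like the sketch below might work. The helper names chrome_arguments and make_chrome_options are made up for illustration and are not part of the Connector class; the flags are the same three used above.

```python
def chrome_arguments(headless=True):
    """Return the Chrome command-line flags, with --headless only if requested."""
    args = ['--ignore-certificate-errors', '--incognito']
    if headless:
        args.append('--headless')
    return args


def make_chrome_options(headless=True):
    """Build a ChromeOptions object from the flags above (hypothetical helper)."""
    # imported here so the flag list can be inspected without Selenium installed
    from selenium import webdriver
    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)
    return options
```

Inside the class you could then pass options=make_chrome_options(headless) when creating the browser, with headless as an extra (assumed) argument to the constructor.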
My group is scraping a website for data every 60 seconds. We use Selenium and the driver.refresh() command inside a while-loop to update the webpage. It works fine; however, it doesn't add an entry to the log for every refresh. Is it possible to get a log file like the one from connector.get when using Selenium and driver.refresh()?
Thanks.
Morten
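If you would rather keep calling refresh() directly, a rough sketch of a wrapper that writes one log row per refresh could look like this. Note that refresh_and_log and the column names are assumptions made for illustration; they do not reproduce the Connector's actual log format, only the ";" separator used in the read_csv call above.

```python
import csv
import os
import time

# assumed column layout, NOT the Connector's real log format
LOG_COLUMNS = ['id', 'timestamp', 'url', 'seconds', 'success', 'error']


def refresh_and_log(browser, logfile, call_id):
    """Refresh the current page and append one row about the call to logfile."""
    # write the header only when the file is new or empty
    write_header = not os.path.isfile(logfile) or os.path.getsize(logfile) == 0
    start = time.time()
    error = ''
    try:
        browser.refresh()
    except Exception as exc:  # record the failure instead of stopping the loop
        error = str(exc)
    row = [call_id, start, browser.current_url,
           round(time.time() - start, 3), error == '', error]
    with open(logfile, 'a', newline='', encoding='utf-8') as f:
        writer = csv.writer(f, delimiter=';')
        if write_header:
            writer.writerow(LOG_COLUMNS)
        writer.writerow(row)
    return error == ''


# Possible use inside the 60-second loop (file name is a placeholder):
# i = 0
# while True:
#     i += 1
#     refresh_and_log(driver, 'refresh_log.csv', f'refresh_{i}')
#     time.sleep(60)
```

Writing to a separate file avoids mixing these rows with the ones connector.get produces, since the two layouts are not guaranteed to match.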