NLPatVCU / PaperScraper

A web scraping tool to systematically extract the text of scientific papers and corresponding metadata from university accessible journals.
GNU General Public License v3.0

WebDriver.__init__() got multiple values for argument 'options' #15

Open MarcoAigner opened 6 months ago

MarcoAigner commented 6 months ago

I would be really interested in using the scraper, but unfortunately I haven't been able to run it so far.


TypeError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 PaperScraper()

File d:\paperscraper\paperscraper\PaperScraper.py:45, in PaperScraper.__init__(self, webdriver_path)
     42 if ('webdriver_path' is not None):
     43     self.webdriver_path = webdriver_path
---> 45 self.driver = webdriver.Chrome(webdriver_path, options=options)

TypeError: WebDriver.__init__() got multiple values for argument 'options'

MoonGirl99 commented 2 weeks ago

Hey, I am getting the exact same error. Have you found a solution for this issue?

zakidotai commented 2 weeks ago

> Hey, I am getting the exact same error. Have you found a solution for this issue?

Step 1: Download the chromedriver version that matches your installed Google Chrome version using the following link

Step 2: In the script, make these changes:

from selenium.webdriver.chrome.service import Service

# Replace the webdriver.Chrome(...) call on line 45 (right after line 44) with this:

self.driver = webdriver.Chrome(service=Service(webdriver_path), options=options)
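For context on why this change helps: as far as I can tell, in recent Selenium 4 releases the first positional parameter of webdriver.Chrome is options rather than the driver path, so a path passed positionally collides with the options= keyword. The sketch below reproduces that collision with a stand-in class. FakeChrome and both helper functions are hypothetical names made up for illustration, not Selenium's or PaperScraper's code, so it runs without Selenium or a browser installed.

```python
class FakeChrome:
    """Hypothetical stand-in for a constructor whose first positional parameter is `options`."""

    def __init__(self, options=None, service=None):
        self.options = options
        self.service = service


def try_old_call(path, options):
    """Old-style call: the path is positional, so it lands in the `options` slot."""
    try:
        FakeChrome(path, options=options)
    except TypeError as exc:
        # `options` received two values: the positional path and the keyword.
        return str(exc)
    return None


def new_call(path, options):
    """New-style call: everything is passed by keyword, so nothing collides.

    In real code the path would be wrapped as Service(path); a plain string
    is used here only to keep the sketch dependency-free.
    """
    return FakeChrome(service=path, options=options)


error_message = try_old_call("/path/to/chromedriver", {"headless": True})
driver = new_call("/path/to/chromedriver", {"headless": True})
print(error_message)
```

The old-style call fails with the same kind of TypeError as in the traceback above, while the keyword-only call goes through.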

Finally, when constructing the scraper, pass the absolute path to your chromedriver:

scraper = PaperScraper(webdriver_path=absolute_path_to_chromedriver)
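If it helps, here is a minimal sketch for building that absolute path and checking that the file exists before handing it to PaperScraper. The helper name resolve_driver_path is made up for illustration and is not part of PaperScraper.

```python
from pathlib import Path


def resolve_driver_path(raw_path):
    """Return an absolute path string for the chromedriver binary.

    Raises FileNotFoundError if nothing exists at the resolved location,
    which surfaces a wrong path before the browser is ever launched.
    """
    path = Path(raw_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"chromedriver not found at {path}")
    return str(path)
```

The result can then be passed as webdriver_path, for example PaperScraper(webdriver_path=resolve_driver_path(your_path)), where your_path is wherever you saved the download from Step 1.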

Note: You may encounter further errors while scraping that are related to the source code of the websites being scraped.
If that happens, you can open a new issue that includes the link being scraped and the complete error message.