Closed schroeder-g closed 9 months ago
Hi there! Can you share the code?
Hi spinlud,
Thanks for getting back to me! Here's my script, taken almost directly from example.py, with changes only to options.locations, options.limit, and the filters:
import logging
from linkedin_jobs_scraper import LinkedinScraper
from linkedin_jobs_scraper.events import Events, EventData
from linkedin_jobs_scraper.query import Query, QueryOptions, QueryFilters
from linkedin_jobs_scraper.filters import RelevanceFilters, TimeFilters, TypeFilters, ExperienceLevelFilters

# Change root logger level (default is WARN)
logging.basicConfig(level=logging.INFO)


def on_data(data: EventData):
    print('[ON_DATA]', data.title, data.company, data.date, data.link, len(data.description))


def on_error(error):
    print('[ON_ERROR]', error)


def on_end():
    print('[ON_END]')


scraper = LinkedinScraper(
    chrome_executable_path=None,  # Custom Chrome executable path (e.g. /foo/bar/bin/chromedriver)
    chrome_options=None,  # Custom Chrome options here
    headless=True,  # Overrides headless mode only if chrome_options is None
    max_workers=1,  # How many threads will be spawned to run queries concurrently (one Chrome driver for each thread)
    slow_mo=0.4,  # Slow down the scraper to avoid 'Too many requests (429)' errors
)

# Add event listeners
scraper.on(Events.DATA, on_data)
scraper.on(Events.ERROR, on_error)
scraper.on(Events.END, on_end)

queries = [
    Query(
        query='Engineer',
        options=QueryOptions(
            locations=['New York', 'Philadelphia', 'Seattle', 'Portland', 'Eugene', 'Boulder'],
            optimize=True,
            limit=15,
            filters=QueryFilters(
                # company_jobs_url='https://www.linkedin.com/jobs/search/?f_C=1441%2C17876832%2C791962%2C2374003%2C18950635%2C16140%2C10440912&geoId=92000000',
                # Filter by companies
                relevance=RelevanceFilters.RELEVANT,
                time=TimeFilters.MONTH,
                type=[TypeFilters.FULL_TIME, TypeFilters.INTERNSHIP],
                experience=None,
            )
        )
    ),
]

scraper.run(queries)
I guess you need to change chrome_executable_path from None to the actual path of your chromedriver.
Hi there,
have you tried @ankurGhosh1's suggestion to explicitly pass your chromedriver path?
scraper = LinkedinScraper(
    chrome_executable_path='/path/to/chromedriver.exe',
    headless=True,
    max_workers=1,
    slow_mo=1.3,
)
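Before passing a path explicitly, it may help to confirm that it actually points at a chromedriver binary. The following is only a minimal sketch: `resolve_chromedriver` is a hypothetical helper written for illustration, not part of linkedin_jobs_scraper, and it assumes the driver binary is named `chromedriver` when searching PATH.

```python
import os
import shutil
from typing import Optional


def resolve_chromedriver(path: Optional[str] = None) -> str:
    """Return a usable chromedriver path or raise FileNotFoundError.

    Hypothetical helper for illustration; not part of linkedin_jobs_scraper.
    """
    if path is not None:
        # An explicit path must point at an existing file.
        if os.path.isfile(path):
            return path
        raise FileNotFoundError(f"chromedriver not found at {path!r}")

    # No explicit path given: fall back to searching PATH.
    # Assumption: the binary is called "chromedriver".
    found = shutil.which("chromedriver")
    if found is None:
        raise FileNotFoundError("chromedriver is not on PATH; pass an explicit path")
    return found
```

The returned string could then be passed as `chrome_executable_path`, so a wrong path fails early with a clear message instead of inside the scraper.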
First of all, this is a fun project; thanks a bunch for sharing, I'm excited to test this out properly. Running on latest chromedriver, Windows 10, Python 3.63. I executed the example from the repository and received the above error.
The issue appears to have something to do with my chromedriver setup. I'm relatively new to Selenium, so any insight into how I could resolve this would be greatly appreciated! Here's a fuller account of the error:
line 169, in run
    headless=self.headless
  File "...\file_location\utils\chrome_driver.py", line 73, in build_driver
    driver = webdriver.Chrome(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'options'
[ON_ERROR] __init__() got an unexpected keyword argument 'options'
Thanks again for this stellar package and any tips on navigating this bug.
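For anyone hitting the same TypeError: the message says the `webdriver.Chrome` constructor was called with an `options` keyword it does not accept. One possible cause, which is an assumption and not confirmed anywhere in this thread, is that the installed selenium package is older than the one the scraper expects. A minimal sketch for checking whether a constructor accepts a given keyword is below; `accepts_keyword` and the two stand-in classes are hypothetical names used only for illustration.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldStyleDriver:
    """Stand-in for a driver whose constructor has no `options` parameter."""

    def __init__(self, executable_path="chromedriver", chrome_options=None):
        self.executable_path = executable_path


class NewStyleDriver:
    """Stand-in for a driver whose constructor takes `options`."""

    def __init__(self, executable_path="chromedriver", options=None):
        self.executable_path = executable_path


print(accepts_keyword(OldStyleDriver, "options"))
print(accepts_keyword(NewStyleDriver, "options"))
```

With Selenium installed, the same check could be run as `accepts_keyword(webdriver.Chrome, "options")` after `from selenium import webdriver`. If it returns False, comparing the installed selenium version against the scraper's requirements would be a reasonable next step.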