chris-lovejoy / job-scraper

Scraping jobs from Indeed or CW jobs
86 stars 48 forks

Issue with Indeed connection #2

Open osleland opened 3 weeks ago

osleland commented 3 weeks ago

I downloaded the .py file and wanted to test the example as given but received the following error message:


AttributeError                            Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 find_jobs_from('Indeed', 'data scientist', 'london', desired_characs)

File ~\Desktop\Research\code\jobs\job_scraper.py:36, in find_jobs_from(website, job_title, location, desired_characs, filename)
     34 if website == 'Indeed':
     35     job_soup = load_indeed_jobs_div(job_title, location)
---> 36     jobs_list, num_listings = extract_job_information_indeed(job_soup, desired_characs)
     38 if website == 'CWjobs':
     39     location_of_driver = os.getcwd()

File ~\Desktop\Research\code\jobs\job_scraper.py:69, in extract_job_information_indeed(job_soup, desired_characs)
     68 def extract_job_information_indeed(job_soup, desired_characs):
---> 69     job_elems = job_soup.find_all('div', class_='jobsearch-SerpJobCard')
     71     cols = []
     72     extracted_info = []

AttributeError: 'NoneType' object has no attribute 'find_all'

As I'm in the US I tried changing line 62 to

url = ('https://indeed.com/jobs?' + urllib.parse.urlencode(getVars))

but received the same error. Does the URL need to be updated to connect with Indeed?
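For reference, here is a minimal standalone sketch of how that URL is assembled, so the query string can be checked by hand in a browser. The helper name `build_indeed_url` is hypothetical, and the parameter keys `q` and `l` are my assumption about what `getVars` contains; only the base URL and the `urlencode` call come from the line above.

```python
import urllib.parse


def build_indeed_url(job_title, location, base="https://indeed.com/jobs?"):
    """Build a search URL from a job title and a location.

    Assumption: the search page takes the query in 'q' and the
    location in 'l'. Adjust the keys if the real getVars differs.
    """
    get_vars = {"q": job_title, "l": location}
    # urlencode escapes spaces as '+' and joins the pairs with '&'
    return base + urllib.parse.urlencode(get_vars)


url = build_indeed_url("data scientist", "london")
print(url)  # https://indeed.com/jobs?q=data+scientist&l=london
```

Opening the printed URL directly shows whether the page still loads and what its markup looks like, separately from the parsing step.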

Many thanks in advance!

chris-lovejoy commented 2 weeks ago

hey, yes it hasn't been updated in a while so I suspect so! feel free to put up a PR if you have a chance to review and update the URL (and potentially the way the variables are parsed). If not, we can leave this issue open until somebody picks it up.
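In the meantime, the traceback suggests that `job_soup` is `None` by the time it reaches `extract_job_information_indeed`, presumably because the lookup inside `load_indeed_jobs_div` no longer finds a matching element on the page. A minimal sketch of a guard that would make this failure easier to read is below. The helper name `extract_job_elems` and the error wording are hypothetical and not part of the current script; only the `find_all` call is taken from the traceback.

```python
def extract_job_elems(job_soup, card_class="jobsearch-SerpJobCard"):
    """Return the job card elements, or fail with a readable message.

    Hypothetical helper: job_soup is whatever load_indeed_jobs_div
    returned, which can be None when the expected container is missing.
    """
    if job_soup is None:
        raise ValueError(
            "No results container was found in the page; "
            "the URL or the page markup may have changed."
        )
    # Same call as line 69 of job_scraper.py in the traceback
    return job_soup.find_all("div", class_=card_class)
```

This does not fix the scraper itself. It only replaces the `AttributeError` with a message that points at the likely cause, which should make it easier for whoever picks this up to see whether the URL or the page markup needs updating.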