New Additions

The file track_new_jobs_greenhouse.py provides functions that extract new jobs posted on company websites that use Greenhouse as their Applicant Tracking System. Only companies whose Greenhouse career pages follow the format https://boards.greenhouse.io/[company_name] can be extracted at this time.
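As a rough illustration of what such a scraper can look like, here is a minimal sketch (not the project's actual implementation) that fetches a board of the above format and collects the postings into a data frame. The "opening" CSS class is an assumption about Greenhouse's public board markup.

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    def scrape_greenhouse_board(company_name):
        """Return a data frame of the roles currently listed on a Greenhouse board."""
        url = f"https://boards.greenhouse.io/{company_name}"
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")
        jobs = []
        # Greenhouse boards typically wrap each posting in a div with class "opening"
        for opening in soup.find_all("div", class_="opening"):
            link = opening.find("a")
            if link is not None:
                jobs.append({
                    "Role": link.get_text(strip=True),
                    "Link": "https://boards.greenhouse.io" + link.get("href", ""),
                })
        return pd.DataFrame(jobs)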
The file run.py takes as input an Excel file with the columns [Target Company, Careers Site Link, Status, Format, Category] (see the sample file provided) and uses track_new_jobs_greenhouse.py and track_new_jobs_workday.py to extract new jobs posted on the websites listed in the Excel file.
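A hypothetical sketch of how the tracker file might be consumed; the column names come from the format described above, but the file name "targets.xlsx" and the values in the Format column are assumptions, not run.py's actual code.

    import pandas as pd

    targets = pd.read_excel("targets.xlsx")  # assumed file name
    for _, row in targets.iterrows():
        company = row["Target Company"]
        url = row["Careers Site Link"]
        # Dispatch on the ATS named in the Format column (assumed values)
        if row["Format"] == "Greenhouse":
            print(f"Checking Greenhouse board for {company}: {url}")
        elif row["Format"] == "Workday":
            print(f"Checking Workday site for {company}: {url}")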
New jobs can only be extracted for a company if an existing data frame (an Excel file listing the roles available on that company's careers page) is present in the Dataframes directory (sample provided). This is because new postings are found by comparing the existing data frame with the data frame scraped from the web at the time the scripts are run.
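The comparison idea, as a self-contained sketch with toy data, assuming each data frame has a "Link" column that uniquely identifies a posting (this illustrates the approach, not the scripts' actual code):

    import pandas as pd

    # Saved snapshot vs. freshly scraped frame (toy data for illustration)
    old_df = pd.DataFrame({"Role": ["Analyst"], "Link": ["/acme/jobs/1"]})
    new_df = pd.DataFrame({"Role": ["Analyst", "Engineer"],
                           "Link": ["/acme/jobs/1", "/acme/jobs/2"]})

    # Postings whose links are absent from the saved snapshot are new jobs
    new_jobs = new_df[~new_df["Link"].isin(old_df["Link"])]
    print(new_jobs)  # prints only the Engineer row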
If an existing data frame for the company is not present in the Dataframes directory (i.e. you are scraping the company's careers website for the first time), use the create_jobsdf_greenhouse(company_name, url, save_to_excel=True) and create_jobsdf_workday(company_name, url, save_to_excel=True) functions to create and save initial data frames.
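For example, a first run for a company tracked on each platform might look like this ("acme" and both URLs are placeholders):

    from track_new_jobs_greenhouse import create_jobsdf_greenhouse
    from track_new_jobs_workday import create_jobsdf_workday

    # Build and save the initial snapshots that later runs will diff against
    create_jobsdf_greenhouse("acme", "https://boards.greenhouse.io/acme", save_to_excel=True)
    create_jobsdf_workday("acme", "https://acme.wd1.myworkdayjobs.com/careers", save_to_excel=True)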
Comments Regarding Existing Files
Some improvements have also been made to the track_new_jobs_workday.py file.
The record_creation_routine.py file, when executed, creates folders in a directory of your choice and adds a copy of your CV/resume to each folder. This is intended as a basic automated way to keep records of the jobs you have applied for.
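The underlying idea is simple enough to sketch: one folder per application, seeded with a copy of the resume. The paths and names below are placeholders, not the file's actual configuration.

    import os
    import shutil

    RESUME = "resume.pdf"                # assumed path to your CV/resume
    RECORDS_DIR = "Application Records"  # assumed destination directory

    def create_application_record(company, role):
        """Create a folder for one application and drop a copy of the resume in it."""
        folder = os.path.join(RECORDS_DIR, f"{company} - {role}")
        os.makedirs(folder, exist_ok=True)
        shutil.copy(RESUME, folder)

    create_application_record("acme", "Data Analyst")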
The fillform.py file attempts to fill basic candidate information (name, contact number, location, school, etc.) on a Greenhouse job posting page. However, it needs further work and is not ready to use at this point.
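For a sense of the approach, here is a rough Selenium sketch. The field IDs ("first_name", "email", ...) are common on Greenhouse application forms but are assumptions here, and the URL is a placeholder; fillform.py's actual logic may differ.

    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get("https://boards.greenhouse.io/acme/jobs/12345")  # placeholder URL

    candidate = {"first_name": "Jane", "last_name": "Doe",
                 "email": "jane@example.com", "phone": "555-0100"}
    for field_id, value in candidate.items():
        try:
            # Fill each basic-information field if the posting includes it
            driver.find_element(By.ID, field_id).send_keys(value)
        except NoSuchElementException:
            pass  # skip fields this posting does not have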