PaulMcInnis / JobFunnel

Scrape job websites into a single spreadsheet with no duplicates.
MIT License
automated beautifulsoup beautifulsoup4 csv glassdoor indeed international job jobs monster python scraper search tfidf waterloo yaml


Automated tool for scraping job postings into a .csv file.

Since this project was developed, CAPTCHA has clamped down hard. Help us rebuild the backend and make this tool useful again!

Benefits over job search sites:

masterlist.csv

Installation

JobFunnel requires Python 3.8 or later.

pip install git+https://github.com/PaulMcInnis/JobFunnel.git
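Before installing, it can help to confirm that the interpreter meets the stated minimum. The following is a minimal sketch; the helper name `meets_minimum` is hypothetical and is not part of JobFunnel.

```python
import sys

# Minimum version stated in this README.
MIN_VERSION = (3, 8)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if meets_minimum():
    print("Python version is new enough for JobFunnel.")
else:
    print("JobFunnel requires Python 3.8 or later.")
```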

Usage

By performing regular scraping and reviewing, you can cut through the noise of even the busiest job markets.

Configure

You can search for jobs with YAML configuration files or by passing command arguments.

Download the demo settings.yaml by running the command below:

wget https://git.io/JUWeP -O my_settings.yaml

NOTE:

Scrape

Run funnel with your settings YAML to populate your master CSV file with jobs from available providers:

funnel load -s my_settings.yaml
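For regular scraping, the command above could be wrapped in a small script and run on a schedule. This is only a sketch: `build_command` and `run_scrape` are hypothetical helper names, and it assumes the `funnel` executable is on your PATH after installation. It uses only the `load -s` invocation shown in this README.

```python
import subprocess
from pathlib import Path


def build_command(settings_path):
    """Build the argument list for the `funnel load -s <settings>` call."""
    return ["funnel", "load", "-s", str(settings_path)]


def run_scrape(settings_path):
    """Run one scrape and return the process exit code.

    Assumes the `funnel` executable is installed and on PATH.
    """
    settings = Path(settings_path)
    if not settings.is_file():
        raise FileNotFoundError(f"settings file not found: {settings}")
    # check=False so the caller can inspect a non-zero exit code,
    # for example when a provider blocks the request.
    completed = subprocess.run(build_command(settings), check=False)
    return completed.returncode
```

Calling `run_scrape("my_settings.yaml")` from a cron job or task scheduler would then repeat the scrape without retyping the command.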

Review

Open the master CSV file and update the per-job status:
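If the master CSV grows large, a short script can summarize how many jobs sit in each status before you review them by hand. The sketch below is an illustration only: the column name `status` and the helper `count_statuses` are assumptions, so check the header row of your own file and adjust accordingly.

```python
import csv
from collections import Counter


def count_statuses(csv_path, status_column="status"):
    """Count rows per status value in a CSV file.

    `status_column` is an assumed column name; pass the real header
    from your master CSV if it differs.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or status_column not in reader.fieldnames:
            raise KeyError(f"column {status_column!r} not found in {csv_path}")
        # Normalize whitespace and case so "NEW " and "new" count together.
        return Counter(
            (row[status_column] or "").strip().lower() for row in reader
        )
```

The returned `Counter` maps each status string to its number of rows, which gives a quick picture of how much is left to review.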

Advanced Usage

CAPTCHA

JobFunnel does not solve CAPTCHAs. If you receive an Unable to extract jobs from initial search result page:\ error while scraping, open that URL in your browser and solve the CAPTCHA manually.