giterdun345 / Job-Description-Skills-Extractor

Given a job description, the model uses POS and Classifier to determine the skills therein.

Image of WordCloud

Job-Description-Skills-Extractor

Given a job description, the model uses POS tagging, chunking, and a classifier with BERT embeddings to determine the skills it contains. A Medium article with a full explanation is available here: https://medium.com/@johnmketterer/automating-the-job-hunt-with-transfer-learning-part-1-289b4548943

  1. JD Skills Preprocessing: preprocesses and cleans the Indeed dataset; analysis is
  2. POS & Chunking EDA: identifies the parts of speech within each job description and analyzes the structures to find patterns that contain job skills
  3. regex_chunking: uses regular expressions for chunking to extract patterns that include the desired skills
  4. extraction_model_build_trainset: Python file that samples data (extracted POS patterns) from pickle files
  5. extraction_model_trainset_analysis: analysis of the training data set to ensure data integrity before training
  6. extraction_model_training: trains the model with BERT embeddings
  7. extraction_model_evaluation: evaluates the model on unseen data, covering data science and sales associate job descriptions (predictions1.csv and predictions2.csv, respectively)
  8. extraction_model_use: takes a job description as input and writes a CSV file with the extracted skills; the hf5 weights have not yet been uploaded, and further automation for downstream tasks is planned
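
The chunking idea behind steps 2 and 3 can be illustrated with a minimal sketch. It is not the code from this repository: the tagged tokens, the chunk grammar (optional adjectives followed by one or more nouns), and the names `TAGGED`, `CHUNK_PATTERN`, and `chunk` are all assumptions made for illustration. It uses only the standard library, so the POS tags are supplied by hand rather than produced by a tagger.

```python
import re

# Hypothetical pre-tagged tokens (Penn Treebank tags). In practice these
# would come from a POS tagger; they are hard-coded here for illustration.
TAGGED = [
    ("Experience", "NN"), ("with", "IN"), ("deep", "JJ"), ("learning", "NN"),
    ("and", "CC"), ("strong", "JJ"), ("communication", "NN"), ("skills", "NNS"),
    ("is", "VBZ"), ("required", "VBN"),
]

# Assumed chunk grammar: zero or more adjectives, then one or more nouns.
CHUNK_PATTERN = re.compile(r"(?:<JJ[RS]?>)*(?:<NN(?:S|P|PS)?>)+")


def chunk(tagged, pattern=CHUNK_PATTERN):
    """Return the phrases whose tag sequence matches the chunk pattern."""
    # Encode the tag sequence as one string, e.g. "<NN><IN><JJ>...", and
    # record which token index starts at each character offset.
    start_index, parts, pos = {}, [], 0
    for i, (_, tag) in enumerate(tagged):
        start_index[pos] = i
        part = f"<{tag}>"
        parts.append(part)
        pos += len(part)
    tag_string = "".join(parts)

    phrases = []
    for match in pattern.finditer(tag_string):
        first = start_index[match.start()]   # token where the match begins
        count = match.group(0).count("<")    # number of tags in the match
        phrases.append(" ".join(word for word, _ in tagged[first:first + count]))
    return phrases


candidates = chunk(TAGGED)
print(candidates)
```

In a pipeline like the one described above, candidate phrases of this kind would then be passed to the classifier, which decides whether each one is a skill.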

A fuller README description, the hf5 weights, the pickle files, and the original dataset will be added soon.

Data obtained from: