hxu296 / nlp-resume-parser

NLP-powered, GPT-3 enabled Resume Parser from PDF to JSON.

Resume Parser Service

A GPT-3 based resume parser, served as a REST API, that transforms a resume PDF into a JSON document.

Parsing a resume PDF takes around 15 seconds and costs about $0.01 per 500 tokens with the text-davinci-002 engine (which is why there is no live demo website). A typical request and response together may use 1500 tokens ($0.03), 3000 tokens ($0.06), or more.
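The pricing above reduces to simple arithmetic. The sketch below is not part of this repo; the function name `estimate_cost_usd` is made up for illustration, and the rate is the $0.01 per 500 tokens quoted above, which may not match current OpenAI pricing.

```python
# Rate quoted in this README for text-davinci-002 (assumed, may be outdated).
PRICE_PER_500_TOKENS_USD = 0.01


def estimate_cost_usd(total_tokens: int) -> float:
    """Estimate the cost of one parse from its total token count.

    total_tokens is the request and response tokens added together.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 500 * PRICE_PER_500_TOKENS_USD


# The two example sizes mentioned above.
for tokens in (1500, 3000):
    print(f"{tokens} tokens: ${estimate_cost_usd(tokens):.2f}")
```

This reproduces the $0.03 and $0.06 figures for 1500 and 3000 tokens.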

Please note that more accurate results may be achieved by fine-tuning GPT-3, but the out-of-the-box results from this repo are already very impressive.

Quick Start

  1. Install Python 3 and pip3. For macOS, see note below.
  2. Install all dependencies of pdftotext (see here).
  3. In a new terminal, update pip3 if needed: python3 -m pip install --upgrade pip
  4. In another new terminal, clone the repository and change into the project directory.
    • Close the other terminals and continue in this one.
  5. Check the versions: python3 --version and pip3 --version.
  6. Run ./build.sh in the project root.
  7. Get your OpenAI API Key.
  8. Create a file named .env and set your API key in it (OPENAI_API_KEY=YOURKEY), or set the key as an environment variable: export OPENAI_API_KEY=YOURKEY.
  9. Run ./run.sh in the project root.
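If the server complains about a missing key, it can help to confirm that step 8 took effect. The sketch below is only an illustration and not how this repo loads the key: `load_openai_api_key` is a hypothetical helper that checks the environment variable first and then falls back to a simple KEY=VALUE parse of .env, assuming the file sits in the current directory.

```python
import os
from pathlib import Path
from typing import Optional


def load_openai_api_key(env_path: str = ".env") -> Optional[str]:
    """Return OPENAI_API_KEY from the environment, else from a .env file."""
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key

    path = Path(env_path)
    if not path.is_file():
        return None

    # Minimal KEY=VALUE parsing: skip blank lines, comments, and malformed lines.
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "OPENAI_API_KEY":
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip().strip('"').strip("'") or None
    return None


print("API key found" if load_openai_api_key() else "OPENAI_API_KEY is not set")
```

Run it from the project root so that the relative .env path resolves.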

A Flask server will start listening on port 5001 of localhost. Feel free to check it out in your browser.
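To confirm from a script that something is listening on that port, a plain TCP connection check is enough. This is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, and it only tests that the port accepts connections, not that the parser's routes respond correctly.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 5001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if is_port_open():
    print("Something is listening on localhost:5001")
else:
    print("Nothing is listening on localhost:5001")
```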

Note for macOS

You need to install either Xcode or the GCC tools (see here).

Supported Fields