hxu296 / nlp-resume-parser

NLP-powered, GPT-3 enabled Resume Parser from PDF to JSON.
247 stars, 50 forks

Some additional improvements; please see commit messages #5

Closed: ADTC closed 1 year ago

ADTC commented 1 year ago

Please see the individual commits for more details. Thank you. You can see the preview here.

Note: I decided against adding the python-dotenv package, as a simple parser is enough for the .env file in our use case.
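
For readers following along, here is a minimal sketch of what such a simple .env parser might look like. It is not the code from this PR's commits: the function name `parse_env_text` and the handling of comments and quotes are assumptions made for illustration only.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict (hypothetical helper, not the PR's code)."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Ignore lines that do not have a KEY=VALUE shape.
        if "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        if key:
            values[key] = value
    return values
```

Unlike python-dotenv, a sketch like this does no variable interpolation, multi-line values, or `export` prefixes, which is likely fine when the file only holds a single API key.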

Regarding tiktoken, see: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb

hxu296 commented 1 year ago

Hey, thank you for using tiktoken for exact token counting rather than the previous rough estimate! The .env support also looks good. One small thing: I think parse_env_file should check the environment variables first and read from .env only if OPENAI_API_KEY is not set there. This way, we can still override OPENAI_API_KEY on the fly for tests, and .env remains a prod option. What do you think? If you would like to work on it in this PR, I can wait and approve that change as well. If you would rather do it in another PR, I can go ahead and approve and merge this one for now.
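
The lookup order suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation that was merged: `resolve_api_key` and its `env_path` and `environ` parameters are hypothetical names, and the actual signature of `parse_env_file` in the repo may differ.

```python
import os


def resolve_api_key(env_path=".env", environ=None):
    """Return OPENAI_API_KEY from the environment, else from a .env file, else None.

    Hypothetical helper for illustration; not the repo's parse_env_file.
    """
    environ = os.environ if environ is None else environ
    # 1. An environment variable wins, so tests can override the key on the fly.
    key = environ.get("OPENAI_API_KEY")
    if key:
        return key
    # 2. Fall back to the .env file; a missing file is not an error.
    try:
        with open(env_path, encoding="utf-8") as f:
            for line in f:
                name, sep, value = line.strip().partition("=")
                if sep and name.strip() == "OPENAI_API_KEY":
                    return value.strip() or None
    except FileNotFoundError:
        pass
    return None
```

With this ordering, running something like `OPENAI_API_KEY=sk-test python script.py` would take precedence over whatever the .env file contains, and the lookup would return `None` rather than raise when neither source is available.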

ADTC commented 1 year ago

I'll add it here.

ADTC commented 1 year ago

@hxu296 I've updated the PR as requested, using a force push to amend the commits. Please take a look.

The changes ensure it still works if there's no .env file.