Lookup error on 'extracted keywords from resume' - Githubissues

srbhr / Resume-Matcher

Resume Matcher is an open source, free tool to improve your resume. It works by using language models to compare and rank resumes with job descriptions.

https://www.resumematcher.fyi/

Apache License 2.0

4.76k stars 1.93k forks

Lookup error on 'extracted keywords from resume' #90

Closed thomassbooth closed 12 months ago

thomassbooth commented 12 months ago

Issue Title

Error occurring, looks like an import error. All README steps were followed properly.

Type

[ ] Big
[ ] Feature Request
[ ] Info
[x] Bug
[ ] Documentation
[ ] Other (please specify):

Description

After starting the Streamlit server locally, the app displays a LookupError when it extracts keywords from the resume.

Expected Behavior

The Streamlit server should display the keywords extracted from the resume.

Current Behavior

A LookupError is displayed instead of the extracted keywords.

Steps to Reproduce

Clone the repo.
Initialise a venv.
Install the dependencies.
Remove the existing resumes and job descriptions.
Add a new job description .pdf and a new resume .pdf.
Parse them.
Start the Streamlit server.

Screenshots / Code Snippets (if applicable)

Error:

    LookupError:
    **********************************************************************
      Resource punkt not found.
      Please use the NLTK Downloader to obtain the resource:

      >>> import nltk
      >>> nltk.download('punkt')

      For more information see: https://www.nltk.org/data.html

      Attempted to load tokenizers/punkt/PY3/english.pickle

      Searched in:
        - '/Users/thomasbooth/nltk_data'
        - '/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/nltk_data'
        - '/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/share/nltk_data'
        - '/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/lib/nltk_data'
        - '/usr/share/nltk_data'
        - '/usr/local/share/nltk_data'
        - '/usr/lib/nltk_data'
        - '/usr/local/lib/nltk_data'
        - ''
    **********************************************************************

Traceback:

    File "/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
        exec(code, module.__dict__)
    File "/Users/thomasbooth/Documents/Personal/Resume-Matcher/streamlit_app.py", line 160, in <module>
        annotated_text(create_annotated_text(
                       ^^^^^^^^^^^^^^^^^^^^^^
    File "/Users/thomasbooth/Documents/Personal/Resume-Matcher/streamlit_app.py", line 86, in create_annotated_text
        tokens = nltk.word_tokenize(input_string)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 129, in word_tokenize
        sentences = [text] if preserve_line else sent_tokenize(text, language)
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/lib/python3.11/site-packages/nltk/tokenize/__init__.py", line 106, in sent_tokenize
        tokenizer = load(f"tokenizers/punkt/{language}.pickle")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/lib/python3.11/site-packages/nltk/data.py", line 750, in load
        opened_resource = _open(resource_url)
                          ^^^^^^^^^^^^^^^^^^^
    File "/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/lib/python3.11/site-packages/nltk/data.py", line 876, in _open
        return find(path_, path + [""]).open()
               ^^^^^^^^^^^^^^^^^^^^^^^^
    File "/Users/thomasbooth/Documents/Personal/Resume-Matcher/env/lib/python3.11/site-packages/nltk/data.py", line 583, in find
        raise LookupError(resource_not_found)

Environment

Operating System: macOS Ventura 13.4
Browser (if applicable): Chrome
Version/Commit ID (if applicable): I'm on the main branch.

Possible Solution (if you have any in mind)

Additional Information

srbhr commented 12 months ago

I think you're missing the punkt tokenizer.

Install/Download NLTK Data.

Run Python from the command line:

```bash
python
```

After that, run:

```python
import nltk
nltk.download('punkt')
```
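If you would rather not run the download by hand, the same fix can be sketched as a small guard that only downloads the tokenizer when NLTK cannot find it. This is a minimal sketch, not what the project does: `ensure_nltk_resource` and its `find`/`download` parameters are hypothetical names made up for illustration, and only `nltk.data.find` and `nltk.download` are real NLTK calls.

```python
def ensure_nltk_resource(resource_path, package, find=None, download=None):
    """Download an NLTK package only if its resource is not installed yet.

    Returns True if a download was triggered, False if the resource was found.
    `find` and `download` are hypothetical hooks (assumptions for this sketch);
    when omitted they default to nltk.data.find and nltk.download.
    """
    if find is None or download is None:
        import nltk  # imported lazily, only when the defaults are needed

        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # nltk.data.find raises LookupError when the resource is missing,
        # which is the same error shown in the traceback above.
        find(resource_path)
        return False
    except LookupError:
        download(package)
        return True


# Example call (needs NLTK installed and network access on first run):
# ensure_nltk_resource("tokenizers/punkt", "punkt")
```

Calling something like this once before `nltk.word_tokenize` is used would avoid the error on a fresh environment. Where exactly to put it in the app is left open here.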

thomassbooth commented 12 months ago

Thanks!

I was getting an invalid SSL certificate error when downloading, so I had to use this (in case anyone else gets the same error): https://stackoverflow.com/questions/38916452/nltk-download-ssl-certificate-verify-failed
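For reference, a workaround commonly suggested for this error is to make Python skip HTTPS certificate verification while the download runs. The sketch below assumes that is roughly what the linked thread recommends. It wraps the idea in a context manager (`unverified_https` is a hypothetical name) so that verification is restored afterwards. It relies on the private `ssl._create_default_https_context` hook, and disabling verification is insecure, so treat it as a last resort rather than a fix for the underlying certificate setup.

```python
import contextlib
import ssl


@contextlib.contextmanager
def unverified_https():
    """Temporarily make default HTTPS connections skip certificate checks.

    Hypothetical helper for illustration. It swaps the private hook that
    urllib uses to build its default SSL context, then restores it on exit.
    """
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Always restore verification, even if the download fails.
        ssl._create_default_https_context = original


# Example use (needs NLTK installed and network access):
# import nltk
# with unverified_https():
#     nltk.download("punkt")
```

Limiting the change to a `with` block keeps the rest of the process on verified HTTPS.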

thomassbooth commented 12 months ago

A manual NLTK data install was needed.