Open Bhavin1996 opened 3 years ago
Hey, make sure you have installed the correct versions: spacy==2.3.5 and en_core_web_sm==2.3.1. See this Colab notebook: https://colab.research.google.com/drive/1p6rhi9g0ughtGBojnCJcPVRRNqziuk3K?usp=sharing
I have run python -m spacy validate and confirmed that the spacy version is 2.3.5 and the en_core_web_sm version is 2.3.1. When I run from resume_parser import resumeparse, I get UserWarning [W031], which says: Model 'en_training' (0.0.0) requires spaCy v2.2 and is incompatible with spaCy 2.3.5.
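For anyone who wants to double-check the pins from inside the same interpreter that imports resume_parser, here is a minimal sketch. It assumes Python 3.8+ (for importlib.metadata) and that the installed distributions are registered under the names spacy and en_core_web_sm; the helper find_mismatches is made up for illustration and is not part of resume_parser or spaCy.

```python
from importlib import metadata

# Pins quoted in this thread; the distribution names are an assumption.
EXPECTED = {"spacy": "2.3.5", "en_core_web_sm": "2.3.1"}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (wanted, found)} for every package that is off-pin."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # not installed in this interpreter
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

If nothing is printed, both packages match the pins in the environment that ran the script.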
I have run into this issue too. It works fine in Colab, apart from some warnings, but when I run it on my Ubuntu server it gets stuck after the warning.
Hey @Jeyandranath, can you please share some logs from the point where the process gets stuck? Can you also share the resume it gets stuck on?
Tested on Windows. It works fine, with the warning below:
UserWarning: [W031] Model 'en_training' (0.0.0) requires spaCy v2.2 and is incompatible with the current spaCy version (2.3.5). This may lead to unexpected results or runtime errors. To resolve this, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
warnings.warn(warn_msg)
data = resumeparse.read_file('hello.pdf')
2021-03-21 00:40:45,448 [MainThread ] [INFO ] Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server/1.24/tika-server-1.24.jar to C:\Users\CHARUJ~1\AppData\Local\Temp\tika-server.jar.
2021-03-21 00:41:16,323 [MainThread ] [INFO ] Retrieving http://search.maven.org/remotecontent?filepath=org/apache/tika/tika-server/1.24/tika-server-1.24.jar.md5 to C:\Users\CHARUJ~1\AppData\Local\Temp\tika-server.jar.md5.
2021-03-21 00:41:19,471 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2021-03-21 00:41:24,493 [MainThread ] [WARNI] Failed to see startup log message; retrying...
print(data)
{'email': 'bshravan85@hotmail.com', 'phone': '+91-98845-92980', 'name': 'SHRAVAN KUMAR', 'total_exp': 4, 'university': [], 'designition': ['finance analyst', 'operations tech', 'deputy manager'], 'degree': ['B.Com Degree'], 'skills': ['Known: Tamil', ' English', ' and Tulu', 'Present Address: 22 Vijayalakshmi Avenue', 'Poonamallee', ' Chennai-56'], 'Companies worked at': ['92980', 'SAP', 'Hyundai Motor India Ltd', 'Hyundai Motor India Ltd.']}
On Ubuntu, the process gets stuck after the warning below. The resume is hello.pdf.
UserWarning: [W031] Model 'en_training' (0.0.0) requires spaCy v2.2 and is incompatible with the current spaCy version (2.3.5). This may lead to unexpected results or runtime errors. To resolve this, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
warnings.warn(warn_msg)
I think Java is the issue...
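The Windows log above shows a tika-server jar being downloaded and then "Failed to see startup log message; retrying...", and a jar needs a Java runtime to start, so a missing or broken Java install is one plausible cause of the hang. A minimal sketch for checking that, assuming Java is expected on PATH (the helper name java_path is made up for illustration):

```python
import shutil
import subprocess


def java_path(which=shutil.which):
    """Return the path of the `java` executable, or None if it is not on PATH."""
    return which("java")


if __name__ == "__main__":
    java = java_path()
    if java is None:
        print("No `java` found on PATH; the Tika server jar cannot be started.")
    else:
        # Many JVMs print their version banner to stderr rather than stdout.
        done = subprocess.run(
            [java, "-version"], capture_output=True, text=True, timeout=30
        )
        print((done.stderr or done.stdout).strip())
```

This only confirms that a Java runtime is reachable. It does not prove that Java is the cause of the hang reported here.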
There is no file called config.cfg in the path resume_parser\degree\model\, not even in the GitHub repository. What are the contents of config.cfg?
Yep, same problem here within a Python 3.8 virtual environment (I followed the official installation instructions from here):
>>> from resume_parser import resumeparse
/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py:715: UserWarning: [W094] Model 'en_training' (0.0.0) specifies an under-constrained spaCy version requirement: >=2.2.4. This can lead to compatibility problems with older versions, or as new spaCy versions are released, because the model may say it's compatible when it's not. Consider changing the "spacy_version" in your meta.json to a version range, with a lower and upper pin. For example: >=3.0.5,<3.1.0
warnings.warn(warn_msg)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/__init__.py", line 1, in <module>
from resume_parser.resumeparse import resumeparse
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/resumeparse.py", line 50, in <module>
custom_nlp2 = spacy.load(os.path.join(base_path,"degree","model"))
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/__init__.py", line 47, in load
return util.load_model(name, disable=disable, exclude=exclude, config=config)
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 324, in load_model
return load_model_from_path(Path(name), **kwargs)
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 388, in load_model_from_path
config = load_config(config_path, overrides=dict_to_dot(config))
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 545, in load_config
raise IOError(Errors.E053.format(path=config_path, name="config.cfg"))
OSError: [E053] Could not read config.cfg from /home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/degree/model/config.cfg
That config file does not actually exist at that location, but if it is located somewhere else, I can move it there. Where is it, and what should it contain?
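The traceback above shows spaCy looking for config.cfg inside the model directory, while the W094 warning says the model's meta.json declares spaCy >=2.2.4. That combination suggests (but does not prove) that the bundled model was saved with spaCy 2.x, which stores only meta.json, and is being loaded by spaCy 3.x. A minimal sketch for inspecting a model directory without importing spaCy; the helper describe_model_dir is hypothetical:

```python
import json
from pathlib import Path


def describe_model_dir(model_dir):
    """Summarise which metadata files a saved spaCy pipeline directory contains."""
    model_dir = Path(model_dir)
    meta_path = model_dir / "meta.json"
    meta = {}
    if meta_path.is_file():
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    return {
        "has_meta_json": meta_path.is_file(),
        "has_config_cfg": (model_dir / "config.cfg").is_file(),
        # Version range the model itself declares, e.g. ">=2.2.4".
        "declared_spacy_version": meta.get("spacy_version"),
    }
```

Pointing it at site-packages/resume_parser/degree/model should show whether config.cfg is present and which spaCy range the model declares.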
After some experiments, I managed to find the config.cfg file inside my virtual environment (it was located inside ~/.virtualenvs/rsm/lib/python3.8/site-packages/en_core_web_sm/en_core_web_sm-3.0.0), so I copied it to the folder required by resume_parser. That solved the previous error, but another one appears:
>>> from resume_parser import resumeparse
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/__init__.py", line 1, in <module>
from resume_parser.resumeparse import resumeparse
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/resume_parser/resumeparse.py", line 50, in <module>
custom_nlp2 = spacy.load(os.path.join(base_path,"degree","model"))
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/__init__.py", line 47, in load
return util.load_model(name, disable=disable, exclude=exclude, config=config)
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 324, in load_model
return load_model_from_path(Path(name), **kwargs)
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 390, in load_model_from_path
return nlp.from_disk(model_path, exclude=exclude)
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/language.py", line 1863, in from_disk
util.from_disk(path, deserializers, exclude)
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/util.py", line 1174, in from_disk
reader(path / key)
File "/home/bartoli/.virtualenvs/rsm/lib/python3.8/site-packages/spacy/language.py", line 1849, in <lambda>
deserializers["tokenizer"] = lambda p: self.tokenizer.from_disk(
File "spacy/tokenizer.pyx", line 740, in spacy.tokenizer.Tokenizer.from_disk
File "spacy/tokenizer.pyx", line 803, in spacy.tokenizer.Tokenizer.from_bytes
File "spacy/tokenizer.pyx", line 570, in spacy.tokenizer.Tokenizer._load_special_cases
File "spacy/tokenizer.pyx", line 589, in spacy.tokenizer.Tokenizer._validate_special_case
ValueError: [E1005] Unable to set attribute 'POS' in tokenizer exception for ' '. Tokenizer exceptions are only allowed to specify ORTH and NORM.
This is harder to understand... do you have any suggestions?
Hey, please make sure your installed versions match these requirements: spacy==2.3.5 and en_core_web_sm==2.3.1. config.cfg is a spaCy configuration file; it is downloaded when the en_core_web_sm package is installed. I will try to update the model when I get some time. Thanks.
I have the same problem. I installed the library following the requirements, but it doesn't work for me.
I have hit the same issue: the runtime gets stuck while importing resume_parser (with spacy 2.3.5 and en_core_web_sm 2.3.1). Even the Colab notebook gets stuck at the same step. Could you fix this, or let us know what causes it?
I have the same issue. Do you have any suggestions, please?
I have also encountered this. Can you please check locally by installing the same way as in the Colab notebook? I will fix it when I get time.
Hey guys, I have solved it in the Colab notebook. If you want to install it locally, please follow the steps below.
pip install resume-parser
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz
pip install importlib-metadata==3.2.0
Now you can use the library.
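Some of the reports above ended up with a different spaCy or model version than intended, which can happen when pip installs into another environment than the one running the code. A minimal sketch, assuming you want to run the same three installs through the current interpreter; the helpers pip_commands and install_all are made up for illustration:

```python
import subprocess
import sys

# Requirements copied from the steps above, in the same order.
REQUIREMENTS = [
    "resume-parser",
    "https://github.com/explosion/spacy-models/releases/download/"
    "en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz",
    "importlib-metadata==3.2.0",
]


def pip_commands(requirements, python=sys.executable):
    """Build one `python -m pip install <requirement>` command per requirement."""
    return [[python, "-m", "pip", "install", req] for req in requirements]


def install_all(requirements):
    """Run the installs in order with the interpreter that runs this script."""
    for command in pip_commands(requirements):
        # check=True stops at the first failing install instead of continuing.
        subprocess.run(command, check=True)
```

Calling install_all(REQUIREMENTS) from inside the target virtual environment performs the same steps as the three pip commands, but always against that environment's interpreter.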
I had some trouble understanding the steps correctly, so here are my additions to @kbrajwani's comments.
Thanks, @sz332, for sharing your experience.
OSError: [E053] Could not read config.cfg from C:\Users\bhavi\AppData\Local\Programs\Python\Python39\lib\site-packages\resume_parser\degree\model\config.cfg