nestauk / ojd_daps_skills

Nesta's Skills Extractor Library
https://nestauk.github.io/ojd_daps_skills
MIT License

JobNER object has no attribute nlp #219

Closed avkhimen closed 9 months ago

avkhimen commented 9 months ago

Hello,

After installing with:

pip install ojd-daps-skills
python -m spacy download en_core_web_sm

and running:

>>> from ojd_daps_skills.pipeline.extract_skills.extract_skills import ExtractSkills
>>> es = ExtractSkills(config_name="extract_skills_toy", local=True)
>>> es.load()

I get the following error:

2024-02-14 20:49:07,185 - SkillsExtractor - INFO - Loading the model from a local location (ner_spacy.py:507)
2024-02-14 20:49:07,186 - SkillsExtractor - INFO - Loading the model from /home/ubuntu/.local/lib/python3.8/site-packages/ojd_daps_skills_data/outputs/models/ner_model/20220825/ (ner_spacy.py:510)
2024-02-14 20:49:07,188 - SkillsExtractor - INFO - Model not found locally - you may need to download it from S3 (set s3_download to True) (ner_spacy.py:516)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ojd_daps_skills/pipeline/extract_skills/extract_skills.py", line 147, in load
    self.nlp = self.job_ner.load_model(self.ner_model_path, s3_download=self.s3)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/ojd_daps_skills/pipeline/skill_ner/ner_spacy.py", line 519, in load_model
    return self.nlp
AttributeError: 'JobNER' object has no attribute 'nlp'

Please help.

avkhimen commented 9 months ago

Was able to solve with

pip install git+https://github.com/nestauk/ojd_daps_skills.git@dev

as discussed in previous issue post.
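
For anyone landing here later: the log and traceback above are consistent with a loader that only assigns `self.nlp` when the model directory is found, and then returns `self.nlp` either way. Below is a minimal sketch of that failure mode. `ToyNER` and `load_model_checked` are hypothetical names made up for illustration; this is not the library's `JobNER` code, only an assumption about the general pattern.

```python
import os


class ToyNER:
    """Hypothetical stand-in for a model loader (not the library's JobNER)."""

    def load_model(self, model_path):
        if os.path.isdir(model_path):
            # Only this branch ever assigns self.nlp.
            self.nlp = f"model loaded from {model_path}"
        # If the directory was missing, self.nlp was never assigned, so this
        # line raises AttributeError rather than a "model not found" error.
        return self.nlp


def load_model_checked(ner, model_path):
    """Fail early with a clearer error when the model directory is missing."""
    if not os.path.isdir(model_path):
        raise FileNotFoundError(f"No model directory at {model_path}")
    return ner.load_model(model_path)


if __name__ == "__main__":
    missing_path = "/nonexistent/ner_model/"  # assumed not to exist
    try:
        ToyNER().load_model(missing_path)
    except AttributeError as err:
        print(f"AttributeError: {err}")
    try:
        load_model_checked(ToyNER(), missing_path)
    except FileNotFoundError as err:
        print(f"FileNotFoundError: {err}")
```

If this is what is happening, the real problem is that the model files are missing from the path shown in the log, so checking whether that directory exists is a quick first diagnostic.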

lizgzil commented 9 months ago

thanks for the update @avkhimen - we thought the new release would have solved this issue

ddeisadze commented 8 months ago

Hi, I am still having this issue with a local build. It is the exact same error. What is the fix? @lizgzil

lizgzil commented 8 months ago

hi @ddeisadze - sorry to hear that - I assume you are also using pip install git+https://github.com/nestauk/ojd_daps_skills.git@dev ?

ddeisadze commented 7 months ago

@lizgzil yes, but after I updated, it no longer returns any skills :/

ddeisadze commented 7 months ago

@lizgzil quick question, is there any way you can explain how the NER works on the backend? I would love to create a new model with ESCO taxonomy. Especially for hard skills see https://esco.ec.europa.eu/sites/default/files/Python%20%28computer%20programming%29.json

lizgzil commented 7 months ago

@ddeisadze - so can I confirm that now you don't have the AttributeError: 'JobNER' object has no attribute 'nlp' error but when you apply the NER model there are no skills extracted in the output?

Please could you provide an example of the text you are trying to extract skills from and/or any errors, so I can troubleshoot?

lizgzil commented 7 months ago

@ddeisadze the NER model is described in this documentation. The team spent time labelling skills in job adverts and then trained the model with this data. After the NER model is applied to a job advert and skills are extracted, the skills are then mapped to the ESCO taxonomy.

So skills will be extracted regardless of whether they are in the ESCO taxonomy; they are only mapped to ESCO skills in the standardisation step. For example, "React" isn't in ESCO but will still be extracted as a skill, and it then maps to the ESCO skill "use scripting programming".
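
The two-step idea (extract first, map to a taxonomy second) can be sketched roughly as below. This is a toy illustration only: `extract_spans`, `map_to_taxonomy` and `TOY_TAXONOMY` are hypothetical names, the substring match stands in for the trained NER model, and the dictionary lookup stands in for the standardisation step. The library's actual extraction and matching logic is not shown here.

```python
# Toy taxonomy: lower-cased skill text -> taxonomy label.
# The single entry is the example mapping given in this thread.
TOY_TAXONOMY = {
    "react": "use scripting programming",
}


def extract_spans(text, candidate_spans):
    """Stand-in for NER: return candidates that appear in the text.

    Note that this step does not consult the taxonomy at all.
    """
    lowered = text.lower()
    return [span for span in candidate_spans if span.lower() in lowered]


def map_to_taxonomy(skills, taxonomy):
    """Stand-in for standardisation: look each extracted skill up.

    Skills with no taxonomy entry are kept and mapped to None.
    """
    return {skill: taxonomy.get(skill.lower()) for skill in skills}


if __name__ == "__main__":
    advert = "We need React and Kubernetes experience."
    skills = extract_spans(advert, ["React", "Kubernetes", "SQL"])
    print(skills)  # ['React', 'Kubernetes']
    print(map_to_taxonomy(skills, TOY_TAXONOMY))
```

The point of the sketch is that the extraction output does not depend on the taxonomy; swapping in a different taxonomy would only change the second step.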

So when you say "I would love to create a new model with ESCO taxonomy. Especially for hard skills", do you mean that the NER model should only pick out ESCO skills, or that it should only map skills to ESCO hard skills? For the former, we hope the current algorithm addresses this already, but for the latter you would probably need to create a custom taxonomy containing just the hard ESCO skills (the current taxonomy contains both soft and hard ESCO skills).

Socvest commented 4 months ago

> @ddeisadze - so can I confirm that now you don't have the AttributeError: 'JobNER' object has no attribute 'nlp' error but when you apply the NER model there are no skills extracted in the output?
>
> Please could you provide an example of the text you are trying to extract skills from and/or any errors, so I can troubleshoot?

Hello, I am still getting this error even after installing the package from git and running it locally. Any suggestions?