workforce-data-initiative / skills-ml

Data Processing and Machine learning methods for the Open Skills Project
https://workforce-data-initiative.github.io/skills-ml/

ValueError: substring not found #290

Open nawabhussain opened 5 years ago

nawabhussain commented 5 years ago

Any idea what is wrong here?

Traceback (most recent call last):
  File "/home/nawab/Documents/Project/skills-ml/examples/SkillExtractionEvaluation.py", line 60, in <module>
    candidate_skills = candidate_skills_from_sample(sample, skill_extractor)
  File "/home/nawab/Documents/Project/skills-ml/skills_ml/evaluation/skill_extractors.py", line 21, in candidate_skills_from_sample
    skill_extractor.candidate_skills(json.loads(line))
  File "/home/nawab/Documents/Project/skills-ml/skills_ml/algorithms/skill_extractors/section_extract.py", line 31, in candidate_skills
    spans_in_section = section_extract(self.section_regex, source_object['description'])
  File "/home/nawab/Documents/Project/skills-ml/skills_ml/algorithms/nlp/__init__.py", line 228, in section_extract
    start_index=unit.start_index + unit.text.index(stripped)
ValueError: substring not found

rareal commented 5 years ago

Having the exact same error...

khieunguyen commented 4 years ago

Edit line 262 in algorithms/nlp/__init__.py: replace _line = line.replace(bulletchar, '') with _line = line.replace(bulletchar, '', 1). Then it should work.
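A minimal sketch of why the count argument can matter, assuming the bullet character also appears later in the line (for example a hyphen bullet in front of a hyphenated word). The two helper functions are hypothetical stand-ins for the bullet-stripping step, not the actual skills-ml code. The call to line.index(...) mirrors the unit.text.index(stripped) call from the traceback.

```python
def strip_bullet_all(line, bulletchar):
    # Hypothetical stand-in: removes EVERY occurrence of the bullet character.
    return line.replace(bulletchar, '').strip()


def strip_bullet_first(line, bulletchar):
    # Hypothetical stand-in: removes only the first occurrence (count=1).
    return line.replace(bulletchar, '', 1).strip()


line = "- self-motivated team player"

stripped_all = strip_bullet_all(line, "-")      # inner hyphen is removed too
stripped_first = strip_bullet_first(line, "-")  # inner hyphen is kept

# str.index raises ValueError when the stripped text no longer
# appears verbatim in the original line.
try:
    line.index(stripped_all)
    all_found = True
except ValueError:
    all_found = False

# With count=1 the stripped text is still a substring of the original line.
first_offset = line.index(stripped_first)

print(all_found, first_offset)
```

Passing 1 as the third argument of str.replace only helps when the bullet character comes before any other occurrence of that character in the line. Other causes of a mismatch between the stripped text and the original would still raise the same error.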

NaghmehShahverdi commented 3 years ago

I have exactly the same problem. @khieunguyen, I made the change you mentioned above, but it didn't solve my problem. Is there any other solution? I get this error when I run the following code:

candidate_skills = candidate_skills_from_sample(sample, skill_extractor)

ValueError                                Traceback (most recent call last)
<ipython-input-37-6aa3fdbd4eb6> in <module>
     43 for skill_extractor in skill_extractors:
     44     print(f'Evaluating skill extractor {skill_extractor.name}')
---> 45     candidate_skills = candidate_skills_from_sample(sample, skill_extractor)
     46 
     47     computed_metrics = metrics_for_candidate_skills(

/opt/anaconda3/lib/python3.7/site-packages/Skills_ML-2.1.0-py3.7.egg/skills_ml/evaluation/skill_extractors.py in candidate_skills_from_sample(sample, skill_extractor, output_storage)
     19     for line in sample:
     20         all_candidate_skills.extend(
---> 21             skill_extractor.candidate_skills(json.loads(line))
     22         )
     23     if output_storage:

/opt/anaconda3/lib/python3.7/site-packages/Skills_ML-2.1.0-py3.7.egg/skills_ml/algorithms/skill_extractors/section_extract.py in candidate_skills(self, source_object)
     29         """
     30 
---> 31         spans_in_section = section_extract(self.section_regex, source_object['description'])
     32         for span in spans_in_section:
     33             logging.info('Yielding candidate skill %s', span)

/opt/anaconda3/lib/python3.7/site-packages/Skills_ML-2.1.0-py3.7.egg/skills_ml/algorithms/nlp/__init__.py in section_extract(section_regex, document)
    226             units_in_section.append(Span(
    227                 text=stripped,
--> 228                 start_index=unit.start_index + unit.text.index(stripped)
    229             ))
    230     return units_in_section

ValueError: substring not found
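
The traceback above loads skills_ml from an installed egg under site-packages, while the earlier report ran from a source checkout. One thing that may be worth checking is which copy of a module Python actually imports, since an edit to a different copy would have no effect. The sketch below uses only the standard library. The helper name module_path is hypothetical, and json is used as a stand-in module so the snippet runs without skills_ml installed.

```python
import importlib.util


def module_path(module_name):
    """Return the file path Python would load for module_name, or None."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


# 'json' is only a stand-in so the sketch runs anywhere. Substituting
# 'skills_ml.algorithms.nlp' (assumed to be installed) would show
# which copy of __init__.py is being imported.
path = module_path("json")
print(path)
```

If the printed path points into site-packages, that is the file the running code uses, and the change would need to be made there or the package reinstalled from the edited source.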