SmartDataAnalytics / HORUS-NER

HORUS: A framework to boost NLP tasks
Apache License 2.0

Search engine: metadata #55

Open diegoesteves opened 4 years ago

diegoesteves commented 4 years ago

I have found an issue that might be affecting the performance of the NER model. I am currently using a snippet of the webpage (returned by the Microsoft search engine) instead of the webpage body content. Check whether that is correct: currently, the DB just contains a field named HORUS_SEARCH_RESULT_TEXT.result_description. I don't recall whether I am scraping and saving the body somewhere else to later perform the text classification on it, instead of on the description only.
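
One quick way to check this could be to list the columns of the table and look for anything that resembles a stored page body. The sketch below is only a rough idea and rests on assumptions: that the DB is a SQLite file, that the table is named HORUS_SEARCH_RESULT_TEXT as in the field reference above, and that a body column (if it exists) would have "body", "content" or "html" in its name. The helper names `table_columns` and `find_body_columns` and the hint list are hypothetical, not part of HORUS.

```python
import sqlite3


def table_columns(conn, table="HORUS_SEARCH_RESULT_TEXT"):
    """Return the column names of `table` (empty list if the table does not exist)."""
    # PRAGMA statements do not accept bound parameters, so the table name is
    # interpolated directly; only pass trusted names here.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); index 1 is the name.
    return [row[1] for row in rows]


def find_body_columns(conn, table="HORUS_SEARCH_RESULT_TEXT",
                      hints=("body", "content", "html")):
    """Return columns whose name suggests they hold the full page text.

    The hints are guesses (assumption), not the actual HORUS schema.
    """
    return [
        col for col in table_columns(conn, table)
        if any(hint in col.lower() for hint in hints)
    ]


if __name__ == "__main__":
    # Toy in-memory schema with only the description field, mirroring the
    # situation described above; replace with sqlite3.connect(<path to the DB>).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE HORUS_SEARCH_RESULT_TEXT "
        "(id INTEGER PRIMARY KEY, result_description TEXT)"
    )
    print(table_columns(conn))
    print(find_body_columns(conn))
```

If `find_body_columns` comes back empty on the real DB, that would suggest only the search engine snippet is stored and the classifier never sees the page body.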