opensemanticsearch / open-semantic-etl

Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & Named Entity Recognition) & data enrichment (annotation) pipelines & ingestor to Solr or Elastic search index & linked data graph database
https://opensemanticsearch.org/etl
GNU General Public License v3.0

Where is spacy entity linking trained model? #98

Closed. StudyExchange closed this issue 4 years ago

StudyExchange commented 4 years ago

Where is the trained spaCy entity linking model? This project has an entity linking feature supported by spaCy, but I cannot find a spaCy entity linking model. As far as I know, a spaCy entity linking model has to be trained by the user; there is no officially released version. How did you implement the entity linking feature?

Mandalka commented 4 years ago

There are some officially released models, which are used here without changes.

Open Semantic ETL chooses a single model depending on the document language (the mapping of languages to models is in the ETL config, so you can configure your own models if you have trained some).
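For readers who want to picture the language-to-model mapping, here is a minimal sketch. The names `SPACY_MODELS_BY_LANGUAGE`, `DEFAULT_MODEL` and `choose_model` are hypothetical and are not taken from the Open Semantic ETL config; the model names are standard spaCy packages used only as examples.

```python
# Hypothetical mapping of document language codes to spaCy model names.
# The real ETL config may use different keys and a different structure.
SPACY_MODELS_BY_LANGUAGE = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
}

# Assumed fallback for languages without an entry in the mapping.
DEFAULT_MODEL = "en_core_web_sm"


def choose_model(language, models=SPACY_MODELS_BY_LANGUAGE, default=DEFAULT_MODEL):
    """Return exactly one model name for the given document language."""
    if not language:
        return default
    # Reduce tags such as "de-AT" or "en_US" to the primary language code.
    code = language.strip().lower().replace("_", "-").split("-")[0]
    return models.get(code, default)
```

Adding a self-trained model would then amount to adding one more entry to the mapping, for example a language code pointing to the package name or path of that model.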

The REST API is implemented in spacy-services. That REST API/service is used by our ETL plugin https://github.com/opensemanticsearch/open-semantic-etl/blob/master/src/opensemanticetl/enhance_ner_spacy.py
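As a rough illustration of how a client could talk to such a service, the sketch below posts a text and a model name as JSON and groups the returned entity spans by label. The service URL, the request fields (`text`, `model`) and the response fields (`start`, `end`, `label`) are assumptions for this example; check the spacy-services code and the ETL plugin for the actual interface.

```python
import json
import urllib.request


def fetch_entities(text, model, service_url):
    """POST text and model name to an NER service and return the parsed JSON.

    service_url is an assumption, e.g. the address of a locally running service.
    """
    payload = json.dumps({"text": text, "model": model}).encode("utf-8")
    request = urllib.request.Request(
        service_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def entities_by_label(text, spans):
    """Group entity surface strings by label.

    Assumes each span is a dict with character offsets "start" and "end"
    and an entity type under "label".
    """
    grouped = {}
    for span in spans:
        surface = text[span["start"]:span["end"]]
        values = grouped.setdefault(span["label"], [])
        if surface not in values:  # skip repeated mentions of the same string
            values.append(surface)
    return grouped
```

The result of `entities_by_label` is a plain dict of lists, which is a convenient shape for writing the extracted entities into per-type index fields.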

The standard spaCy models are installed by pip, according to the following requirements files, into the local Python library path that pip uses for installations:

https://github.com/explosion/spacy-services/blob/master/displacy/requirements.txt
https://github.com/opensemanticsearch/spacy-services/blob/master/displacy/requirements.txt