opensemanticsearch / open-semantic-etl

Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database
https://opensemanticsearch.org/etl
GNU General Public License v3.0

Reindex files on recrawl if additional plugins configured #87

Closed by opensemanticsearch 5 years ago

opensemanticsearch commented 5 years ago

If an additional plugin such as OCR has been enabled, files should be reindexed (run through ETL again) on recrawl, so that the analysis of the new plugin is also applied to documents that are already indexed.
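
The idea can be sketched as a small check at recrawl time. This is only an illustration, not the code from the project: the function names `missing_plugins` and `needs_reindex`, and the document fields `file_modified` and `etl_plugins`, are hypothetical and assume that each indexed document records which plugins have already processed it.

```python
def missing_plugins(indexed_doc, configured_plugins):
    """Return the configured plugins that have not yet processed this document.

    `indexed_doc` is a dict as it might come back from the index; the
    `etl_plugins` field is a hypothetical name used for this sketch.
    """
    already_run = set(indexed_doc.get("etl_plugins", []))
    # Keep the configured order so the remaining plugins run in sequence.
    return [p for p in configured_plugins if p not in already_run]


def needs_reindex(file_mtime, indexed_doc, configured_plugins):
    """Decide whether a file has to go through ETL again on recrawl."""
    if indexed_doc is None:
        # Never indexed before: always process.
        return True
    if file_mtime > indexed_doc.get("file_modified", 0):
        # File changed since it was last indexed.
        return True
    # File unchanged: reindex only if a newly enabled plugin has not run yet.
    return bool(missing_plugins(indexed_doc, configured_plugins))


if __name__ == "__main__":
    # Plugin names here are illustrative strings only.
    doc = {"file_modified": 100, "etl_plugins": ["text_extraction"]}
    print(needs_reindex(100, doc, ["text_extraction"]))
    print(needs_reindex(100, doc, ["text_extraction", "ocr"]))
    print(missing_plugins(doc, ["text_extraction", "ocr"]))
```

With this kind of check, an unchanged file is skipped as before, but enabling OCR afterwards makes the same file eligible again because `ocr` is not in its recorded plugin list.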

Mandalka commented 5 years ago

Implemented by https://github.com/opensemanticsearch/open-semantic-etl/commit/7e725c9064314fc2881ef75e1eaeff40874c46c9