Closed. phish108 closed this issue 1 year ago.
This can be nicely integrated into the regular indexer via the message queue.
The algorithm will run all index terms against the reduced corpus of the added objects.
- We use only `indexer.add` for corpus indexing via the UI
- We use only `indexer.update`
- We use only `importer.object` for indexing newly imported objects
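One way to picture the message-queue integration is a small dispatcher keyed on the message names listed above. This is only a sketch under assumptions: the handler functions, the `dispatch` helper, and the message shape are hypothetical and not taken from the sdg-indexer code. `indexer.update` is left unmapped because its purpose is not stated here.

```python
from typing import Callable, Dict

# Hypothetical handlers; the real indexer functions are not shown in this thread.
def handle_full_index(message: dict) -> str:
    # Assumed: triggered from the UI, re-indexes the whole corpus.
    return "full"

def handle_imported_object(message: dict) -> str:
    # Assumed: indexes only the newly imported objects named in the message.
    return "incremental"

# Routing keys are the message names from the list above.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "indexer.add": handle_full_index,
    "importer.object": handle_imported_object,
}

def dispatch(routing_key: str, message: dict) -> str:
    """Call the handler registered for routing_key, or ignore the message."""
    handler = HANDLERS.get(routing_key)
    if handler is None:
        return "ignored"
    return handler(message)
```

With a table like this, adding the incremental indexer would amount to registering one more handler, without touching the full-corpus path.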
The sdg-indexer operates on the entire corpus. This is highly inefficient if only a few new objects are added to the corpus. In those cases, a variation of the initial indexer should operate only on those few objects.
The algorithm is pretty straightforward:
`anyofterms` in the keyword field. This reduces the search space of potential keywords to those that appear in the infoObject already.
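The idea of matching index terms only against the newly added objects can be sketched in memory as follows. This is a rough illustration, not the indexer's implementation: `any_of_terms` only approximates an `anyofterms`-style match (case-insensitive, any single term suffices), and the object shape (`id`, `keywords`) and the term table are assumptions made for the example.

```python
import re
from typing import Dict, List, Set

def tokenize(text: str) -> Set[str]:
    """Split text into lower-cased word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def any_of_terms(field_value: str, terms: str) -> bool:
    """Approximate an anyofterms match: true if any term occurs in the field."""
    return bool(tokenize(field_value) & tokenize(terms))

def match_new_objects(
    new_objects: List[dict],            # assumed shape: {"id": str, "keywords": str}
    index_terms: Dict[str, List[str]],  # assumed shape: label -> list of term strings
) -> Dict[str, List[str]]:
    """Run every index term against the reduced corpus of added objects only."""
    matches: Dict[str, List[str]] = {}
    for obj in new_objects:
        labels = [
            label
            for label, term_list in index_terms.items()
            if any(any_of_terms(obj["keywords"], terms) for terms in term_list)
        ]
        if labels:
            matches[obj["id"]] = sorted(labels)
    return matches

# Illustrative data only.
added = [
    {"id": "a", "keywords": "Clean water supply"},
    {"id": "b", "keywords": "machine learning"},
]
terms = {"SDG6": ["water sanitation"], "SDG13": ["climate change"]}
result = match_new_objects(added, terms)
```

The cost here scales with the number of added objects rather than with the size of the corpus, which is the point of the proposed variation.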