mikegoatly / lifti

A lightweight full text indexer for .NET
MIT License

Track source object type against a document's metadata #93

Closed by mikegoatly 9 months ago

mikegoatly commented 9 months ago

As part of the score boosting work for #72, it has become necessary to track which source object type each document was obtained from. Without this, we don't know which object type's scoring metadata to update when an item is removed.

Proposal:

  1. When using WithObjectTokenization, track an internal unique object type id, similar to how we uniquely track field ids.
  2. Update ItemMetadata (soon to be DocumentMetadata, see #92) to track the id of the object type it was stored for. For loose text documents, no object type id will be stored.
  3. Persist this information with the serialized index.
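
A rough sketch of how the three steps above could fit together. Every type and member name here (ObjectTypeRegistrySketch, DocumentMetadataSketch, ScoringMetadataSketch, DocumentMetadataSerializerSketch) is hypothetical and made up for illustration; none of them are LIFTI's actual classes. The byte-sized id and the binary layout are also assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the document metadata. ObjectTypeId is null for
// loose text documents, which have no source object type.
public sealed record DocumentMetadataSketch(int DocumentId, byte? ObjectTypeId);

// Step 1: hand out a unique id per object type (a byte-sized id is an assumption).
public sealed class ObjectTypeRegistrySketch
{
    private readonly Dictionary<Type, byte> ids = new();

    // Returns the id already assigned to the type, or assigns the next free one.
    public byte GetOrAssignId(Type objectType)
    {
        if (!ids.TryGetValue(objectType, out var id))
        {
            if (ids.Count > byte.MaxValue)
                throw new InvalidOperationException("Too many object types registered.");
            id = (byte)ids.Count;
            ids.Add(objectType, id);
        }
        return id;
    }
}

// Step 2: because each document records its object type id, removal can
// update the statistics for the right object type.
public sealed class ScoringMetadataSketch
{
    private readonly Dictionary<int, DocumentMetadataSketch> documents = new();
    private readonly Dictionary<byte, int> documentCounts = new();

    public int CountFor(byte objectTypeId) =>
        documentCounts.TryGetValue(objectTypeId, out var count) ? count : 0;

    public void Add(DocumentMetadataSketch doc)
    {
        documents.Add(doc.DocumentId, doc);
        if (doc.ObjectTypeId is byte typeId)
            documentCounts[typeId] = CountFor(typeId) + 1;
    }

    // Returns false if the document id is unknown.
    public bool Remove(int documentId)
    {
        if (!documents.Remove(documentId, out var doc))
            return false;
        if (doc.ObjectTypeId is byte typeId)
            documentCounts[typeId]--;
        return true;
    }
}

// Step 3: persist the object type id alongside the document id.
// A flag byte marks whether an id follows (this layout is an assumption).
public static class DocumentMetadataSerializerSketch
{
    public static void Write(BinaryWriter writer, DocumentMetadataSketch doc)
    {
        writer.Write(doc.DocumentId);
        writer.Write(doc.ObjectTypeId.HasValue);
        if (doc.ObjectTypeId is byte typeId)
            writer.Write(typeId);
    }

    public static DocumentMetadataSketch Read(BinaryReader reader)
    {
        var documentId = reader.ReadInt32();
        byte? objectTypeId = reader.ReadBoolean() ? reader.ReadByte() : (byte?)null;
        return new DocumentMetadataSketch(documentId, objectTypeId);
    }
}
```

The nullable id keeps loose text documents out of the per-object-type statistics entirely, so removing one leaves every object type's counts untouched.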