epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0

Pull IndigoRecord attribute as the document ID for index updates in bingo-elastic #1987

Open rvaidya opened 3 weeks ago

rvaidya commented 3 weeks ago

Currently, there is no way to update existing records in Elasticsearch when calling ElasticRepository.indexRecords in bingo-elastic. Every call inserts new documents, so re-indexing the same records produces duplicates.

This PR uses the IndigoRecord internalID field as the document ID, so that indexing a record that already exists updates its document instead of inserting a new one.

rvaidya commented 3 weeks ago

Made the attribute used for the document ID configurable: you can either provide your own mapper from IndigoRecord to document ID (for example, to pull the ID from custom objects) or use IndigoRecord.internalID.

Also provided an optional implementation for auto-flushing the index.
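
To illustrate why a stable document ID removes the duplicates, here is a minimal, self-contained sketch. It does not use the bingo-elastic or Elasticsearch APIs: `Rec` is a hypothetical stand-in for IndigoRecord, `InMemoryIndex` is a hypothetical stand-in for an index, and the `idMapper` parameter is an assumed shape for the configurable mapper, not the signature from this PR. It relies only on the general Elasticsearch behavior that indexing without an ID generates a new one, while indexing with an existing ID replaces that document.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;

public class DocIdSketch {

    // Hypothetical stand-in for IndigoRecord; only the fields needed here.
    public static final class Rec {
        public final String internalID;
        public final String name;

        public Rec(String internalID, String name) {
            this.internalID = internalID;
            this.name = name;
        }
    }

    // Hypothetical in-memory stand-in for an index: document ID -> document body.
    public static final class InMemoryIndex {
        private final Map<String, String> docs = new LinkedHashMap<>();
        private final Function<Rec, String> idMapper;

        // A null mapper models "no ID supplied": a fresh ID is generated per document.
        public InMemoryIndex(Function<Rec, String> idMapper) {
            this.idMapper = idMapper;
        }

        public void indexRecords(List<Rec> records) {
            for (Rec r : records) {
                String id = (idMapper == null)
                        ? UUID.randomUUID().toString()
                        : idMapper.apply(r);
                // put() overwrites an existing entry with the same ID (an update).
                docs.put(id, r.name);
            }
        }

        public int size() {
            return docs.size();
        }

        public String get(String id) {
            return docs.get(id);
        }
    }

    public static void main(String[] args) {
        List<Rec> batch = Arrays.asList(new Rec("mol-1", "benzene"), new Rec("mol-2", "toluene"));

        // Generated IDs: indexing the same batch twice doubles the document count.
        InMemoryIndex generated = new InMemoryIndex(null);
        generated.indexRecords(batch);
        generated.indexRecords(batch);
        System.out.println("generated IDs: " + generated.size() + " documents");

        // internalID as the document ID: the second pass updates in place.
        InMemoryIndex stable = new InMemoryIndex(r -> r.internalID);
        stable.indexRecords(batch);
        stable.indexRecords(batch);
        System.out.println("internalID as ID: " + stable.size() + " documents");
    }
}
```

Passing a different function, such as `r -> "custom-" + r.name`, corresponds to the custom-mapper option: the caller decides which attribute identifies a document.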