rode / grafeas-elasticsearch

An implementation of the Grafeas storage backend based on Elasticsearch
Apache License 2.0

Specify document ids up front #79

Open alexashley opened 3 years ago

alexashley commented 3 years ago

Currently we have Elasticsearch generate a document id whenever a new occurrence, note, or project is created and then retrieve a single document by doing a query against one of its unique fields (e.g., occurrenceName).

Instead, we should set the document id up front using that unique field value, so a single document can be fetched with a get by id rather than a query.

A get by id is cheaper than a search, and because gets in Elasticsearch are realtime by default, this may also remove the need to refresh the index on every write.

It should also give us stronger uniqueness guarantees in the index itself, rather than trying to manage that in application code (like we currently do for note names).
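As a rough sketch of the idea (not how the backend is written today), the requests could look something like the following. It uses only the standard library and builds the raw REST calls: `PUT /<index>/_create/<id>` for a create-only write, which Elasticsearch rejects with a 409 if the id already exists, and `GET /<index>/_doc/<id>` for the read. The helper names, index name, and base URL are made up for illustration; the real code would presumably go through the Elasticsearch Go client instead.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
)

// createRequest builds a create-only write that uses the resource name as the
// document id. Hypothetical helper: baseURL and index are assumptions.
func createRequest(baseURL, index, name string, doc []byte) (*http.Request, error) {
	// Resource names such as "projects/foo/occurrences/bar" contain slashes,
	// so the name is escaped to keep it in a single path segment.
	u := fmt.Sprintf("%s/%s/_create/%s", baseURL, index, url.PathEscape(name))
	req, err := http.NewRequest(http.MethodPut, u, bytes.NewReader(doc))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// getRequest builds a get-by-id read for the same document, replacing the
// current query against a unique field.
func getRequest(baseURL, index, name string) (*http.Request, error) {
	u := fmt.Sprintf("%s/%s/_doc/%s", baseURL, index, url.PathEscape(name))
	return http.NewRequest(http.MethodGet, u, nil)
}

func main() {
	const base = "http://localhost:9200" // assumed local cluster
	const index = "grafeas-occurrences"  // hypothetical index name
	name := "projects/foo/occurrences/bar"

	create, err := createRequest(base, index, name, []byte(`{"name":"`+name+`"}`))
	if err != nil {
		panic(err)
	}
	get, err := getRequest(base, index, name)
	if err != nil {
		panic(err)
	}
	fmt.Println(create.Method, create.URL.EscapedPath())
	fmt.Println(get.Method, get.URL.EscapedPath())
}
```

With this shape, a duplicate note or occurrence name would surface as a conflict response from Elasticsearch on create, so the application would not need its own existence check.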

There's a downside: with auto-generated ids, Elasticsearch can skip the check for an existing document with the same id, so supplying our own ids adds some overhead when creating a document. We may want to add some basic benchmark tests to see how large the performance impact is.