Currently the process for adding entries goes as follows:
1. Add them to MariaDB
2. Commit
3. Add them to Elasticsearch
The problem is that if the backend crashes while adding the entries to Elasticsearch, we will have entries in MariaDB but not in Elasticsearch. (Note this can't be solved by swapping steps 2 and 3: if we crash while adding entries to Elasticsearch, or right after but before the commit, we have the opposite problem, entries in Elasticsearch but not in MariaDB.)
One option for solving this: create a new table in MariaDB of "recently added/deleted entries", written in the same transaction as the entries themselves so the two cannot diverge. To sync MariaDB to Elasticsearch, we apply each add/delete recorded in this table to the index, then (when everything is done) empty the table.
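A minimal sketch of the pending-table idea, to make the ordering concrete. Everything here is an assumption for illustration: `sqlite3` stands in for MariaDB, a plain dict stands in for the Elasticsearch index, and the table and function names (`pending_sync`, `add_entries`, `sync_pending`) are hypothetical.

```python
import sqlite3


def make_db():
    # sqlite3 stands in for MariaDB here (assumption for a runnable sketch).
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, body TEXT)")
    # Hypothetical "recently added/deleted entries" table.
    db.execute(
        "CREATE TABLE pending_sync ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, entry_id INTEGER, op TEXT)"
    )
    return db


def add_entries(db, entries):
    """Insert entries and record them as pending in ONE transaction."""
    with db:  # commits on success, rolls back if an exception is raised
        for entry_id, body in entries:
            db.execute(
                "INSERT INTO entries (id, body) VALUES (?, ?)", (entry_id, body)
            )
            db.execute(
                "INSERT INTO pending_sync (entry_id, op) VALUES (?, 'add')",
                (entry_id,),
            )


def sync_pending(db, index):
    """Apply pending changes to the index, then clear the ones applied."""
    rows = db.execute(
        "SELECT seq, entry_id, op FROM pending_sync ORDER BY seq"
    ).fetchall()
    for _seq, entry_id, op in rows:
        if op == "add":
            row = db.execute(
                "SELECT body FROM entries WHERE id = ?", (entry_id,)
            ).fetchone()
            if row is not None:
                index[entry_id] = row[0]  # overwriting is harmless on replay
        else:
            index.pop(entry_id, None)  # removing twice is harmless on replay
    if rows:
        # Only delete what we actually processed; rows added meanwhile stay.
        with db:
            db.execute("DELETE FROM pending_sync WHERE seq <= ?", (rows[-1][0],))
    return len(rows)
```

If the process dies anywhere inside `sync_pending` before the final `DELETE`, the pending rows are still there, so the next run simply applies them again.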
Another option: the index remembers what "revision" the repository was at when it was last synced (e.g., the max history_id in the entries table). Then to sync, we ask the database "give me all changes since this revision", and apply those additions/removals to the index.
If the backend crashes during this process, it's OK: adding or removing the same entry twice leaves the index in the same state, so it just means that we will add/remove the synced entries again when we restart the backend.
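The revision-based option could look roughly like the sketch below. This is not the real schema: it assumes each row of the entries table is one change with an auto-incrementing `history_id` and a `deleted` flag, uses `sqlite3` in place of MariaDB, and models the index as a small hypothetical `Index` class holding the documents plus the last synced revision.

```python
import sqlite3


class Index:
    """Stand-in for the Elasticsearch index (assumption for this sketch)."""

    def __init__(self):
        self.docs = {}
        self.revision = 0  # highest history_id already applied


def make_repo():
    db = sqlite3.connect(":memory:")
    # Assumed layout: one row per change, ordered by history_id.
    db.execute(
        "CREATE TABLE entries ("
        "history_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "entry_id INTEGER, body TEXT, deleted INTEGER DEFAULT 0)"
    )
    return db


def record_change(db, entry_id, body=None, deleted=False):
    with db:  # one committed transaction per change
        db.execute(
            "INSERT INTO entries (entry_id, body, deleted) VALUES (?, ?, ?)",
            (entry_id, body, int(deleted)),
        )


def sync(db, index):
    """Apply all changes newer than index.revision, then advance it."""
    rows = db.execute(
        "SELECT history_id, entry_id, body, deleted FROM entries "
        "WHERE history_id > ? ORDER BY history_id",
        (index.revision,),
    ).fetchall()
    for _history_id, entry_id, body, deleted in rows:
        if deleted:
            index.docs.pop(entry_id, None)
        else:
            index.docs[entry_id] = body
    if rows:
        # Advance the revision last: a crash before this line only means
        # the same changes are replayed on the next sync.
        index.revision = rows[-1][0]
    return len(rows)
```

The key design choice is that the revision is stored with the index and updated only after the changes are applied, so the database never needs to know how far the index has got.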