I noticed a bug where something is causing items to be removed from the Elasticsearch index. I believe the cause is either the script for importing HM articles or the script for processing them. My guess is that there is a problem with the way the river handles document modifications.
I investigated this further and I'm still puzzled about what's going on. Here are some observations:
I ran the HM import script and during the "checking database" stage the document count on the girderSearch page went from 157713 to 154710 when the page was refreshed. When the import script completed the count increased to 154863 (due to new documents that were imported).
Later on the count decreased to 154856 for no apparent reason.
I reindexed the database and the count went up to 155006.
Over the next hour I refreshed the search page a few times and the count varied seemingly at random: 154983, 154993, 154996, 154982.
There are 155034 items in the MongoDB collection being indexed.
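To put the numbers above side by side, here is a minimal sketch that computes how far each observed index count falls short of the collection size. It assumes the collection held 155034 items throughout, which the observations do not confirm. The labels and the `shortfalls` helper are only illustrative names.

```python
# Counts copied from the observations above; labels are informal.
# Assumption: the collection size stayed at 155034 the whole time.
MONGO_COUNT = 155034

observed = [
    ("after import", 154863),
    ("later, unexplained drop", 154856),
    ("after reindex", 155006),
    ("refresh 1", 154983),
    ("refresh 2", 154993),
    ("refresh 3", 154996),
    ("refresh 4", 154982),
]


def shortfalls(mongo_count, counts):
    """Return (label, missing) pairs: how many documents the index lacks."""
    return [(label, mongo_count - n) for label, n in counts]


for label, missing in shortfalls(MONGO_COUNT, observed):
    print(f"{label}: {missing} missing")
```

Under that assumption the index is never complete: even right after the reindex it is 28 documents short, and the later refreshes are between 38 and 52 short.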