If the file system where the index shards reside fails, the following happens in stroom.index.impl.IndexShardWriterCacheImpl#getWriterByShardKey:
It tries to open an existing shard.
If there is no matching shard record in the DB, or there is one but opening its shard writer throws an exception, a null writer is returned.
If a null writer is returned, it attempts to create a new shard record in the DB and then open a writer for this new shard.
If opening the writer on the new shard also fails (likely if there is a file system problem), the new shard is marked corrupt in the DB.
Subsequent threads will do the same thing resulting in many empty shards being created and marked corrupt.
The failure conditions need better handling, e.g. potentially not trying to create a new shard if opening one that is known to exist fails.
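One possible shape for that handling is sketched below. It is a simplified model, not the actual IndexShardWriterCacheImpl code: the names ShardOpenSketch, Shard, Writer, WriterOpener and ShardUnavailableException are all hypothetical, and an in-memory list stands in for the shard table in the DB. The point it illustrates is keeping "no shard record found" separate from "shard record found but the writer could not be opened", so that only the first case leads to a new shard being created.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Simplified, hypothetical model of the shard lookup. None of these types are Stroom classes.
class ShardOpenSketch {

    /** Stand-in for a shard record in the DB. */
    static final class Shard {
        final long id;
        boolean corrupt;
        Shard(long id) { this.id = id; }
    }

    /** Stand-in for an open shard writer. */
    static final class Writer {
        final Shard shard;
        Writer(Shard shard) { this.shard = shard; }
    }

    /** Stand-in for whatever opens a writer on the file system. */
    interface WriterOpener {
        Writer open(Shard shard) throws IOException;
    }

    /** Signals that a shard exists but cannot be opened right now. */
    static final class ShardUnavailableException extends RuntimeException {
        ShardUnavailableException(long shardId, Throwable cause) {
            super("Unable to open shard " + shardId, cause);
        }
    }

    private final List<Shard> shards = new ArrayList<>(); // stands in for the DB table
    private final WriterOpener opener;
    private long nextId = 1;

    ShardOpenSketch(WriterOpener opener) { this.opener = opener; }

    Shard createShard() {
        Shard shard = new Shard(nextId++);
        shards.add(shard);
        return shard;
    }

    List<Shard> shards() { return shards; }

    synchronized Writer getWriter() {
        Optional<Shard> existing = shards.stream().filter(s -> !s.corrupt).findFirst();
        if (existing.isPresent()) {
            try {
                return opener.open(existing.get());
            } catch (IOException e) {
                // The shard is known to exist, so this failure points at the file
                // system. Report it instead of creating a replacement shard.
                throw new ShardUnavailableException(existing.get().id, e);
            }
        }
        // No usable shard record at all, so creating one is the right response.
        Shard created = createShard();
        try {
            return opener.open(created);
        } catch (IOException e) {
            created.corrupt = true; // same outcome as described above for a new shard
            throw new ShardUnavailableException(created.id, e);
        }
    }

    public static void main(String[] args) {
        // Simulate a file system failure: every attempt to open a writer throws.
        ShardOpenSketch cache = new ShardOpenSketch(shard -> {
            throw new IOException("disk unavailable");
        });
        cache.createShard(); // a shard that is already known to exist
        for (int i = 0; i < 5; i++) {
            try {
                cache.getWriter();
            } catch (ShardUnavailableException e) {
                // A real caller might back off and retry here.
            }
        }
        System.out.println("shards=" + cache.shards().size());
    }
}
```

In this model, repeated calls during the outage leave the single existing shard in place and not marked corrupt. The case where no shard record exists and the file system is down would still create one corrupt shard per call, so that path would need its own handling.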