We really rely on our sorting algorithm in some of our subsystems, e.g. we try to place objects in the most appropriate shards, and nodes.
Problem
I could imagine some sort of scenario when adding a new node/shard breaks sorting (that is not bad by itself) and could lead to unexpected object removal/keeping in storage:
Broadcasting LOCK object was done when a container node was not expected to store an object and the node wrote locking info in the shard it thought should store the object. Change container node and shard order -> probability to write the object (and a potential TS for it) in another shard -> unexpected removal.
Shard was RO -> locking/TS info came to not expected shard, will not be handled correctly when an object comes to that shard -> unexpected removal.
Node was added after TS/LOCK objects creation, -> in the replication process that node could accept TS/LOCK objects before their objects and, therefore, could place them to a different shard -> unexpected removal.
Metabase resync without real TS/LOCK object storing will erase that info.
...
Ideas
I was always thinking that storing a real object (not its meta relations only) in all the participating nodes/shards isn't a critical overhead but it could save us from hard-to-predict bugs and I still think so.
Maybe it is the time for MoveTo operation implementation.
All the shards could store meta relations of a container.
Context
We really rely on our sorting algorithm in some of our subsystems, e.g. we try to place objects in the most appropriate shards, and nodes.
Problem
I could imagine some sort of scenario when adding a new node/shard breaks sorting (that is not bad by itself) and could lead to unexpected object removal/keeping in storage:
Ideas
MoveTo
operation implementation.