Closed by niccottrell 5 years ago
Attaching the plot_split_distribution output for _id: 1 as the shard key (default), and for fld0: 1, respectively. It looks like fld0 is not truly random.
I think we can pick a default shard key with better distribution.
@josefahmad - the shard key in POCDriver is explicitly designed and chosen to be optimal. Unlike a random shard key, which is inherently bad, it is supposed to be monotonically increasing from a low-cardinality set of seed points. POCDriver was actually designed initially to demonstrate this principle. The shard key is optimal for writing and also supports the internal mechanisms of POCDriver.
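To make "monotonically increasing from a low-cardinality set of seed points" concrete, here is a rough sketch in Python. It is only an illustration of the idea, not POCDriver's actual code: the class name, the (seed, sequence) tuple layout, and the choice of three seeds are all assumptions made up for the example.

```python
from itertools import count


class SeededKeyGenerator:
    """Hypothetical generator: one increasing counter per seed point."""

    def __init__(self, num_seeds):
        # A small, fixed set of seed points (low cardinality),
        # for example one per writer thread. This is an assumption.
        self._counters = [count() for _ in range(num_seeds)]

    def next_key(self, seed):
        # Keys are (seed, sequence) tuples; within a single seed
        # the sequence part only ever grows.
        return (seed, next(self._counters[seed]))


gen = SeededKeyGenerator(num_seeds=3)

# Interleave writes across the three seeds, four rounds in total.
keys = [gen.next_key(seed) for _ in range(4) for seed in range(3)]

# Group by seed to show that each seed's keys arrive in increasing order.
by_seed = {s: [k for k in keys if k[0] == s] for s in range(3)}
```

With keys shaped like this, each seed's inserts land at the tail of that seed's own range, so writes are spread over a handful of ranges instead of either one hot spot or fully random locations.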
The current _id is not a great shard key, but other options like the default fld0 aren't great either. Based on testing by @josefahmad, the fld0 values don't seem to be properly random.