Deprecate explicit distribution partition spec in services.xml (self-hosted Vespa)

vekterli commented 2 years ago

When using multiple content cluster groups in self-hosted Vespa, you are required to specify a distribution attribute that states how replicas are to be distributed across groups (see https://docs.vespa.ai/en/performance/sizing-examples.html). This is a legacy feature primarily designed for complex topologies of nested groups. That was more of a use case for the long-since removed VDS persistence provider, which did not support indexed search of any kind.

Rationale

Syntax

The config syntax itself is far from obvious and may elicit a number of heartfelt questions from a user ("what numbers should I use?", "wildcards go where? what do they mean?", "why do I need a wildcard at the end?", "can dogs look up?" etc). There is nothing else in services.xml that uses this particular brand of mini-DSL.

Semantics

The semantics are expertly (though of course not intentionally) engineered for maximum confusion and surprise (tm). In particular, the spec 2|* with 2 groups and global redundancy of 3 does not, in fact, put 2 replicas in the first configured group and 1 in the second configured group. It puts 2 replicas in a group, more specifically the group that scored the highest during ideal state computation for the data bucket in question. It then assigns the remaining replica to the next lower scoring group (and so on, for each configured group).

It is likely that an unsuspecting user trying this out would assume that their 2x sized group 0 would receive 2x the amount of data as their lesser sized group 1, only to be mildly bemused when they end up receiving (on average) the same amount of data, possibly blowing up some resource limits along the way.

This is so non-obvious that it effectively renders the feature moot by itself.
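
The surprise described above can be made concrete with a small simulation sketch. The hash-based group_score is a stand-in assumption and not Vespa's actual ideal state scoring, and assign_replicas and simulate are hypothetical helpers. Only the rule "the highest-scoring group for a bucket gets the first count, the next group gets the remainder" is taken from the description above.

```python
import hashlib


def group_score(bucket: int, group: int) -> int:
    """Stand-in per-bucket group score (assumption: NOT Vespa's real scoring)."""
    digest = hashlib.sha256(f"{bucket}:{group}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def assign_replicas(bucket: int, counts: list[int]) -> dict[int, int]:
    """Hand out replica counts to groups in descending score order.

    counts[0] goes to the highest-scoring group for this bucket, counts[1]
    to the next one, and so on. For the spec 2|* with redundancy 3 and
    2 groups, counts is [2, 1].
    """
    ranked = sorted(
        range(len(counts)),
        key=lambda group: group_score(bucket, group),
        reverse=True,
    )
    return dict(zip(ranked, counts))


def simulate(num_buckets: int, counts: list[int]) -> list[int]:
    """Total number of replicas each configured group ends up holding."""
    totals = [0] * len(counts)
    for bucket in range(num_buckets):
        for group, replicas in assign_replicas(bucket, counts).items():
            totals[group] += replicas
    return totals


if __name__ == "__main__":
    # Both groups should land near 15,000 replicas, not 20,000 vs 10,000.
    print(simulate(10_000, [2, 1]))
```

Because the winning group changes from bucket to bucket, each configured group holds about 1.5 replicas per bucket on average, regardless of the order in which the groups appear in the config.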

Usefulness

Indexed search and query load-balancing do not work well in highly heterogeneous setups, as they assume groups are fairly uniform in their performance. An explicit partition spec is therefore an enabler of anti-patterns.

Proposal

Vespa Cloud entirely hides the nitty-gritty details of the partition spec by enforcing that groups are homogeneous and that they have equal intra-group redundancy. This is a good thing; one of the core responsibilities of a distributed system is to shield the user from the fact that the world is cruel and that computers secretly hate us. We should do something similar for the common case in self-hosted as well.

Thus:

On Vespa 8:

If n > 1 groups are configured and no distribution is set, fail if redundancy is not evenly divisible by n. This is backwards compatible because we currently do require distribution to be set when multiple groups are present.
If distribution is explicitly set, emit a deprecation warning but otherwise allow what's configured.
Warn if nested groups are configured, and fail if distribution is not set for such a topology.
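
As a rough illustration, the Vespa 8 rules above could be sketched as a single validation function. This is not the actual config model code; validate_vespa8 and its parameters are hypothetical names, and the handling of nested groups assumes that "fail if distribution is not set" applies only when nested groups are configured.

```python
from typing import Optional


def validate_vespa8(
    num_groups: int,
    redundancy: int,
    distribution: Optional[str],
    nested_groups: bool,
) -> list[str]:
    """Return deprecation warnings; raise ValueError for rejected configs.

    Hypothetical sketch of the proposed Vespa 8 behavior, not real Vespa code.
    """
    warnings: list[str] = []

    if nested_groups:
        warnings.append("nested groups are deprecated")
        # Assumption: nested topologies still need an explicit spec.
        if distribution is None:
            raise ValueError("nested groups require distribution to be set")

    if distribution is not None:
        # Explicit spec: warn, but otherwise allow what's configured.
        warnings.append("explicit distribution is deprecated")
    elif num_groups > 1 and redundancy % num_groups != 0:
        raise ValueError(
            f"redundancy {redundancy} is not evenly divisible by {num_groups} groups"
        )

    return warnings
```

For example, validate_vespa8(2, 4, None, False) passes without warnings, while validate_vespa8(2, 3, None, False) is rejected because 3 replicas cannot be split evenly across 2 groups.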

On Vespa 9 and beyond:

Remove distribution entirely and enforce uniform group redundancy.
Remove support for arbitrarily nested group topologies.

Up for bonus debate:

Should we change the semantics of redundancy in self-hosted to be equal to that of Vespa Cloud (i.e. per-group redundancy instead of global)? This would probably also be a Good Thing, but needs to be done with care.
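
To illustrate what is at stake in that question, here is a small arithmetic sketch of the two interpretations, assuming uniform groups. The function names are hypothetical and only serve to show how the same number maps to different replica totals.

```python
def per_group_from_global(global_redundancy: int, num_groups: int) -> int:
    """Replicas per group when redundancy is a global total (current self-hosted)."""
    if global_redundancy % num_groups != 0:
        raise ValueError("global redundancy must be evenly divisible by group count")
    return global_redundancy // num_groups


def global_from_per_group(per_group_redundancy: int, num_groups: int) -> int:
    """Total replicas when redundancy is specified per group (Vespa Cloud style)."""
    return per_group_redundancy * num_groups
```

With 2 groups, a redundancy value of 2 means 1 replica per group under global semantics but 4 replicas in total under per-group semantics, which is why an unannounced switch could silently double the stored data.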

bratseth commented 2 years ago

Sounds great to me. I suggest changing the redundancy semantics to be like cloud when no distribution is set.

baldersheim commented 11 months ago

@vekterli Where are we on this one? Is everything for Vespa 8 done?

vekterli commented 11 months ago

@baldersheim nothing has happened here yet. But the spirit is strong and willing.

vespa-engine / vespa