Closed yuvikk closed 5 months ago
@vekterli maybe we should consider better guidance on the consequences of high distribution keys (ref the Slack conversation). The distribution algorithm slows down a lot with these, so maybe a configurable upper bound is better, with a good error message at deploy.
Yes, this particular use case (encoding host name patterns in the distribution keys) has been a recurring theme throughout the years. Doing so makes sense from an application modelling perspective, but makes the distribution algorithm give off blue smoke from burning CPU on generating pseudo-random numbers, and should therefore be discouraged.
As a start, we should certainly never allow deployments to pass validation when specifying distribution keys that exceed the internal type limits. Distribution keys are 16-bit integers internally, with UINT16_MAX treated as a special sentinel. So the valid distribution key range is never outside [0, UINT16_MAX - 1].
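A deploy-time range check along these lines could look roughly like the following. This is a minimal sketch in Python, not Vespa's actual validation code; `validate_distribution_keys` and `MAX_DISTRIBUTION_KEY` are hypothetical names, and the only facts taken from the comment above are the 16-bit width and the reserved UINT16_MAX sentinel.

```python
UINT16_MAX = 0xFFFF                    # 65535, reserved as a sentinel
MAX_DISTRIBUTION_KEY = UINT16_MAX - 1  # highest key that is usable


def validate_distribution_keys(keys):
    """Return the keys as a list if all are in range, else raise ValueError.

    Hypothetical helper: rejects any key outside [0, UINT16_MAX - 1] and
    names the offending values so the deploy error is actionable.
    """
    keys = list(keys)
    out_of_range = sorted(k for k in keys if not 0 <= k <= MAX_DISTRIBUTION_KEY)
    if out_of_range:
        raise ValueError(
            f"distribution-key values {out_of_range} are outside the valid "
            f"range [0, {MAX_DISTRIBUTION_KEY}]"
        )
    return keys
```

With such a check in place, the keys from the report (124001 and 124002) would be rejected at deploy rather than accepted and truncated or misinterpreted later.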
It would be fairly trivial to create a new version of the distribution algorithm that is O(|configured nodes|) rather than O(highest configured distribution key), but doing so in a backwards compatible manner is Complicated™️ at the best of times, which is the reason why it hasn't been done yet...
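To make the complexity difference concrete, here is a toy sketch in Python. It is not Vespa's distribution algorithm; both functions and their scoring schemes are assumptions made purely for illustration. `pick_sequential` draws one pseudo-random score per key slot from a per-bucket generator, so its cost grows with the highest configured key. `pick_hashed` derives each node's score independently from a hash of bucket and key, so its cost grows with the number of configured nodes.

```python
import hashlib
import random


def pick_sequential(bucket, node_keys):
    """Toy O(highest key) selection: returns (winning_key, draws).

    One score is drawn for every slot 0..max(node_keys), even for slots
    that have no configured node, because slot i's score depends on the
    generator having been advanced i times.
    """
    rng = random.Random(bucket)  # deterministic per bucket
    configured = set(node_keys)
    best_key, best_score, draws = -1, -1.0, 0
    for key in range(max(node_keys) + 1):
        score = rng.random()
        draws += 1
        if key in configured and score > best_score:
            best_key, best_score = key, score
    return best_key, draws


def pick_hashed(bucket, node_keys):
    """Toy O(|configured nodes|) selection: returns (winning_key, draws).

    Each node's score depends only on (bucket, key), so unconfigured
    slots cost nothing.
    """
    best_key, best_score, draws = -1, -1, 0
    for key in node_keys:
        digest = hashlib.sha256(f"{bucket}:{key}".encode()).digest()
        score = int.from_bytes(digest[:8], "big")
        draws += 1
        if score > best_score:
            best_key, best_score = key, score
    return best_key, draws
```

For two nodes with keys 60001 and 60002, the first variant performs 60003 draws per bucket and the second performs 2. The two variants generally place the same bucket on different nodes, which hints at why swapping the algorithm in a live cluster is hard to do compatibly.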
Two enhancements have been made to the application deployment logic to address this:
distribution-key documentation.
Consequently, I'm marking this issue as closed.
Describe the bug
The following change deployed successfully but crashed the entire Vespa cluster:
From:
To:
To Reproduce
Steps to reproduce the behavior: Deploy with high distribution keys such as 124001 and 124002.
Logs
Environment (please complete the following information):
Linux version 4.18.0-372.9.1.el8.x86_64 (mockbuild@dal1-prod-builder001.bld.equ.rockylinux.org) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)) #1 SMP Tue May 10 14:48:47 UTC 2022
Vespa version 8.320.68