dimitarvdimitrov opened 9 months ago
@beatkind tagging you because I don't think I can assign you without you participating in the issue. Can you assign yourself by any chance? From the GitHub docs:

> You can assign multiple people to each issue or pull request, including yourself, anyone who has commented on the issue or pull request, anyone with write permissions to the repository, and organization members with read permissions to the repository. For more information, see "Access permissions on GitHub."
@dimitarvdimitrov And I need to actively write here :) to be a participant - nope, I am not able to assign myself because I do not have any permissions inside the repo
@beatkind I saw you marked this as fixed by https://github.com/grafana/mimir/pull/7431. I'd like to keep this issue open until we also document the migration procedure. Migration is now technically possible, but it's not a very user-friendly process because users need to figure out the steps themselves.
@dimitarvdimitrov thanks for reopening, this was simply a mistake :) - I will add some documentation with my next PR
hey @beatkind @dimitarvdimitrov, I just tried to follow the "Migration in a nutshell" guide and I found 2 issues:

1. after I removed `preserveReplicas`, my distributor, querier and query-frontend (we don't run rulers) all got scaled to 1
How long after deploying the HPA did this happen? Was the HPA already working?
2. after a couple of minutes they scaled back up to normal, but the distributor was messed up: out of 18 replicas only one got traffic. The k8s services did list all of the pods correctly, and I didn't see errors in any of the pods. My guess is that the sudden scale-up messed up the ring/memberlist stuff
This looks like a problem with load balancing. The ring doesn't determine distributor load. Perhaps the reverse proxy in front of the distributors didn't get the update quickly enough (or DNS was cached, etc.). What reverse proxy are you using - is it the nginx from the chart? Did this resolve itself eventually?
- almost immediately (up to maybe a minute later), and yes the HPA was working

I'd expect that doing it after the `pollingInterval` of 10s is enough for KEDA's HPA to take over, but I can't be sure. We should retest this procedure before publishing the docs.
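For context on where that `pollingInterval` lives: it's a field on the KEDA `ScaledObject` resource that the chart renders. A minimal illustrative object is sketched below - the names, Prometheus address, and query are made up for illustration and are not the chart's exact output:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: mimir-querier            # illustrative name
spec:
  scaleTargetRef:
    name: mimir-querier          # the Deployment KEDA scales
  pollingInterval: 10            # seconds between metric polls (the 10s above)
  minReplicaCount: 3
  maxReplicaCount: 40
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090   # illustrative address
        query: sum(rate(cortex_request_duration_seconds_count[1m]))  # illustrative query
        threshold: "10"
```

Until KEDA creates the backing HPA from this object and the HPA performs its first scale, nothing owns the Deployment's `replicas` field.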
Good point, will validate it again with a test env
Just wanted to leave a note that we also encountered our replicas scaling down to 1 when we set `preserveReplicas` to `false` for the mimir querier in our test environment. KEDA had been deployed several days before, so it seems like step 2 of the migration isn't adequate (unless I'm missing something). We can't afford to have our queriers drop to 1 in production, so we would like some guidance on how to safely proceed. Or is KEDA autoscaling still too experimental, and should it be avoided in production?
```yaml
querier:
  replicas: 3
  kedaAutoscaling:
    enabled: true
    preserveReplicas: false
    minReplicaCount: 3
    maxReplicaCount: 40
```
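Before flipping `preserveReplicas` to `false`, one sanity check (a sketch, not an official chart procedure) is to confirm that the HPA controller has actually taken ownership of `.spec.replicas` in the Deployment's `managedFields`. The JSON below is a hand-made stand-in for that metadata; in a real cluster you would pipe `kubectl get deployment <name> -o json --show-managed-fields=true` into the same `jq` filter:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Illustrative managedFields metadata; real data comes from:
#   kubectl get deployment <name> -o json --show-managed-fields=true
MANIFEST='{
  "metadata": {
    "managedFields": [
      {
        "manager": "kube-controller-manager",
        "fieldsV1": {"f:spec": {"f:replicas": {}}}
      },
      {
        "manager": "helm",
        "fieldsV1": {"f:spec": {"f:template": {}}}
      }
    ]
  }
}'

# If the HPA controller (kube-controller-manager) owns f:spec.f:replicas,
# the HPA has taken over scaling, and dropping the chart-managed replicas
# field should not collapse the Deployment to 1 replica.
OWNER=$(echo "${MANIFEST}" | jq -r '
  .metadata.managedFields[]
  | select(.fieldsV1."f:spec"."f:replicas"? != null)
  | .manager')

echo "replicas is managed by: ${OWNER}"
```

If the owner is still the chart's deployment tooling (helm, kustomize-controller, etc.) rather than `kube-controller-manager`, removing `preserveReplicas` is what triggers the scale-down to 1.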
https://github.com/grafana/mimir/pull/7282 added autoscaling to Helm as an experimental feature. This issue is about adding support in the helm chart for a smooth migration and adding documentation for the migration.
Why do we need a migration?
Migrating to a Mimir cluster with autoscaling requires a few intermediate steps to ensure that there are no disruptions to traffic. The major risk is that enabling autoscaling also removes the `replicas` field from Deployments. If KEDA/HPA hasn't started autoscaling the Deployment, then k8s interprets a missing `replicas` field as meaning 1 replica, which can cause an outage.

Migration in a nutshell

1. Use the `distributor.kedaAutoscaling.preserveReplicas: true` field in the helm chart, which doesn't delete the `replicas` field from the rendered manifests (https://github.com/grafana/mimir/pull/7431).
2. With `preserveReplicas: true`, deploy the chart.
3. Remove `preserveReplicas` from `values.yaml` and deploy the chart.

Internal docs
I'm also pasting Grafana Labs-internal documentation that's specific to our deployment tooling with FluxCD. Perhaps it can be used by folks running FluxCD or as a starting point for proper docs:
remove_managed_replicas.sh
```bash
#!/usr/bin/env bash
set -euo pipefail

help() {
  echo "Usage: ./remove_managed_replicas.sh
    [ -c | --context ]
    [ -n | --namespace ]
    [ -o | --object ]
    [ -d | --dry-run ]
    [ -h | --help ]

Outputs a diff of changes made to the object.
"
  exit 2
}

VALID_ARGUMENTS=$#
if [ "$VALID_ARGUMENTS" -eq 0 ]; then
  help
fi

CONTEXT=""
NAMESPACE=""
OBJECT=""
DRY_RUN=false
DRY_RUN_ARG=""

while [ "$#" -gt 0 ]
do
  case "$1" in
    -c | --context )
      CONTEXT="--context=${2}"
      shift 2
      ;;
    -n | --namespace )
      NAMESPACE="$2"
      shift 2
      ;;
    -o | --object )
      OBJECT="$2"
      shift 2
      ;;
    -d | --dry-run )
      DRY_RUN=true
      DRY_RUN_ARG="--dry-run=server"
      shift 1
      ;;
    -h | --help)
      help
      ;;
    --)
      shift
      break
      ;;
    *)
      echo "Unexpected option: ${1}"
      help
      ;;
  esac
done

if [ -z "${NAMESPACE}" ]
then
  echo "Must supply a namespace."
  exit 1
fi

if [ -z "${OBJECT}" ]
then
  echo "Must supply a kubernetes object (such as \`-o Deployment/example\`)."
  exit 1
fi

KC="kubectl ${CONTEXT} --namespace=${NAMESPACE}"

BEFORE=$(${KC} get "${OBJECT}" -o yaml --show-managed-fields=true)
BEFORE_JSON=$(${KC} get "${OBJECT}" -o json --show-managed-fields=true)

INDEX=$(echo "${BEFORE_JSON}" | jq '.metadata.managedFields | map(.manager == "kustomize-controller") | index(true)')

# Check we can find the position of flux's entry in managedFields:
if ! [[ $INDEX =~ ^[0-9]+$ ]]
then
  echo "Unable to find \`kustomize-controller\` (flux) in the managedFields metadata for ${OBJECT}."
  echo "Has flux not run on this object before?"
  echo "This may happen if you have deployed the object manually."
  echo "It should be safe to continue removing the flux-ignore as the object was never managed by flux."
  exit 1
fi

# Check that `.spec.replicas` is set in the managedFields entry
CHECK=$(echo "${BEFORE_JSON}" | jq ".metadata.managedFields[${INDEX}].fieldsV1.\"f:spec\".\"f:replicas\"")
if [ "${CHECK}" = "null" ]
then
  echo "Unable to find \`.spec.replicas\` set in the managedFields metadata for \`kustomize-controller\`."
  echo "Has the field already been unset?"
  echo "This may happen if the HPA has already scaled the object."
  echo "It is safe to continue removing the flux-ignore on the object."
  exit 1
fi

AFTER=$(${KC} patch "${OBJECT}" -o yaml --show-managed-fields=true ${DRY_RUN_ARG} --type='json' -p "[{'op': 'remove', 'path': '/metadata/managedFields/${INDEX}/fieldsV1/f:spec/f:replicas'}]")

diff -u <(echo "${BEFORE}") <(echo "${AFTER}") || :

if [ "${DRY_RUN}" = true ]
then
  echo ""
  echo "Dry run only. No changes have been applied."
fi

exit 0
```
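The core of the script is the JSON-Patch `remove` op on the flux manager's `fieldsV1` entry. Its effect can be sketched offline with `jq` - the sample JSON below is a made-up stand-in for real managedFields metadata:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Made-up managedFields entry owned by flux's kustomize-controller
BEFORE='{"metadata":{"managedFields":[{"manager":"kustomize-controller","fieldsV1":{"f:spec":{"f:replicas":{},"f:template":{}}}}]}}'

# Same effect as the script's patch:
#   remove /metadata/managedFields/0/fieldsV1/f:spec/f:replicas
AFTER=$(echo "${BEFORE}" | jq -c 'del(.metadata.managedFields[0].fieldsV1."f:spec"."f:replicas")')

echo "${AFTER}"
```

After the patch, flux no longer claims ownership of `.spec.replicas` (while still owning the rest of the spec), so the HPA's scaling decisions are no longer fought over on the next flux reconciliation.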