Closed windy1 closed 6 months ago
Hello @windy1 , I can reproduce the issue, thanks for reporting it!
After checking the code, it looks like we still use the offset field in some places, although we don't use it for any calculation (just in some if statements). For example, here and here.
Are you willing to open a PR including another check for the sequence? I guess that we cannot get rid of the offset field as it's used by Az functions AFAIK, but we should check if offset AND sequence are empty before returning an error
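For illustration only, the check proposed above could look roughly like the sketch below. It is written in Python for brevity, and the names (`CheckpointError`, `is_empty`, `validate_checkpoint`) are hypothetical and not taken from the KEDA code base. It assumes both fields arrive as optional strings.

```python
from typing import Optional


class CheckpointError(ValueError):
    """Raised when a checkpoint carries no usable position at all."""


def is_empty(value: Optional[str]) -> bool:
    # Treat both a missing value (None) and a blank string as empty.
    return value is None or value.strip() == ""


def validate_checkpoint(offset: Optional[str], sequence_number: Optional[str]) -> None:
    # Reject the checkpoint only when neither field is populated,
    # so a null offset alone is no longer treated as an error.
    if is_empty(offset) and is_empty(sequence_number):
        raise CheckpointError("checkpoint has neither offset nor sequence number")
```

With this shape, a checkpoint that has a sequence number but a null offset passes validation, which is the case the newer SDK versions reportedly produce.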
Hi @JorTurFer I decided I would take a look, but I haven't been able to build the dev container successfully on my machine:
[2024-03-11T18:42:59.417Z] ERROR: failed to solve: process "/bin/sh -c apt-get update && apt-get -y install --no-install-recommends apt-utils dialog unzip 2>&1 && apt-get -y install git iproute2 procps lsb-release && go get -x -d github.com/stamblerre/gocode 2>&1 && go build -o gocode-gomod github.com/stamblerre/gocode && mv gocode-gomod $GOPATH/bin/ && go get -u -v github.com/mdempsky/gocode github.com/uudashr/gopkgs/cmd/gopkgs github.com/ramya-rao-a/go-outline github.com/acroca/go-symbols github.com/godoctor/godoctor golang.org/x/tools/cmd/guru golang.org/x/tools/cmd/gorename github.com/rogpeppe/godef github.com/zmb3/gogetdoc github.com/haya14busa/goplay/cmd/goplay github.com/sqs/goreturns github.com/josharian/impl github.com/davidrjenni/reftools/cmd/fillstruct github.com/fatih/gomodifytags github.com/cweill/gotests/... golang.org/x/tools/cmd/goimports golang.org/x/lint/golint
[2024-03-11T18:42:59.417Z] github.com/alecthomas/gometalinter 2>&1 github.com/mgechev/revive github.com/derekparker/delve/cmd/dlv 2>&1 && go install honnef.co/go/tools/cmd/staticcheck@latest && go install golang.org/x/tools/gopls@latest && PROTOC_VERSION=21.9 && if [ $(dpkg --print-architecture) = \"amd64\" ]; then PROTOC_ARCH=\"x86_64\"; else PROTOC_ARCH=\"aarch_64\" ; fi && curl -LO \"https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-$PROTOC_ARCH.zip\" && unzip \"protoc-${PROTOC_VERSION}-linux-$PROTOC_ARCH.zip\" -d $HOME/.local && mv $HOME/.local/bin/protoc /usr/local/bin/protoc && mv $HOME/.local/include/ /usr/local/bin/include/ && protoc --version && curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.55.2 && groupadd --gid $USER_GID $USERNAME && useradd -s /bin/bash --uid $USER_UID --gid $USER_GID -m $USERNAME && apt-get in
[2024-03-11T18:42:59.417Z] stall -y sudo && echo $USERNAME ALL=\\(root\\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME && chmod 0440 /etc/sudoers.d/$USERNAME && sudo install -m 0755 -d /etc/apt/keyrings && curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg && sudo chmod a+r /etc/apt/keyrings/docker.gpg && echo \"deb [arch=\"$(dpkg --print-architecture)\" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \"$(. /etc/os-release && echo \"$VERSION_CODENAME\")\" stable\" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null && sudo apt-get update && apt-get install -y docker-ce-cli && apt-get -y install python3-pip && python3 -m pip install --no-cache-dir --break-system-packages pre-commit && apt-get autoremove -y && apt-get clean -y && rm -rf /var/lib/apt/lists/*" did not complete successfully: exit code: 1
[2024-03-11T18:42:59.423Z] Stop (43323 ms): Run: docker buildx build --load --build-arg BUILDKIT_INLINE_CACHE=1 -f /var/folders/bg/dth_vb4s44g88qnk9g42r2vr0000gp/T/devcontainercli/container-features/0.56.2-1710182536098/Dockerfile-with-features -t vsc-keda-bde1e7825acddf40d31270b78aad0daad7b61f69f004dcf7a3d5ac01177433e8 --target dev_containers_target_stage --build-arg _DEV_CONTAINERS_BASE_IMAGE=dev_container_auto_added_stage_label /Users/wzs02/code/atrius/keda/.devcontainer
[2024-03-11T18:42:59.424Z] Error: Command failed: docker buildx build --load --build-arg BUILDKIT_INLINE_CACHE=1 -f /var/folders/bg/dth_vb4s44g88qnk9g42r2vr0000gp/T/devcontainercli/container-features/0.56.2-1710182536098/Dockerfile-with-features -t vsc-keda-bde1e7825acddf40d31270b78aad0daad7b61f69f004dcf7a3d5ac01177433e8 --target dev_containers_target_stage --build-arg _DEV_CONTAINERS_BASE_IMAGE=dev_container_auto_added_stage_label /Users/wzs02/code/atrius/keda/.devcontainer
[2024-03-11T18:42:59.425Z] at BtA (/Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js:465:1933)
[2024-03-11T18:42:59.425Z] at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
[2024-03-11T18:42:59.425Z] at async K0 (/Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js:464:1841)
[2024-03-11T18:42:59.425Z] at async yH (/Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js:464:610)
[2024-03-11T18:42:59.425Z] at async StA (/Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js:481:3660)
[2024-03-11T18:42:59.425Z] at async ZC (/Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js:481:4775)
[2024-03-11T18:42:59.425Z] at async trA (/Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js:614:11269)
[2024-03-11T18:42:59.425Z] at async erA (/Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js:614:11010)
[2024-03-11T18:42:59.429Z] Stop (44293 ms): Run: /Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper (Plugin).app/Contents/MacOS/Code Helper (Plugin) /Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js up --user-data-folder /Users/wzs02/Library/Application Support/Code/User/globalStorage/ms-vscode-remote.remote-containers/data --container-session-data-folder /tmp/devcontainers-9e07ae93-2d2b-4cef-9981-98e1be114e9a1710182534355 --workspace-folder /Users/wzs02/code/atrius/keda --workspace-mount-consistency cached --id-label devcontainer.local_folder=/Users/wzs02/code/atrius/keda --id-label devcontainer.config_file=/Users/wzs02/code/atrius/keda/.devcontainer/devcontainer.json --log-level debug --log-format json --config /Users/wzs02/code/atrius/keda/.devcontainer/devcontainer.json --default-user-env-probe loginInteractiveShell --mount type=volume,source=vscode,target=/vscode,external=true --skip-post-create --update-remote-user-uid-default on --mount-workspace-git-root
[2024-03-11T18:42:59.429Z] Exit code 1
[2024-03-11T18:42:59.432Z] Command failed: /Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper (Plugin).app/Contents/MacOS/Code Helper (Plugin) /Users/wzs02/.vscode/extensions/ms-vscode-remote.remote-containers-0.348.0/dist/spec-node/devContainersSpecCLI.js up --user-data-folder /Users/wzs02/Library/Application Support/Code/User/globalStorage/ms-vscode-remote.remote-containers/data --container-session-data-folder /tmp/devcontainers-9e07ae93-2d2b-4cef-9981-98e1be114e9a1710182534355 --workspace-folder /Users/wzs02/code/atrius/keda --workspace-mount-consistency cached --id-label devcontainer.local_folder=/Users/wzs02/code/atrius/keda --id-label devcontainer.config_file=/Users/wzs02/code/atrius/keda/.devcontainer/devcontainer.json --log-level debug --log-format json --config /Users/wzs02/code/atrius/keda/.devcontainer/devcontainer.json --default-user-env-probe loginInteractiveShell --mount type=volume,source=vscode,target=/vscode,external=true --skip-post-create --update-remote-user-uid-default on --mount-workspace-git-root
[2024-03-11T18:42:59.432Z] Exit code 1
OS: macOS Sonoma 14.3.1 Docker: latest
Nice catch! I've drafted a PR with a fix for the devcontainer image. Could you apply that change? Basically, you have to remove the line golang.org/x/tools/cmd/guru because it's deprecated.
> Are you willing to open a PR including another check for the sequence? I guess that we cannot get rid of the offset field as it's used by Az functions AFAIK, but we should check if offset AND sequence are empty before returning an error
Regarding this bit, I was toying around with this today and I eventually came to the conclusion that checking whether the offset, or the sequence number, is empty is wholly redundant, at least in a dotnet / Azure SDK context.
Unless I am misunderstanding here, there shouldn't ever exist checkpoints where the sequence number is empty, and as the code is written right now, this would fail anyway as it is expected to be an integer. Checkpoints for partitions that have never been checkpointed would simply not exist; thus, initialized checkpoints will always have a sequence number.
It's very possible I am missing information about the checkpointing implementations in other contexts, such as Azure Functions, as you mentioned, but as far as I can tell, this is an impossible scenario if you are creating checkpoints through the Azure SDK.
Yeah, that's the point I meant. The sequence number is the property used for all the calculations, but although the offset isn't used anywhere in the calculations, there are some if statements where the offset is used instead of the sequence number. We have to get rid of the offset usage in if statements in favor of the sequence number.
What I'm saying is that I think the section you linked before can be removed completely and not replaced by some alternative method.
I get your point, and I think you're right xD
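As a rough sketch of where this exchange lands, a backlog estimate that relies only on sequence numbers might look like the following. It is in Python for brevity and is not the scaler's actual code. The function name and parameters are hypothetical, and it assumes that sequence numbers start at 0 and that a partition with no checkpoint is represented as `None`.

```python
from typing import Optional


def unprocessed_events(last_enqueued_seq: int, checkpoint_seq: Optional[int]) -> int:
    """Estimate the backlog of one partition from sequence numbers only."""
    if checkpoint_seq is None:
        # No checkpoint exists for this partition yet. Assuming sequence
        # numbers start at 0, every enqueued event is still pending.
        return last_enqueued_seq + 1
    # Clamp at zero in case the checkpoint is ahead of stale partition info.
    return max(last_enqueued_seq - checkpoint_seq, 0)


print(unprocessed_events(99, None))  # 100
print(unprocessed_events(99, 59))    # 40
```

The offset never appears here, so a null offset in the stored checkpoint cannot change the result.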
Report
Related: https://github.com/Azure/azure-sdk-for-net/issues/42409
As of Azure.Messaging.EventHubs v5.11.0, the format of checkpoints written to blob storage has changed. Notably, the value of offset is null. In our case, this caused KEDA to over-scale a service. Downgrading the SDK to v5.10.0 resolved the issue.
In response to my original issue, a Microsoft contributor asserted that this change is intentional and that the implementation of checkpoints is not to be relied upon.
Please refer to the original issue for more in-depth details and reproduction steps.
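To make the failure mode concrete, here is a small hypothetical sketch (Python, not KEDA's actual code) of how a guard keyed on the offset could inflate the backlog once the offset becomes null. The `Checkpoint` shape, the field names, and the numbers are all assumptions chosen for illustration.

```python
from typing import Optional, TypedDict


class Checkpoint(TypedDict):
    offset: Optional[str]
    sequence_number: int


def backlog_with_offset_guard(cp: Checkpoint, last_enqueued_seq: int) -> int:
    # Hypothetical guard: an empty offset is read as "nothing processed yet",
    # so the whole partition is counted as backlog.
    if not cp["offset"]:
        return last_enqueued_seq + 1
    return last_enqueued_seq - cp["sequence_number"]


# Same consumer progress, written in the old and the new checkpoint format.
old_format: Checkpoint = {"offset": "4096", "sequence_number": 950}
new_format: Checkpoint = {"offset": None, "sequence_number": 950}

print(backlog_with_offset_guard(old_format, 999))  # 49
print(backlog_with_offset_guard(new_format, 999))  # 1000
```

Under these assumptions the consumer is equally far along in both cases, yet the reported backlog jumps from 49 to 1000, which would be consistent with the over-scaling described above.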
Expected Behavior
I expected KEDA to scale my service appropriately.
Actual Behavior
KEDA over-scaled the service until the Azure SDK was downgraded and the old checkpoint format restored.
Steps to Reproduce the Problem
Logs from KEDA operator
No response
KEDA Version
2.12.1
Kubernetes Version
None
Platform
Microsoft Azure
Scaler Details
Azure Event Hubs
Anything else?
No response