Open yonirab opened 1 month ago
My guess is that one of your workflows has non-ascii characters in it that are upsetting sqllite. It's unclear if they're actually an invalid unicode escape sequence.
Possibly in something like the description of the workflow.
Can you look at the workflows in the cluster and work out which one(s) it might be and attach (a redacted version) of them. Ideally in a zip or tar file so as not to lose any bad escape characters via copy and pasting into github.
any idea @jiachengxu @linzhengen @MasonM
https://www.postgresql.org/message-id/368156.1677514339%40sss.pgh.pa.us mentions JSONB is stricter than JSON
@tooptoop4 Good find, but the JSONB migration (https://github.com/argoproj/argo-workflows/pull/13779) wasn't in 3.5.11, so unless @yonirab has done the migration manually, that's not the cause.
My first thought was that this is a regression from https://github.com/argoproj/argo-workflows/pull/12912, since that changed the SELECT
queries to use JSON operators, which will fail with unsupported Unicode escape sequence
regardless of whether the data is stored as text
, json
, or jsonb
:
postgres=# insert into argo_archived_workflows (uid, instanceid, name, namespace, phase, clustername, workflow) values ('0b3ec561-bc78-400c-9772-ce11c69b3d65', '', 'dg92nw8qr4k', 'argo', 'Succeeded', 'default', '{ "key2": "\u0000", "key1": 1}');
INSERT 0 1
postgres=# select workflow::json from argo_archived_workflows;
workflow
--------------------------------
{ "key2": "\u0000", "key1": 1}
(1 row)
postgres=# select (workflow::json)->'key2' from argo_archived_workflows;
ERROR: unsupported Unicode escape sequence
DETAIL: \u0000 cannot be converted to text.
CONTEXT: JSON data, line 1: { "key2": "\u0000...
However, it appears that was released as v3.5.6.
i was able to reproduce this on master with postgres db and below workflow (click submit after creating the cron):
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
name: test-cron-wfnull
spec:
schedule: "* * * * *"
concurrencyPolicy: "Forbid"
startingDeadlineSeconds: 0
workflowSpec:
entrypoint: whalesay
templates:
- name: whalesay
container:
image: alpine:3.6
command: [sh, -c]
env: null
args: ["date; echo '\x00'"]
in my case, since its a new installation, the workflow would run but fail to get archived:
Query: INSERT INTO "argo_archived_workflows" ("clustername", "finishedat", "instanceid", "name", "namespace", "phase", "startedat", "uid", "workflow") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9) RETURNING "clustername", "uid"
controller: Arguments: []interface {}{"default", time.Date(2024, time.November, 13, 5, 57, 14, 0, time.Local), "", "test-cron-wfnull-1731437820", "argo", "Failed", time.Date(2024, time.November, 13, 5, 57, 0, 0, time.Local), "c61e460c-47d3-46b2-aa16-b6c28c728567", "{\"kind\":\"Workflow\",\"apiVersion\":\"argoproj.io/v1alpha1\",\"metadata\":{\"name\":\"test-cron-wfnull-1731437820\",\"namespace\":\"argo\",\"uid\":\"c61e460c-47d3-46b2-aa16-b6c28c728567\",\"resourceVersion\":\"96738\",\"generation\":5,\"creationTimestamp\":\"2024-11-12T18:57:00Z\",\"labels\":{\"workflows.argoproj.io/completed\":\"true\",\"workflows.argoproj.io/cron-workflow\":\"test-cron-wfnull\",\"workflows.argoproj.io/phase\":\"Failed\",\"workflows.argoproj.io/workflow-archiving-status\":\"Persisted\"},\"annotations\":{\"workflows.argoproj.io/pod-name-format\":\"v2\",\"workflows.argoproj.io/scheduled-time\":\"2024-11-13T05:57:00+11:00\"},\"ownerReferences\":[{\"apiVersion\":\"argoproj.io/v1alpha1\",\"kind\":\"CronWorkflow\",\"name\":\"test-cron-wfnull\",\"uid\":\"3352f4c0-a2f9-4dae-8d1e-7ce2647c3cbb\",\"controller\":true,\"blockOwnerDeletion\":true}],\"managedFields\":[{\"manager\":\"workflow-controller\",\"operation\":\"Update\",\"apiVersion\":\"argoproj.io/v1alpha1\",\"time\":\"2024-11-12T18:57:14Z\",\"fieldsType\":\"FieldsV1\",\"fieldsV1\":{\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:workflows.argoproj.io/pod-name-format\":{},\"f:workflows.argoproj.io/scheduled-time\":{}},\"f:labels\":{\".\":{},\"f:workflows.argoproj.io/completed\":{},\"f:workflows.argoproj.io/cron-workflow\":{},\"f:workflows.argoproj.io/phase\":{},\"f:workflows.argoproj.io/workflow-archiving-status\":{}},\"f:ownerReferences\":{\".\":{},\"k:{\\\"uid\\\":\\\"3352f4c0-a2f9-4dae-8d1e-7ce2647c3cbb\\\"}\":{}}},\"f:spec\":{},\"f:status\":{}}}]},\"spec\":{\"templates\":[{\"name\":\"whalesay\",\"inputs\":{},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"alpine:3.6\",\"command\":[\"sh\",\"-c\"],\"args\":[\"date; echo '\\u0000'\"],\"resources\":{}}}],\"entrypoint\":\"whalesay\",\"arguments\":{},\"activeDeadlineSeconds\":300,\"podSpecPatch\":\"terminationGracePeriodSeconds: 3\\n\"},\"status\":{\"phase\":\"Failed\",\"startedAt\":\"2024-11-12T18:57:00Z\",\"finishedAt\":\"2024-11-12T18:57:14Z\",\"estimatedDuration\":109,\"progress\":\"0/1\",\"message\":\"main: Error (exit code 1)\",\"nodes\":{\"test-cron-wfnull-1731437820\":{\"id\":\"test-cron-wfnull-1731437820\",\"name\":\"test-cron-wfnull-1731437820\",\"displayName\":\"test-cron-wfnull-1731437820\",\"type\":\"Pod\",\"templateName\":\"whalesay\",\"templateScope\":\"local/test-cron-wfnull-1731437820\",\"phase\":\"Failed\",\"message\":\"main: Error (exit code 1)\",\"startedAt\":\"2024-11-12T18:57:00Z\",\"finishedAt\":\"2024-11-12T18:57:10Z\",\"progress\":\"0/1\",\"resourcesDuration\":{\"cpu\":0,\"memory\":4},\"outputs\":{\"exitCode\":\"1\"},\"hostNodeName\":\"minikube\"}},\"conditions\":[{\"type\":\"PodRunning\",\"status\":\"False\"},{\"type\":\"Completed\",\"status\":\"True\"}],\"resourcesDuration\":{\"cpu\":0,\"memory\":4},\"artifactRepositoryRef\":{\"configMap\":\"artifact-repositories\",\"key\":\"default-v1\",\"namespace\":\"argo\",\"artifactRepository\":{\"archiveLogs\":true,\"s3\":{\"endpoint\":\"minio:9000\",\"bucket\":\"my-bucket\",\"insecure\":true,\"accessKeySecret\":{\"name\":\"my-minio-cred\",\"key\":\"accesskey\"},\"secretKeySecret\":{\"name\":\"my-minio-cred\",\"key\":\"secretkey\"}}}},\"artifactGCStatus\":{\"notSpecified\":true},\"taskResultsCompletionStatus\":{\"test-cron-wfnull-1731437820\":true}}}"}
controller: Stack:
controller: fmt.(*pp).handleMethods@/usr/local/go/src/fmt/print.go:673
controller: fmt.(*pp).printArg@/usr/local/go/src/fmt/print.go:756
controller: fmt.(*pp).doPrint@/usr/local/go/src/fmt/print.go:1208
controller: time="2024-11-13T05:57:14.660Z" level=debug msg="Patch pods 200"
controller: fmt.Append@/usr/local/go/src/fmt/print.go:289
controller: log.(*Logger).Print.func1@/usr/local/go/src/log/log.go:261
controller: log.(*Logger).output@/usr/local/go/src/log/log.go:238
controller: log.(*Logger).Print@/usr/local/go/src/log/log.go:260
controller: github.com/argoproj/argo-workflows/v3/persist/sqldb.(*workflowArchive).ArchiveWorkflow.func1@/redact/argo-workflows-main/persist/sqldb/workflow_archive.go:116
controller: github.com/argoproj/argo-workflows/v3/persist/sqldb.(*workflowArchive).ArchiveWorkflow@/redact/argo-workflows-main/persist/sqldb/workflow_archive.go:106
controller: github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).archiveWorkflowAux@/redact/argo-workflows-main/workflow/controller/controller.go:1202
controller: github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).archiveWorkflow@/redact/argo-workflows-main/workflow/controller/controller.go:1182
controller: time="2024-11-13T05:57:14.676Z" level=error msg="failed to archive workflow" error="failed to archive workflow: ERROR: unsupported Unicode escape sequence (SQLSTATE 22P05)" key=argo/test-cron-wfnull-1731437820
controller: github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).processNextArchiveItem@/redact/argo-workflows-main/workflow/controller/controller.go:926
controller: github.com/argoproj/argo-workflows/v3/workflow/controller.(*WorkflowController).runArchiveWorker@/redact/argo-workflows-main/workflow/controller/controller.go:822
controller: k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1@/redact/go/pkg/mod/k8s.io/apimachinery@v0.30.3/pkg/util/wait/backoff.go:226
controller: k8s.io/apimachinery/pkg/util/wait.BackoffUntil@/redact/go/pkg/mod/k8s.io/apimachinery@v0.30.3/pkg/util/wait/backoff.go:227
controller: k8s.io/apimachinery/pkg/util/wait.JitterUntil@/redact/go/pkg/mod/k8s.io/apimachinery@v0.30.3/pkg/util/wait/backoff.go:204
controller: Error: ERROR: unsupported Unicode escape sequence (SQLSTATE 22P05)
controller: Time taken: 0.01704s
controller: Context: context.Background
as you can see the \x00 from the workflow is being converted to \u0000 in the json. probably by https://github.com/argoproj/argo-workflows/blob/217b598684c6d0cb9384e8c649f8e73659c5f9e5/persist/sqldb/workflow_archive.go#L102
so i think few things would need to happen:
Pre-requisites
:latest
image tag (i.e.quay.io/argoproj/workflow-controller:latest
) and can confirm the issue still exists on:latest
. If not, I have explained why, in detail, in my description below.What happened? What did you expect to happen?
After upgrading from v3.5.6 to v3.5.11, we see the following error in the Argo UI whenever trying to list
Succeeded
orFailed
workflows:argo-server logs show the following:
Note also the following:
Our DB for archived workflows is Postgres 9.6.
We have the following defined in
workflow-controller-configmap
Version(s)
v3.5.11
Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflows that uses private images.
This is in our production environment immediately after updating to 3.5.11
Logs from the workflow controller
Logs from in your workflow's wait container