dhiaayachi / temporal

Temporal service
https://docs.temporal.io
MIT License

temporal-history: pq: unsupported jsonb version number 123 #388

Open dhiaayachi opened 2 months ago

dhiaayachi commented 2 months ago

Expected Behavior

Log behavior: the temporal-history visibility queue processor processes rows containing JSONB fields without errors. UI behavior: workflows that have actually completed are shown as Completed in both the workflow list UI and the workflow detail UI.

Actual Behavior

Logged behavior: the temporal-history visibility queue processor apparently fails to process rows containing JSONB fields and reports the error: pq: unsupported jsonb version number 123.

In my limited spot checking, Temporal appears otherwise functional. Hello world and more realistic workflows complete as expected.

Logging example:

{"level":"error","ts":"2023-04-18T21:24:12.940Z","msg":"Operation failed with an error.","error":"pq: unsupported jsonb version number 123","logging-call-at":"visiblity_manager_metrics.go:258","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error
    /home/runner/work/temporal/temporal/common/log/zap_logger.go:150
go.temporal.io/server/common/persistence/visibility.(*visibilityManagerMetrics).updateErrorMetric
    /home/runner/work/temporal/temporal/common/persistence/visibility/visiblity_manager_metrics.go:258
go.temporal.io/server/common/persistence/visibility.(*visibilityManagerMetrics).RecordWorkflowExecutionClosed
    /home/runner/work/temporal/temporal/common/persistence/visibility/visiblity_manager_metrics.go:102
go.temporal.io/server/service/history.(*visibilityQueueTaskExecutor).recordCloseExecution
    /home/runner/work/temporal/temporal/service/history/visibilityQueueTaskExecutor.go:443
go.temporal.io/server/service/history.(*visibilityQueueTaskExecutor).processCloseExecution
    /home/runner/work/temporal/temporal/service/history/visibilityQueueTaskExecutor.go:392
go.temporal.io/server/service/history.(*visibilityQueueTaskExecutor).Execute
    /home/runner/work/temporal/temporal/service/history/visibilityQueueTaskExecutor.go:120
go.temporal.io/server/service/history/queues.(*executableImpl).Execute
    /home/runner/work/temporal/temporal/service/history/queues/executable.go:203
go.temporal.io/server/common/tasks.(*FIFOScheduler[...]).executeTask.func1
    /home/runner/work/temporal/temporal/common/tasks/fifo_scheduler.go:231
go.temporal.io/server/common/backoff.ThrottleRetry.func1
    /home/runner/work/temporal/temporal/common/backoff/retry.go:175
go.temporal.io/server/common/backoff.ThrottleRetryContext
    /home/runner/work/temporal/temporal/common/backoff/retry.go:199
go.temporal.io/server/common/backoff.ThrottleRetry
    /home/runner/work/temporal/temporal/common/backoff/retry.go:176
go.temporal.io/server/common/tasks.(*FIFOScheduler[...]).executeTask
    /home/runner/work/temporal/temporal/common/tasks/fifo_scheduler.go:241
go.temporal.io/server/common/tasks.(*FIFOScheduler[...]).processTask
    /home/runner/work/temporal/temporal/common/tasks/fifo_scheduler.go:217"}

UI behavior: workflows that have actually completed are shown as Running in the workflow list UI but as Completed in the workflow detail UI. I do not believe the root cause is related to https://github.com/temporalio/temporal/issues/888

Steps to Reproduce the Problem

  1. Run Temporal (see platform notes)
  2. Run the hello world workflow
  3. Load the workflow list UI
  4. Compare it to the workflow detail UI

Specifications

TL;DR

The combination of Temporal 1.20.0+, visibility schema version 1.2+, PgBouncer running in transaction pooling mode, and Temporal configured to use PostgreSQL with the binary_parameters flag set apparently causes temporal-history to fail on rows with JSONB fields inside the visibility queue processor. As a result, the workflow list in the web UI is not updated with the latest execution status.
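
One plausible reading of the number in the error message, offered as a sketch rather than a confirmed diagnosis: PostgreSQL expects a jsonb value sent in binary format to start with a version byte of 1, while plain JSON text for an object starts with `{`, which is byte 123. If the driver sends the raw JSON bytes in binary format (an assumption about what happens when binary_parameters is set), the server would read 123 as the version. The helper names and the sample payload below are hypothetical and are not taken from Temporal or pq.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonbBinaryVersion is the version byte PostgreSQL expects at the start of a
// jsonb value that arrives in binary format.
const jsonbBinaryVersion = 1

// firstByte returns the first byte of payload. For a payload sent in binary
// format without a prefix, this is the byte the server would interpret as the
// jsonb version number. The second result is false for an empty payload.
func firstByte(payload []byte) (byte, bool) {
	if len(payload) == 0 {
		return 0, false
	}
	return payload[0], true
}

// withJSONBVersion returns a copy of payload prefixed with the jsonb version
// byte, which is the shape of a jsonb value in binary format.
func withJSONBVersion(payload []byte) []byte {
	out := make([]byte, 0, len(payload)+1)
	out = append(out, jsonbBinaryVersion)
	return append(out, payload...)
}

func main() {
	// Hypothetical search attributes; the key and value are illustrative only.
	payload, err := json.Marshal(map[string]string{"CustomKeywordField": "demo"})
	if err != nil {
		panic(err)
	}

	raw, _ := firstByte(payload)
	fmt.Printf("first byte of raw JSON text: %d (%q)\n", raw, raw)

	prefixed, _ := firstByte(withJSONBVersion(payload))
	fmt.Printf("first byte with version prefix: %d\n", prefixed)
}
```

If this reading is right, the error points at how the JSONB parameter is encoded on the wire rather than at the contents of the row itself.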

Context

dhiaayachi commented 1 month ago

Thank you for reporting this issue. It appears that you're running into a compatibility issue between Temporal, PgBouncer, and the pq driver when using JSONB fields in your visibility store.

We are aware of this issue and are currently investigating it. In the meantime, here are a few possible workarounds that you can try:

  1. Use a different database: Consider migrating your visibility store away from PostgreSQL to another database that Temporal supports for visibility, such as MySQL or SQLite. This also removes PgBouncer from the visibility path.
  2. Downgrade Temporal: You can try downgrading your Temporal Server version to one that is known to work with PgBouncer and JSONB fields. However, this may not be the ideal long-term solution, as it may introduce compatibility issues with other parts of your application.
  3. Update PgBouncer: Ensure you're running the latest version of PgBouncer, as there may be compatibility fixes released that address this issue.
  4. Use a different driver: While we recommend the pq driver, consider exploring alternative PostgreSQL drivers for your application, such as pgx. This might provide a workaround or offer better compatibility.

These workarounds may not be suitable for every situation. If you continue to experience issues, please provide more information about your configuration and the specific error messages you receive. We're ready to help you investigate and resolve this problem.