temporalio / temporal

Temporal service
https://docs.temporal.io
MIT License

Unable to update Elasticsearch mapping: elastic: Error 403 on Create Custom Attributes #4773

Closed DodgeCamaro closed 1 year ago

DodgeCamaro commented 1 year ago

Expected Behavior

Created Custom Attribute

Actual Behavior

Error: unable to add search attributes: System Workflow with WorkflowId temporal-sys-add-search-attributes-workflow and RunId 10fdc88f-8ef7-4936-b06d-f309a836125d returned an error: workflow execution error (type: temporal-sys-add-search-attributes-workflow, workflowID: temporal-sys-add-search-attributes-workflow, runID: 10fdc88f-8ef7-4936-b06d-f309a836125d): unable to execute activity: AddESMappingFieldActivity: activity error (type: AddESMappingFieldActivity, scheduledEventID: 5, startedEventID: 6, identity: 1@temporal-worker-7cfc5bf56d-ll9lj@): unable to update Elasticsearch mapping: elastic: Error 403 (Forbidden): no permissions for [] and User [name=temporal, backend_roles=[], requestedTenant=null] [type=security_exception] (type: wrapError, retryable: true): unable to execute activity

Steps to Reproduce the Problem

The same error occurs with both of the commands below:

  1. temporal operator search-attribute create --name CustomTest --type Keyword
  2. tctl admin cl asa --name CustomTest --type Keyword

Specifications

  • Version: 1.20.4
  • Platform: Temporal Self-Hosted, Advanced Visibility, AWS OpenSearch 2.5 + AWS RDS Postgres 13.8

rodrigozhou commented 1 year ago

@DodgeCamaro Can you check if the Elasticsearch user temporal has the right permissions to update the mapping? Please take a look at this page.
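
One way to check this outside Temporal is to send a mapping update directly with the same credentials. The following is only a minimal sketch under stated assumptions, not Temporal's own code: it assumes HTTP basic auth, uses the index name `temporal_visibility` and host `opensearch` from the config posted in this thread, and the helper names `build_put_mapping_request` and `send` are hypothetical. `PUT /<index>/_mapping` is the standard Elasticsearch/OpenSearch mapping endpoint. Be aware that actually sending the request adds the field to the index.

```python
import base64
import json
import urllib.request


def build_put_mapping_request(base_url, index, field, es_type, username, password):
    """Build (but do not send) a PUT <index>/_mapping request that adds one field."""
    body = json.dumps({"properties": {field: {"type": es_type}}}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_mapping",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def send(request):
    """Send the request; urlopen raises urllib.error.HTTPError on a 403 response."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


# Placeholder endpoint and credentials (assumptions); replace with your own.
request = build_put_mapping_request(
    "https://opensearch", "temporal_visibility", "CustomTest", "keyword",
    "temporal", "password",
)
print(request.get_method(), request.full_url)
```

Calling `send(request)` against the real cluster should fail with an HTTP 403 error if the `temporal` user lacks permission to update the mapping, independently of any Temporal configuration.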

DodgeCamaro commented 1 year ago

@rodrigozhou Hi. I've used the OpenSearch master user, which has full access to the cluster (screenshot attached).

rodrigozhou commented 1 year ago

Can you share your visibility config?

DodgeCamaro commented 1 year ago

@rodrigozhou Hello, sure!

global:
    membership:
        maxJoinDuration: 30s
        broadcastAddress: '10.0.3.79'
    pprof:
        port: 0
    tls:
        internode:
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
            server:
                certFile: ""
                keyFile: ""
                clientCaFiles: []
                certData: ""
                keyData: ""
                clientCaData: []
                requireClientAuth: false
            hostOverrides: {}
        frontend:
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
            server:
                certFile: ""
                keyFile: ""
                clientCaFiles: []
                certData: ""
                keyData: ""
                clientCaData: []
                requireClientAuth: false
            hostOverrides: {}
        systemWorker:
            certFile: ""
            keyFile: ""
            certData: ""
            keyData: ""
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
        remoteClusters: {}
        expirationChecks:
            warningWindow: 0s
            errorWindow: 0s
            checkInterval: 0s
        refreshInterval: 0s
    metrics:
        tags:
            type: 'frontend'
        excludeTags: {}
        prefix: ""
        perUnitHistogramBoundaries: {}
        m3: null
        statsd: null
        prometheus:
            framework: ""
            listenAddress: 0.0.0.0:9090
            handlerPath: ""
            listenNetwork: ""
            timerType: histogram
            defaultHistogramBoundaries: []
            defaultHistogramBuckets: []
            defaultSummaryObjectives: []
            onError: ""
            sanitizeOptions: null
    authorization:
        jwtKeyProvider:
            keySourceURIs: []
            refreshInterval: 0s
        permissionsClaimName: ""
        authorizer: ""
        claimMapper: ""
persistence:
    defaultStore: default
    visibilityStore: visibility
    secondaryVisibilityStore: ""
    advancedVisibilityStore: advancedVisibility
    numHistoryShards: 4
    datastores:
        advancedVisibility:
            faultInjection: null
            cassandra: null
            sql: null
            customDatastore: null
            elasticsearch:
                version: v7
                url:
                    scheme: https
                    opaque: ""
                    user: null
                    host: opensearch
                    path: ""
                    rawpath: ""
                    omithost: false
                    forcequery: false
                    rawquery: ""
                    fragment: ""
                    rawfragment: ""
                username: temporal
                password: 'password'
                indices:
                    visibility: temporal_visibility
                logLevel: ""
                aws-request-signing:
                    enabled: false
                    region: ""
                    credentialProvider: ""
                    static:
                        accessKeyID: ""
                        secretAccessKey: ""
                        token: ""
                closeIdleConnectionsInterval: 0s
                enableSniff: false
                enableHealthcheck: false
        default:
            faultInjection: null
            cassandra: null
            sql:
                user: temporal
                password: 'password'
                pluginName: postgres
                databaseName: temporal
                connectAddr: temporal:5432
                connectProtocol: tcp
                connectAttributes: {}
                maxConns: 0
                maxIdleConns: 0
                maxConnLifetime: 0s
                taskScanPartitions: 0
                tls: null
            customDatastore: null
            elasticsearch: null
        visibility:
            faultInjection: null
            cassandra: null
            sql:
                user: temporal
                password: 'password'
                pluginName: postgres
                databaseName: temporal_visibility
                connectAddr: temporal:5432
                connectProtocol: tcp
                connectAttributes: {}
                maxConns: 0
                maxIdleConns: 0
                maxConnLifetime: 0s
                taskScanPartitions: 0
                tls: null
            customDatastore: null
            elasticsearch: null
log:
    stdout: true
    level: info
    outputFile: ""
    format: ""
    development: false
clusterMetadata:
    enableGlobalNamespace: false
    failoverVersionIncrement: 10
    masterClusterName: temporal
    currentClusterName: temporal
    clusterInformation:
        temporal:
            enabled: true
            initialFailoverVersion: 1
            rpcAddress: 127.0.0.1:7233
dcRedirectionPolicy:
    policy: ""
services:
    frontend:
        rpc:
            grpcPort: 7233
            membershipPort: 6933
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
    history:
        rpc:
            grpcPort: 7234
            membershipPort: 6934
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
    matching:
        rpc:
            grpcPort: 7235
            membershipPort: 6935
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
    worker:
        rpc:
            grpcPort: 7239
            membershipPort: 6939
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
archival:
    history:
        state: enabled
        enableRead: true
        provider:
            filestore: null
            gstorage: null
            s3store:
                region: us-east-2
                endpoint: null
                s3ForcePathStyle: false
    visibility:
        state: enabled
        enableRead: true
        provider:
            filestore: null
            s3store:
                region: us-east-2
                endpoint: null
                s3ForcePathStyle: false
            gstorage: null
publicClient:
    hostPort: ""
    forceTLSConfig: ""
dynamicConfigClient:
    filepath: /etc/temporal/config/dynamic_config.yaml
    pollInterval: 10s
namespaceDefaults:
    archival:
        history:
            state: enabled
            URI: s3://<bucket_name>
        visibility:
            state: enabled
            URI: s3://<bucket_name>
otel: {}

rodrigozhou commented 1 year ago

@DodgeCamaro I can see you're using a dual visibility setting, i.e., you set up two visibility stores: one using Postgres and another using Elasticsearch. Is that on purpose?

Adding search attributes while in dual visibility mode is not supported (WIP).

DodgeCamaro commented 1 year ago

@rodrigozhou This doesn't work with AWS RDS + AWS OpenSearch. Below is the docker.yaml with Postgres and OpenSearch set up inside the cluster.

global:
    membership:
        maxJoinDuration: 30s
        broadcastAddress: '10.0.2.30'
    pprof:
        port: 0
    tls:
        internode:
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
            server:
                certFile: ""
                keyFile: ""
                clientCaFiles: []
                certData: ""
                keyData: ""
                clientCaData: []
                requireClientAuth: false
            hostOverrides: {}
        frontend:
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
            server:
                certFile: ""
                keyFile: ""
                clientCaFiles: []
                certData: ""
                keyData: ""
                clientCaData: []
                requireClientAuth: false
            hostOverrides: {}
        systemWorker:
            certFile: ""
            keyFile: ""
            certData: ""
            keyData: ""
            client:
                serverName: ""
                disableHostVerification: false
                rootCaFiles: []
                rootCaData: []
                forceTLS: false
        remoteClusters: {}
        expirationChecks:
            warningWindow: 0s
            errorWindow: 0s
            checkInterval: 0s
        refreshInterval: 0s
    metrics: null
    authorization:
        jwtKeyProvider:
            keySourceURIs: []
            refreshInterval: 0s
        permissionsClaimName: ""
        authorizer: ""
        claimMapper: ""
persistence:
    defaultStore: default
    visibilityStore: visibility
    secondaryVisibilityStore: ""
    advancedVisibilityStore: advancedVisibility
    numHistoryShards: 3
    datastores:
        advancedVisibility:
            faultInjection: null
            cassandra: null
            sql: null
            customDatastore: null
            elasticsearch:
                version: v7
                url:
                    scheme: http
                    opaque: ""
                    user: null
                    host: opensearch:9200
                    path: ""
                    rawpath: ""
                    omithost: false
                    forcequery: false
                    rawquery: ""
                    fragment: ""
                    rawfragment: ""
                username: temporal
                password: 'pass'
                indices:
                    visibility: temporal_visibility
                logLevel: ""
                aws-request-signing:
                    enabled: false
                    region: ""
                    credentialProvider: ""
                    static:
                        accessKeyID: ""
                        secretAccessKey: ""
                        token: ""
                closeIdleConnectionsInterval: 0s
                enableSniff: false
                enableHealthcheck: false
        default:
            faultInjection: null
            cassandra: null
            sql:
                user: temporal
                password: 'pass'
                pluginName: postgres
                databaseName: temporal
                connectAddr: postgres:5432
                connectProtocol: tcp
                connectAttributes: {}
                maxConns: 0
                maxIdleConns: 0
                maxConnLifetime: 0s
                taskScanPartitions: 0
                tls: null
            customDatastore: null
            elasticsearch: null
        visibility:
            faultInjection: null
            cassandra: null
            sql:
                user: temporal
                password: 'pass'
                pluginName: postgres
                databaseName: temporal_visibility
                connectAddr: postgres:5432
                connectProtocol: tcp
                connectAttributes: {}
                maxConns: 0
                maxIdleConns: 0
                maxConnLifetime: 0s
                taskScanPartitions: 0
                tls: null
            customDatastore: null
            elasticsearch: null
log:
    stdout: true
    level: info
    outputFile: ""
    format: ""
    development: false
clusterMetadata:
    enableGlobalNamespace: false
    failoverVersionIncrement: 10
    masterClusterName: temporal
    currentClusterName: temporal
    clusterInformation:
        temporal:
            enabled: true
            initialFailoverVersion: 1
            rpcAddress: 127.0.0.1:7233
dcRedirectionPolicy:
    policy: ""
services:
    frontend:
        rpc:
            grpcPort: 7233
            membershipPort: 6933
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
    history:
        rpc:
            grpcPort: 7234
            membershipPort: 6934
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
    matching:
        rpc:
            grpcPort: 7235
            membershipPort: 6935
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
    worker:
        rpc:
            grpcPort: 7239
            membershipPort: 6939
            bindOnLocalHost: false
            bindOnIP: 0.0.0.0
archival:
    history:
        state: enabled
        enableRead: true
        provider:
            filestore: null
            gstorage: null
            s3store:
                region: us-east-2
                endpoint: null
                s3ForcePathStyle: false
    visibility:
        state: enabled
        enableRead: true
        provider:
            filestore: null
            s3store:
                region: us-east-2
                endpoint: null
                s3ForcePathStyle: false
            gstorage: null
publicClient:
    hostPort: ""
    forceTLSConfig: ""
dynamicConfigClient:
    filepath: /etc/temporal/config/dynamic_config.yaml
    pollInterval: 10s
namespaceDefaults:
    archival:
        history:
            state: enabled
            URI: s3://bucket
        visibility:
            state: enabled
            URI: s3://bucket
otel: {}

rodrigozhou commented 1 year ago

@DodgeCamaro This is the important part:

persistence:
    ...
    visibilityStore: visibility
    secondaryVisibilityStore: ""
    advancedVisibilityStore: advancedVisibility
    ...

The key visibilityStore defines the visibility store, and it looks like it's pointing to a Postgres DB (the AWS RDS instance). You also set advancedVisibilityStore, which makes Elasticsearch (the AWS OpenSearch instance) a visibility store as well.

Since you set two visibility stores, you are in dual visibility mode which does not support adding search attributes well. This feature was initially designed to support migration of visibility stores (eg: Postgres to Elasticsearch).
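
The rule described here can be illustrated with a small check. This is only a sketch of the rule as stated in this comment, not the server's actual logic: the function names are hypothetical, the `persistence` dict stands in for the parsed `persistence:` section of the config, and an empty string is assumed to mean "not set".

```python
def configured_visibility_stores(persistence):
    """Return the visibility-related keys that name a datastore (non-empty value)."""
    keys = ("visibilityStore", "secondaryVisibilityStore", "advancedVisibilityStore")
    return [key for key in keys if persistence.get(key)]


def is_dual_visibility(persistence):
    """True when more than one visibility store is configured."""
    return len(configured_visibility_stores(persistence)) > 1


# Values copied from the config posted earlier in this thread.
persistence = {
    "defaultStore": "default",
    "visibilityStore": "visibility",
    "secondaryVisibilityStore": "",
    "advancedVisibilityStore": "advancedVisibility",
}
print(configured_visibility_stores(persistence))  # ['visibilityStore', 'advancedVisibilityStore']
print(is_dual_visibility(persistence))            # True
```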

DodgeCamaro commented 1 year ago

But how does it work then? When I deploy Temporal with Postgres and OpenSearch inside an AWS EKS cluster, I can add search attributes. When using AWS OpenSearch and AWS RDS Postgres, I can't add search attributes.

Visibility Store and Secondary Visibility Store returned the same error.

rodrigozhou commented 1 year ago

Are you using exactly the same config? For visibilityStore: visibility, the plugin name is postgres, which does not support search attributes, so you should not be able to add them in either case.

If you don't need two visibility stores, I'd suggest removing the Postgres one. Basically, remove the keys visibilityStore and secondaryVisibilityStore. Also, remove the datastores.visibility section.

persistence:
    ...
    advancedVisibilityStore: advancedVisibility
    ...
    datastores:
        advancedVisibility:
            ...
        default:
            ...
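
The same change can be expressed as a transformation on the parsed config. This is a hedged sketch, assuming the `persistence:` section has already been loaded into a Python dict; the function name `keep_only_advanced_visibility` is hypothetical and the sample datastore bodies are left empty for brevity.

```python
import copy


def keep_only_advanced_visibility(persistence):
    """Return a copy of `persistence` without the standard/secondary visibility stores."""
    result = copy.deepcopy(persistence)
    # Remove both keys and remember which datastores they pointed at.
    removed = [
        result.pop(key, "")
        for key in ("visibilityStore", "secondaryVisibilityStore")
    ]
    # Never drop a datastore that another key still references.
    still_used = {result.get("defaultStore"), result.get("advancedVisibilityStore")}
    datastores = result.get("datastores", {})
    for name in removed:
        if name and name not in still_used:
            datastores.pop(name, None)
    return result


before = {
    "defaultStore": "default",
    "visibilityStore": "visibility",
    "secondaryVisibilityStore": "",
    "advancedVisibilityStore": "advancedVisibility",
    "datastores": {"advancedVisibility": {}, "default": {}, "visibility": {}},
}
after = keep_only_advanced_visibility(before)
print(sorted(after["datastores"]))  # ['advancedVisibility', 'default']
```

Note that `advancedVisibilityStore` and its datastore are left untouched; only the Postgres visibility entries are dropped, and the input dict is not modified.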

DodgeCamaro commented 1 year ago

Could I migrate advancedVisibilityStore (OpenSearch) to visibilityStore (Postgres)? I have a lot of workflows in production that use both visibilityStore and advancedVisibilityStore.

rodrigozhou commented 1 year ago

You cannot. Based on your settings, your Postgres setup does not support custom search attributes, so if you are already using custom search attributes, they are not being written to Postgres. Also, based on those settings, I think you're writing to both Postgres and Elasticsearch, so if you just remove the visibilityStore config key as I suggested in my previous comment, you won't lose any data.

DodgeCamaro commented 1 year ago

After disabling advancedVisibilityStore, I lost all active workflows and schedules. They are stored in OpenSearch.

jingyi2318 commented 1 year ago

Hey! I was reading this issue since we ran into a similar error message that's very misleading. What worked for us was changing the config template to set only one visibility store, depending on which values were set, rather than completely disabling advancedVisibilityStore (example here, more detailed thread here). Not sure if y'all have tried this already, but after we made this change the error was fixed!

DodgeCamaro commented 1 year ago

@rodrigozhou This doc says to use AdvancedVisibility for creating and using Custom Search Attributes.

With this configuration, adding Custom Search Attributes works well: https://github.com/temporalio/docker-compose/blob/main/docker-compose-postgres-opensearch.yml

rodrigozhou commented 1 year ago

Sorry for the late reply.

@DodgeCamaro I didn't mean to say that you should remove or disable advancedVisibilityStore. As I shared above, I meant that you should keep only advancedVisibilityStore, and remove visibilityStore and secondaryVisibilityStore. I'd also suggest taking a look at the v1.21.0 release notes, which announced changes to the visibility config keys.