Closed pdecat closed 3 months ago
Submitted a reproduction case: https://github.com/pdecat/steampipe-4233-repro
hi @pdecat sorry for the long delay on this one. I actually spent a bunch of time digging into this one to try and identify the underlying reasons, but I failed/ran out of time and had to move on to other stuff.
Came back to this today and I think it's a fine tactical fix for now. I'd still like to get to the bottom of it but I'll merge in the meantime.
Hi @kaidaguerre, no problem for the delay, been using my own fork until now :)
Thanks! :tada:
The issue described in the PR description reproduces with HEAD of v0.23.x branch with FDW 1.11.3:
2024-07-12 15:04:34.927 UTC [INFO] hub: goFdwReScanForeignScan, connection '', table 'aws_account'
2024-07-12 15:04:34.927 UTC [INFO] hub: goFdwBeginForeignScan, connection 'aws_000000000000', table 'aws_account', explain: false
2024-07-12 15:04:34.927 UTC [INFO] hub: goFdwBeginForeignScan, canPushdownAllSortFields true
2024-07-12 15:04:34.927 UTC [INFO] hub: goFdwBeginForeignScan, table 'aws_account', sortOrder: []
2024-07-12 15:04:34.927 UTC [INFO] hub: RemoteHub startScanForConnection 'aws_000000000000' limit -1
2024-07-12 15:04:34.947 UTC [WARN] hub: pluginHub.GetIterator FAILED: failed to attach to plugin process for 'aws': no connection config loaded for 'aws_111111111111'
2024-07-12 15:04:34.947 UTC [INFO] hub: buildConnectionLimitMap, table: 'aws_account', 1 connection, limit: -1
2024-07-12 15:04:34.947 UTC [INFO] hub: connection 'aws_222222222222', table 'aws_account', quals
----------------------------------------------------------------
Column: account_id, Operator: '=', Value: '222222222222'
----------------------------------------------------------------
2024-07-12 15:04:34.947 UTC [300] ERROR: failed to attach to plugin process for 'aws': no connection config loaded for 'aws_111111111111'
2024-07-12 15:04:34.947 UTC [300] STATEMENT: SELECT 'APIGATEWAY' AS subtype, row_to_json(r) AS data FROM (SELECT
[...SNIP...]
2024-07-12 15:04:34.948 UTC [INFO] hub: goFdwIterateForeignScan calling pluginHub.StartScan, table 'aws_account' Current timestamp: 774111873000000 (0xc000d0fc00)
2024-07-12 15:04:34.948 UTC [INFO] hub: hubBase StartScan
2024-07-12 15:04:34.948 UTC [INFO] hub: build executeConnectionData map: hub: 0xc0005da9b0, i.connectionLimitMap: map[aws_222222222222:-1]
2024-07-12 15:04:34.948 UTC [INFO] hub: build executeConnectionData map returning table:"aws_account" query_context:{columns:"account_id" quals:{key:"account_id" value:{quals:{fieldName:"account_id" string_value:"=" value:{string_value:"222222222222"}}}}} connection:"aws_222222222222" call_id:"1720796674139" trace_context:{value:"{}"} executeConnectionData:{key:"aws_222222222222" value:{cache_enabled:true cache_ttl:300}}
2024-07-12 15:04:34.948 UTC [INFO] hub: StartScan for table: aws_account, cache enabled: false, iterator 0xc000d0fc00, 1 quals (1720796674139)
2024-07-12 15:04:30.613 UTC [INFO] hub: goFdwAbortCallback
2024-07-12 15:04:34.948 UTC [INFO] hub: AddScanMetadata for iterator 0xc000a08700 query timestamp 774111852000000 (aws_000000000000)
2024-07-12 15:04:34.948 UTC [INFO] hub: removeStaleScanMetadata for current query timestamp 774111852000000
[...SNIP...]
2024-07-12 15:04:38.418 UTC [INFO] hub: .
******************************************************
steampipe postgres fdw shutdown
******************************************************
Using strings on the official v1.11.3 build of FDW, I can verify that my changes are not included as this first command should have returned something:
# strings .steampipe/db/14.2.0/postgres/lib/postgresql/steampipe_postgres_fdw.so | grep -F "[WARN] no connection config loaded for '"
If I remove the [WARN]
prefix, I see it is the old version:
# strings .steampipe/db/14.2.0/postgres/lib/postgresql/steampipe_postgres_fdw.so | grep -F "no connection config loaded for '"
accessed data from freed user arena runtime: wrong goroutine in newstackruntime: invalid pc-encoded table f=trace: non-empty global trace buffer[TRACE] plugin client created for %sno connection config loaded for '%s'positional arg index %d out of range444089209850062616169452667236328125ryuFtoaFixed64 called with prec > 180123456789abcdefghijklmnopqrstuvwxyzmethod ABI and value ABI don't alignreflect.Value.Equal: values of type Channel switches to new LB policy %qSubchannel Connectivity change to %vnil receiver passed to UnmarshalJSONgoogle/protobuf/source_context.protocannot parse '%s', %d overflows uintcannot parse '%s', %f overflows uintcompressed name in SRV resource datareflect: NumField of non-struct typehttp: no Location header in responsehttp: unexpected EOF reading trailerhttp: invalid byte %q in Cookie.Path LastStreamID=%v ErrCode=%v Debug=%qRoundTrip retrying after failure: %vno acceptable authentication methodsinvalid semicolon separator in queryA static map expression is required.can't setValue for a nil EvalContextInvalid template interpolation valueTemplateJoinExpr got non-tuple tupleMissing close bracket on splat indexcty.Capsule(%q, reflect.TypeOf(%#v))%d elements are required, but got %dunsupported value type %#v in Equalselement key for tuple must be numberwrong number of elements %d; need %d6ba7b810-9dad-11d1-80b4-00c04fd430c86ba7b811-9dad-11d1-80b4-00c04fd430c86ba7b812-9dad-11d1-80b4-00c04fd430c86ba7b814-9dad-11d1-80b4-00c04fd430c8%v: MessageSet with no unknown fieldinvalid gRPC request content-type %qpeer header list size exceeded limittransport: malformed grpc-status: %vtransport: malformed http-status: %vfailed to write the HTTP request: %vcannot merge into invalid %v messageduration (%v) has out-of-range nanosinvalid pattern syntax (+ after -): expected an ECDSA public key, got %TTLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHATLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHAtls: keys must have at least one keyunsupported SSLv2 handshake receivedtls: server did not send a key sharex509: zero or negative DSA parameterx509: invalid CRL distribution pointx509: invalid subject key identifierx509: malformed algorithm identifierx509: invalid X25519 private key: %vx509: serial number must be positive Owner: %d [%s] (current: %d [%s])
This PR resolves issues when connection configurations are created dynamically concurrently to executing queries.
Symptoms are that an unrelated connection that hasn't completed loading breaks other connections:
This problem seems to happen more frequently when connections are slower to load, most notably when limiters are configured.
See https://turbot-community.slack.com/archives/C01UECB59A7/p1712324097931309
For this change to be effective, the FDW has to be rebuilt with: