aws / aws-sdk-go-v2

AWS SDK for the Go programming language.
https://aws.github.io/aws-sdk-go-v2/docs/
Apache License 2.0
2.68k stars 651 forks source link

Reproduction of #1825 (use of closed network connection) with trace #2737

Open chrisr-spectra opened 3 months ago

chrisr-spectra commented 3 months ago

Acknowledgements

Describe the bug

This is a reproduction of #1825 with the tracing suggested by @lucix-aws

2024-08-07T03:26:57.822Z !!! READ ON CLIENT-CLOSED CONN !!!
2024-08-07T03:26:57.822Z goroutine 191 [running]:
2024-08-07T03:26:57.822Z runtime/debug.Stack()
2024-08-07T03:26:57.822Z /usr/local/go/src/runtime/debug/stack.go:24 +0x5e
2024-08-07T03:26:57.822Z github.com/---OurCode---/dynclient.(*tracedConn).Close(0xc0008bbbc0)
2024-08-07T03:26:57.822Z /go/pkg/mod/github.com/---OurCode---/client.go:53 +0x18
2024-08-07T03:26:57.822Z crypto/tls.(*Conn).Close(0xc00027d180?)
2024-08-07T03:26:57.822Z /usr/local/go/src/crypto/tls/conn.go:1428 +0xc8
2024-08-07T03:26:57.822Z net/http.(*persistConn).closeLocked(0xc0009c6000, {0x1d177e0, 0x2707f80})
2024-08-07T03:26:57.822Z /usr/local/go/src/net/http/transport.go:2746 +0x13a
2024-08-07T03:26:57.822Z net/http.(*persistConn).close(0xc00016d900?, {0x1d177e0?, 0x2707f80?})
2024-08-07T03:26:57.822Z /usr/local/go/src/net/http/transport.go:2731 +0xa8
2024-08-07T03:26:57.822Z net/http.(*persistConn).writeLoop(0xc0009c6000)
2024-08-07T03:26:57.822Z /usr/local/go/src/net/http/transport.go:2447 +0x29b
2024-08-07T03:26:57.822Z created by net/http.(*Transport).dialConn in goroutine 198
2024-08-07T03:26:57.822Z /usr/local/go/src/net/http/transport.go:1777 +0x16f1
2024-08-07T03:26:57.822Z !!! -(end trace)-------------- !!!

Expected Behavior

dynamodb Client.Query to return without error

Current Behavior

dynamodb Client.Query returns error:

operation error DynamoDB: Query, https response error StatusCode: 200, RequestID: GUR48MP1C12O956O84J2H67F7NVV4KQNSO5AEMVJF66Q9ASUAAJG, deserialization failed, failed to decode response body, read tcp 169.254.76.1:56796-\u003e52.94.0.72:443: use of closed network connection"}

Reproduction Steps

result, err := db.Query(
        ctx,
        &dynamodb.QueryInput{
                TableName:                 aws.String(name),
                ConsistentRead:            aws.Bool(true),
                Limit:                     aws.Int32(count),
                KeyConditionExpression:    aws.String(keyConditionExpression),
                ExpressionAttributeValues: expressionAttributeValues,
                ExclusiveStartKey:         exclusiveStartKey,
        },
)

Possible Solution

No response

Additional Information/Context

No response

AWS Go SDK V2 Module Versions Used

    github.com/aws/aws-lambda-go v1.47.0
    github.com/aws/aws-sdk-go-v2 v1.27.1
    github.com/aws/aws-sdk-go-v2/service/s3 v1.54.4
    github.com/aws/aws-sdk-go-v2/service/sns v1.29.9
    github.com/aws/smithy-go v1.20.2
    github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.6.2 // indirect
    github.com/aws/aws-sdk-go-v2/config v1.27.17 // indirect
    github.com/aws/aws-sdk-go-v2/credentials v1.17.17 // indirect
    github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.16.4 // indirect
    github.com/aws/aws-sdk-go-v2/internal/configsources v1.3.8 // indirect
    github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.6.8 // indirect
    github.com/aws/aws-sdk-go-v2/internal/ini v1.8.0 // indirect
    github.com/aws/aws-sdk-go-v2/internal/v4a v1.3.8 // indirect
    github.com/aws/aws-sdk-go-v2/service/dynamodb v1.32.7 // indirect
    github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.11.2 // indirect
    github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.3.10 // indirect
    github.com/aws/aws-sdk-go-v2/service/internal/endpoint-discovery v1.9.9 // indirect
    github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.11.10 // indirect
    github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.17.8 // indirect
    github.com/aws/aws-sdk-go-v2/service/kms v1.32.2 // indirect
    github.com/aws/aws-sdk-go-v2/service/sqs v1.32.4 // indirect
    github.com/aws/aws-sdk-go-v2/service/sso v1.20.10 // indirect
    github.com/aws/aws-sdk-go-v2/service/ssooidc v1.24.4 // indirect
    github.com/aws/aws-sdk-go-v2/service/sts v1.28.11 // indirect

Compiler and Version used

go version go1.21.3 linux/amd64

Operating System and version

lambda runtime provided.al2

zapisanchez commented 3 months ago

same here

goroutine 893760 [running]: runtime/debug.Stack() /usr/local/go/src/runtime/debug/stack.go:24 +0x5e parser/cmd/parser/storage.(tracedConn).Close(0xc00070e840) /go/src/cmd/parser/storage/traceconn.go:40 +0x18 crypto/tls.(Conn).Close(0xc0000e8dd0?) /usr/local/go/src/crypto/tls/conn.go:1429 +0xc8 net/http.(persistConn).closeLocked(0xc000745200, {0xce89a0, 0x1101eb0}) /usr/local/go/src/net/http/transport.go:2783 +0x13a net/http.(persistConn).close(0xc0002e08c0?, {0xce89a0?, 0x1101eb0?}) /usr/local/go/src/net/http/transport.go:2768 +0xa8 net/http.(*persistConn).closeConnIfStillIdle(0xc000745200) /usr/local/go/src/net/http/transport.go:2040 +0xbe created by time.goFunc /usr/local/go/src/time/sleep.go:177 +0x2d

chrisr-spectra commented 3 months ago

For what it's worth, I actually get two different stack traces when I run with the tracing code. The one I provided in the description matches the time I saw the closed network connection error, and happens very infrequently. Hover, I frequently get the trace:

2024-08-07T03:26:22.589Z !!! READ ON CLIENT-CLOSED CONN !!!
2024-08-07T03:26:22.589Z goroutine 181 [running]:
2024-08-07T03:26:22.589Z runtime/debug.Stack()
2024-08-07T03:26:22.589Z /usr/local/go/src/runtime/debug/stack.go:24 +0x5e
2024-08-07T03:26:22.589Z github.com/---OurCode---/dynclient.(*tracedConn).Close(0xc00021a040)
2024-08-07T03:26:22.589Z /go/pkg/mod/github.com/---OurCode---/client.go:53 +0x18
2024-08-07T03:26:22.589Z crypto/tls.(*Conn).Close(0xc00027d180?)
2024-08-07T03:26:22.589Z /usr/local/go/src/crypto/tls/conn.go:1428 +0xc8
2024-08-07T03:26:22.589Z net/http.(*persistConn).closeLocked(0xc000672240, {0x1d177e0, 0x2708ba0})
2024-08-07T03:26:22.589Z /usr/local/go/src/net/http/transport.go:2746 +0x13a
2024-08-07T03:26:22.589Z net/http.(*persistConn).close(0xc00027d180?, {0x1d177e0?, 0x2708ba0?})
2024-08-07T03:26:22.589Z /usr/local/go/src/net/http/transport.go:2731 +0xa8
2024-08-07T03:26:22.589Z net/http.(*persistConn).closeConnIfStillIdle(0xc000672240)
2024-08-07T03:26:22.589Z /usr/local/go/src/net/http/transport.go:2017 +0xbe
2024-08-07T03:26:22.589Z created by time.goFunc
2024-08-07T03:26:22.589Z /usr/local/go/src/time/sleep.go:176 +0x2d
2024-08-07T03:26:22.589Z !!! -(end trace)-------------- !!!

but it doesn't seem to correspond to an error in my client code.

lucix-aws commented 3 months ago

TYVM for taking the time to repro. I'll try to deep dive this week or early next.

lucix-aws commented 3 months ago

These traces certainly appear to demonstrate that it's net/http that's ultimately closing the conns out from under us. Whether we indirectly caused said close remains to be seen.