Open aidanmorgan opened 2 months ago
I have also tried changing the Lambda to use a function URL and invoking it directly, instead of using an async invocation from API Gateway, but I get the same error.
Deploying the same code without a security group for attaching an EFS share appears to work, so I suspect there is some missing documentation about which permissions or ports are required when configuring the security groups.
Adding additional rules to the security group for ports (4317, 4318) against the CIDR for my VPC still results in the timeout.
I have also attempted to set `OTEL_EXPORTER_OTLP_ENDPOINT` to point to `http://localhost:4318` (and `http://127.0.0.1:4318`) and updated the `collector.yml` file to bind to that address, but I am still receiving timeouts.
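For anyone trying to narrow this down, here is a minimal diagnostic sketch (not part of my deployed code) that checks from inside the handler whether a TCP connection can be opened to a given host and port. The helper names `can_connect` and `report` are hypothetical, and the targets listed are assumptions: the collector's OTLP ports on localhost, plus the regional X-Ray endpoint on 443 for us-east-1.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def report(targets=None) -> dict:
    """Probe each (host, port) pair and return a mapping of target to result."""
    if targets is None:
        # Assumed targets: local collector ports and the X-Ray endpoint.
        targets = [
            ("localhost", 4317),
            ("localhost", 4318),
            ("xray.us-east-1.amazonaws.com", 443),
        ]
    return {f"{host}:{port}": can_connect(host, port) for host, port in targets}
```

Calling `print(report())` at the start of the handler would show whether the timeout is between the function and the collector, or between the collector and X-Ray. A successful TCP connect only shows the port is reachable, not that the export itself succeeds.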
Spans sent from a Python 3.11 Lambda are slow to start (there is a significant pause after the collector starts) and then time out when being sent to X-Ray.
My Lambda has an attached EFS mount, runs in a VPC, and has an API Gateway fronting it. The role my Lambda executes under has the `AWSXrayWriteOnlyAccess` policy attached to it. I have followed the instructions for instrumentation from the documentation.
The layer I am using is:
arn:aws:lambda:us-east-1:901920570463:layer:aws-otel-python-amd64-ver-1-25-0:1
This issue occurs when using a custom collector file, but also with the default that is provided in the layer. Configuration is set correctly according to the documentation (`AWS_LAMBDA_EXEC_WRAPPER` and `OPENTELEMETRY_COLLECTOR_CONFIG_FILE` are set). Reading other forums, I have tried increasing the memory available to the Lambda, with no luck.
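To rule out a configuration slip, a small sketch like the following could be run inside the function to confirm that the two environment variables mentioned above are actually visible at runtime. The helper name `missing_env` is hypothetical, and it only checks that each variable is set to a non-empty value, not that the value is correct.

```python
import os

# Variables named in the layer documentation as required configuration.
REQUIRED_VARS = ("AWS_LAMBDA_EXEC_WRAPPER", "OPENTELEMETRY_COLLECTOR_CONFIG_FILE")


def missing_env(names=REQUIRED_VARS, env=None) -> list:
    """Return the names from `names` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]
```

An empty list from `missing_env()` means both variables are present in the execution environment.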
I am accessing the trace context for my code as a global, using:
My lambda does create some sub-spans for capturing specific hotspots I am interested in, using code similar to:
collector.yml (which is the same as the documentation):
The full exception trace from the logs is: