getsentry / sentry-python

The official Python SDK for Sentry.io
https://sentry.io/for/python/
MIT License

Cannot use AWS X-Ray with lambda + flask integrations #476

Closed koleror closed 1 year ago

koleror commented 5 years ago

I'm trying to migrate from raven to sentry-python, with a stack consisting of a Zappa project that deploys a Lambda running Flask. Here is the configuration I used with raven:

import logging

from raven import Client
from raven.contrib.flask import Sentry
from raven.transport.requests import RequestsHTTPTransport

Sentry(
    application,
    logging=True,
    level=logging.ERROR,
    client=Client(<SENTRY_DSN>, transport=RequestsHTTPTransport),
)

Note that I had to use the synchronous RequestsHTTPTransport so that the Sentry API calls complete before the Lambda shuts down.

With the new sentry-python, I used the following configuration:

import sentry_sdk
from sentry_sdk.integrations.flask import FlaskIntegration
from sentry_sdk.integrations.aws_lambda import AwsLambdaIntegration

sentry_sdk.init(
    dsn=<SENTRY_DSN>,
    integrations=[FlaskIntegration(), AwsLambdaIntegration()],
    debug=True
)

Note that I also tried to reverse the integrations list, putting AwsLambdaIntegration first.

This works fine as long as the AWS X-Ray Flask middleware is turned off.

Here is my X-Ray configuration:

import boto3
import botocore
import requests
from aws_xray_sdk.core import xray_recorder, patch_all
from aws_xray_sdk.core.context import Context
from aws_xray_sdk.ext.flask.middleware import XRayMiddleware

patch_all()
xray_recorder.configure(
    service="my service",
    sampling=False,
    context=Context(),
    context_missing="LOG_ERROR",
)
XRayMiddleware(application, xray_recorder)  # application is my Flask application

Once X-Ray is turned on, I get this error every time Sentry tries to log an error:

aws_xray_sdk.core.exceptions.exceptions.SegmentNotFoundException: cannot find the current segment/subsegment, please make sure you have a segment open

From what I understood after debugging, the problem is that the new sentry-python library sends events asynchronously, even with the AWS Lambda integration (which just waits for the threads to end). As a result, the segment is closed at the end of the Flask request rather than after the Sentry requests have finished.

I believe the simplest fix would be an option to disable threading entirely, as in raven, instead of just waiting for the threads to terminate.

Do you have any idea how to fix this?

untitaker commented 5 years ago

It seems that X-Ray tries to capture information about the httplib invocation in our transport thread.

Could you try to add this line of code?

@app.after_request
def flush_sentry_events(response):
    sentry_sdk.Hub.current.client.flush()
    # Flask requires after_request functions to return the response.
    return response

Not sure if it needs to be before or after the xray middleware.
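
To illustrate what flushing is meant to achieve here, the following is a toy model using only the standard library. It is not sentry_sdk code: the ToyBackgroundTransport class and handle_request function are hypothetical names made up for this sketch. It assumes events are queued and sent by a background thread, and shows that without a flush the request can return before anything has been sent.

```python
import queue
import threading
import time


class ToyBackgroundTransport:
    """Toy stand-in for a threaded event transport (not sentry_sdk code)."""

    def __init__(self):
        self._queue = queue.Queue()
        self.sent = []
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            event = self._queue.get()
            time.sleep(0.05)  # pretend to do a slow HTTP request
            self.sent.append(event)
            self._queue.task_done()

    def capture(self, event):
        # Returns immediately; the worker thread sends the event later.
        self._queue.put(event)

    def flush(self):
        # Block until every queued event has been processed.
        self._queue.join()


def handle_request(transport, flush_before_return):
    transport.capture("error in request")
    if flush_before_return:
        transport.flush()
    # Number of events delivered by the time the response is returned.
    return len(transport.sent)
```

With flush_before_return=True the event has been delivered by the time handle_request returns. With False, delivery happens some time afterwards, outside the request.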

ezet commented 4 years ago

Any updates? I am getting the same issue running sentry and x-ray on Django.

Traceback:

File "/usr/local/lib/python3.7/site-packages/django/core/handlers/exception.py" in inner
  34.             response = get_response(request)

File "/usr/local/lib/python3.7/site-packages/sentry_sdk/integrations/django/middleware.py" in __call__
  131.             return f(*args, **kwargs)

File "/usr/local/lib/python3.7/site-packages/sentry_sdk/integrations/django/middleware.py" in sentry_wrapped_method
  92.                     return old_method(*args, **kwargs)

File "/usr/local/lib/python3.7/site-packages/aws_xray_sdk/ext/django/middleware.py" in __call__
  61.                 sampling=sampling_decision,

File "/usr/local/lib/python3.7/site-packages/aws_xray_sdk/core/recorder.py" in begin_segment
  216.             raise SegmentNameMissingException("Segment name is required.")

untitaker commented 4 years ago

@ezet that doesn't seem to be the same exact error. Could you try the same workaround I posted above?

sstoops commented 3 years ago

Following up on how I resolved this. After trying for a couple months to get these two to play nice together using the above code snippets, I ended up realizing I could just silence that logger. Sort of a hack workaround, but it's working for me. It's been a couple months since I implemented this, so hopefully I'm not mis-remembering the simplicity.

First, ensure you've switched your X-Ray configuration over to just log the missing context error with:

context_missing="LOG_ERROR",

Now that we've got it just logging the error, silence that logger!

# Silence the "no segment" logs that the X-Ray SDK spews
"aws_xray_sdk.core": {
    "handlers": ["null"],
    "propagate": False,
},

I've been running this in production for a couple months now with no issue. It's not perfect, but it let me start using x-ray.

EDIT: Combine this with my followup comment below.
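
For context, the logger fragment above could sit inside a complete logging configuration roughly as follows. This is a minimal sketch using the standard library's logging.config.dictConfig, assuming the fragment belongs under the "loggers" key of a dictConfig-style dictionary (such as Django's LOGGING setting). The "console" handler and the root logger settings are assumptions added so the example runs on its own.

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Assumed for this sketch: ordinary output for everything else.
        "console": {"class": "logging.StreamHandler"},
        # NullHandler discards every record it receives.
        "null": {"class": "logging.NullHandler"},
    },
    "loggers": {
        # Silence the "no segment" logs that the X-Ray SDK emits.
        "aws_xray_sdk.core": {
            "handlers": ["null"],
            "propagate": False,
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}

logging.config.dictConfig(LOGGING)
```

Because "propagate" is False, records from child loggers such as aws_xray_sdk.core.context stop at aws_xray_sdk.core and never reach the root handlers. This only affects local log output.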

untitaker commented 3 years ago

Thanks, this is a useful follow-up. Is the issue that those errors are then also reported to Sentry, or also that they appear in logs? We can start to ignore those errors in Sentry, but perhaps that only resolves a part of the issue.

sstoops commented 3 years ago

Shoot, you got me remembering that there was another part to this. I think silencing it killed it in my local log output, but I also needed to instruct Sentry to ignore it. Here is my Sentry setup code:

import sentry_sdk
from sentry_sdk.integrations.celery import CeleryIntegration
from sentry_sdk.integrations.django import DjangoIntegration
from sentry_sdk.integrations.logging import ignore_logger
from sentry_sdk.integrations.redis import RedisIntegration

def init_sentry():
    sentry_sdk.init(
        dsn="<redacted>",
        environment="<redacted>",
        integrations=[CeleryIntegration(), DjangoIntegration(), RedisIntegration()],
        release="<redacted>",
        send_default_pii=True,
    )

    ignore_logger("django_thumbor")
    ignore_logger("aws_xray_sdk.core.context")

So I believe, if I'm remembering correctly, I resolved this with my above two comments. One to switch it to log the errors instead of raise them, then silence the logs in my own log output and finally have Sentry ignore the logger as well.

untitaker commented 3 years ago

Ok. This generally seems like a bug in AWS X-Ray. I would not expect X-Ray to raise an error when requests are made while the Lambda function is running, regardless of whether they happen within the right request context.

We could ignore AWS X-Ray errors in the SDK, but that would not suppress the log output, and it would raise further questions if that logger ever logged something we do want to report.

sstoops commented 3 years ago

Oh I'm sorry, I should also add, I am not running Flask in a lambda environment. I am running Django in an AWS ECS environment.

However, this issue sounded spot on to what I was seeing in my environment and I wanted to follow up in this thread.

Here's the full rundown of what I saw, which sounds pretty on par to what @koleror described above:

1) A Django request is received, processed, and a response is returned to the user as usual.
2) Since the AWS X-Ray implementation is done through a middleware, the segment is opened at the beginning of a request and closed at the end.
3) Sentry fires off a thread to send info back to the mothership. This thread executes OUTSIDE of the request/response lifecycle.
4) In the processing of the Sentry thread, any X-Ray code that fires (HTTP instrumentation, etc.) won't have a segment to report to and throws the missing context error.
5) Sentry starts reporting this missing context error and filling up the dashboard.
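
The steps above can be mimicked with a small standard-library simulation. This is a toy model, not the X-Ray SDK: the names current_segment, record_http_call and handle_request are hypothetical, and it assumes the open segment is tracked per thread, so a worker thread never sees the segment that the middleware opened in the request thread.

```python
import threading

# Toy "segment" storage: each thread sees only its own value (assumption).
_context = threading.local()


def current_segment():
    return getattr(_context, "segment", None)


def record_http_call(log):
    # Stand-in for instrumentation that needs an open segment.
    segment = current_segment()
    if segment is None:
        log.append("cannot find the current segment/subsegment")
    else:
        log.append("recorded in " + segment)


def handle_request():
    log = []
    _context.segment = "request-segment"  # middleware opens the segment
    record_http_call(log)                 # call made inside the request thread
    # Background thread, similar in spirit to an event transport worker.
    worker = threading.Thread(target=record_http_call, args=(log,))
    worker.start()
    worker.join()
    _context.segment = None               # middleware closes the segment
    return log
```

In this model the call made in the request thread is recorded, while the same call made from the worker thread hits the missing-segment branch, which corresponds to step 4.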

So silencing that logger and having Sentry ignore that error resolved all of the issues for me. I have not noticed any missing data or other issues.

I apologize for the confusion.

sstoops commented 3 years ago

Honestly, I think what could really resolve this issue is to add more options to the context_missing parameter. Currently, there's no easy way to completely ignore it. Perhaps in addition to LOG_ERROR, there could be options like LOG_WARNING, or IGNORE. But that's a fight for another team. =)

untitaker commented 3 years ago

Sorry what's the context_missing parameter?

sstoops commented 3 years ago

That is an AWS X-Ray parameter. You can see this used in @koleror's code above.

To summarize: by default, if X-Ray attempts to record anything without a segment being open, it raises an error. You can set this configuration value to "LOG_ERROR" to have it log an error instead. It's this logged error that is freaking Sentry out. There is currently no way in the X-Ray SDK to silence the error completely in cases where we know it will occur in a spawned thread.

sentrivana commented 1 year ago

Please reopen if this is still an issue.