aws / aws-xray-sdk-python

AWS X-Ray SDK for the Python programming language
Apache License 2.0
327 stars 143 forks source link

Avoid infinite recursion in metadata_to_dict #303

Open huonw opened 3 years ago

huonw commented 3 years ago

Following from https://github.com/aws/aws-xray-sdk-python/issues/298:


It'd be also really good if this code [aws_xray_sdk.core.utils.conversion.metadata_to_dict] didn't overflow the stack on circular objects like this, because doing so is:

For the serialization of metadata customer added, it seems like there is no more intelligent way but do recursion

One option that's still recursion is to manually avoid re-serialising things. This is pretty similar to what currently happens, in that nested objects will be replaced by {}:

def metadata_to_dict(obj):
    def _recur(obj, seen_above):
        instance_id = id(metadata)
        if instance_id in seen_above:
            # already serialised this object, avoid recursion
            return {}
        seen_above.add(instance_id)

        try:
            # ... existing implementation, but with _recur(..., stack), not metadata_to_dict
        except:
            # ...
        finally:
            seen_above.remove(instance_id)

    return _recur(obj, [])

The original example would then become something like {"x": {}}, and and similarly for:

x = X()
y = Y()
z = Z()

x.foo_x = y
y.foo_y = z
z.foo_z = x

x.bar_x = z

metadata_to_dict(x) 
# something like (note: z is still serialised twice)
# {"foo_x": {"foo_y": {"foo_z": {}}}, "bar_x": {"foo_z": {}}}

Thanks for providing the idea. One possibly very common case I would concern is that, for metadata that is very big but has no mutual references at all, walking through the new recursion will not only require the same time complexity, but will also introduce new space complexity for maintaining a temp list as big as the metadata. I can see this is not a very soon (but definitely worthwhile) enhancement we would consider to update at this moment.

lupengamzn commented 3 years ago

Thanks, the other thing I can come up with is that, removing the duplicate reference in trace context may not be able to fully render the actual structure of the metadata. And in some cases where the mutual references are a lot and simply replacing them with empty dict may cause more confusion (cause people don't know where these objects actually point to) unless the developer has a very clear picture of the mutual reference for each metadata. But anyway, let's leave it open to see how many customers encounter this issue.

hayata-yamamoto commented 2 years ago

+1

slawrence93 commented 11 months ago

+1

bepetersn commented 8 months ago

+1

bjornhansen commented 4 months ago

+1

adamandrews commented 1 month ago

+1, encountering this with sqlalchemy engine.