datalust / serilog-sinks-seq

A Serilog sink that writes events to the Seq structured log server
https://datalust.co/seq
Apache License 2.0

Serializing entities from Entity Framework exhausts server resources #98

Closed - kinetiq closed this issue 6 years ago

kinetiq commented 6 years ago

Using Serilog.Sinks.Seq 3.4.0, a dev on my team crashed a web server with this line of code:

Log.Error(ex, "Failed to generate email confirmation token for this contact: {@contact}", contact);

Here, contact is an EF entity. I imagine Serilog traverses the navigation properties and tries to load the entire database, but ends up navigating back to the contact and then loops infinitely. Something like that - the IIS worker process climbed to about 8 GB of memory and 99% CPU utilization, then crashed and restarted, repeatedly, because code like this was in a job that auto-runs once per minute.
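For anyone landing here, a sketch of the safer pattern - assuming the entity exposes scalar properties like Id and Email (hypothetical names) - is to capture just the values you need instead of destructuring the whole entity:

```csharp
// No @ operator, so Serilog never walks the entity's navigation properties;
// contact.Id and contact.Email are assumed properties for illustration.
Log.Error(ex, "Failed to generate email confirmation token for contact {ContactId} ({ContactEmail})",
    contact.Id, contact.Email);
```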

This may be expected behavior, and it is clearly something you should never do; this issue may also belong in Serilog proper. However, because logging tools strive to avoid causing performance problems, I thought I would mention it.

Maybe there's a way to detect ORM entities and handle them differently (EF proxy wrappers, limiting the depth of traversal, etc.)? Or maybe kick off an email to The Daily WTF and let 'em burn? :)

I joke, but there was an actual economic cost to this happening. IMO, it would have been best to block it by limiting the depth of traversal, assuming that's even a real factor.

nblumhardt commented 6 years ago

Thanks for the note, Brian 👍

This one is part of serilog/serilog, in that the destructuring/capturing process happens before anything hits the Seq sink.

The problem here is likely to be a combination of the graph size and possibly lazy loading, if you have it enabled. Serilog does already have a maximum depth limit; it's set to 10 by default but you can use Destructure.ToMaximumDepth(n) to reduce it.

Even for depth 10, things can get big/slow if there are large collections (see the further options under Destructure) or if the links take a long time to traverse (e.g. EF lazy loading).
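For reference, a minimal sketch of those configuration knobs (the specific limit values here are arbitrary; ToMaximumDepth, ToMaximumCollectionCount, and ToMaximumStringLength all hang off LoggerConfiguration.Destructure):

```csharp
using Serilog;

var log = new LoggerConfiguration()
    // Cap how deep the destructurer walks into an object graph (default: 10).
    .Destructure.ToMaximumDepth(3)
    // Cap how many elements of any one collection are captured.
    .Destructure.ToMaximumCollectionCount(16)
    // Cap the length of captured strings.
    .Destructure.ToMaximumStringLength(1024)
    .WriteTo.Seq("http://localhost:5341") // assumes a local Seq instance
    .CreateLogger();
```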

It would be interesting to try building a plug-in "safety net" destructuring policy that knows about problematic scenarios such as EF, but it'd be challenging to ever cover enough breadth to be useful. Detecting web-related types (HttpContext) and mapped entities seems like it'd be possible.
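As a very rough illustration of the idea (speculative, not anything that ships): an IDestructuringPolicy could recognize the runtime proxy namespaces used by EF6 (System.Data.Entity.DynamicProxies) and by Castle DynamicProxy, which EF Core's lazy-loading proxies use, and emit a scalar placeholder instead of walking the graph.

```csharp
using Serilog.Core;
using Serilog.Events;

// Speculative "safety net": render suspected ORM proxies as a short scalar
// instead of traversing their (possibly lazily-loaded) object graphs.
class OrmProxyDestructuringPolicy : IDestructuringPolicy
{
    public bool TryDestructure(object value, ILogEventPropertyValueFactory propertyValueFactory,
        out LogEventPropertyValue result)
    {
        var ns = value.GetType().Namespace ?? "";
        if (ns == "System.Data.Entity.DynamicProxies" || ns == "Castle.Proxies")
        {
            var entityType = value.GetType().BaseType?.Name ?? value.GetType().Name;
            result = new ScalarValue($"[ORM proxy: {entityType}]");
            return true;
        }

        result = null;
        return false;
    }
}
```

It would be registered via .Destructure.With(new OrmProxyDestructuringPolicy()), but as noted above, covering every ORM and proxy flavor this way is an uphill battle.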

Still probably not one we can action in this repo, though.

kinetiq commented 6 years ago

Yes, I'm sure lazy loading factors into this, since that would cause additional DB roundtrips.

It's unclear why this would cause such a massive perf issue if there is a depth limit of 10, though - a lot can happen in 10 levels (with, say, five navigation properties per entity, a depth of 10 already allows on the order of 5^10 ≈ 9.8 million traversals), but taking the whole server down still seems like a bit much. I thought there was probably an infinite loop at work... Maybe it was actually loading the entire DB several times via multiple paths. :)

Would there be any merit in a much more conservative default, to prevent accidents like this? Like maybe... 5? Or even 3? Was a default depth of 10 determined to be appropriate, or was it arbitrary?

Any point in moving this to serilog/serilog?

nblumhardt commented 6 years ago

Thanks for the follow-up.

> Maybe it was actually loading the entire DB several times via multiple paths. :)

That's my guess, too :-)

The default is tough to anchor to anything concrete, as structured data comes in all shapes and sizes. Property traversals can have huge side-effects, so there's really no safe level of serialization for types not designed for it.

Looking back, a better approach might have been to limit the absolute number of nodes rather than the tree depth. I think we'd still want the limit high enough that accidental serialization is noticed, but we might have more hope of surviving the very worst cases...

Definitely a worthwhile discussion, if you're keen to shift this over to the core repo. Thanks!

tsimbalar commented 6 years ago

Specifically for Entity Framework, I wonder if it would be possible to create some kind of destructuring policy that would prevent loading data from the db - something that would walk the object graph and, for each node, avoid touching any navigation property that hasn't already been loaded.

I'm still not sure whether that's doable without an instance of the DbContext at hand ...

That wouldn't solve the cycles, though ;)

I still believe the safest way is to disable lazy-loading in your DbContext altogether, but not everybody seems to agree :)
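For completeness, a sketch of what that looks like (MyDbContext is a hypothetical context class; the first form is EF6, and in EF Core lazy loading is off unless you opt in):

```csharp
using System.Data.Entity;

// EF6: disable lazy loading (and proxy creation) in the context constructor.
public class MyDbContext : DbContext
{
    public MyDbContext()
    {
        Configuration.LazyLoadingEnabled = false;
        Configuration.ProxyCreationEnabled = false;
    }
}

// EF Core: lazy loading is off by default and only appears if explicitly
// enabled, e.g. with optionsBuilder.UseLazyLoadingProxies().
```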

nblumhardt commented 6 years ago

Even with special handling, EF entities are rarely suitable for serialization - circular references, etc.

Perhaps another approach would be to add a "blacklist" of sorts to Serilog Analyzer? 🤔

nblumhardt commented 6 years ago

Looking into it, there doesn't really seem to be a reliable way of statically detecting EF types.

Perhaps a better option would be to bite the bullet and accept the overhead of circular-reference detection (along with node-count limits).

kinetiq commented 6 years ago

Bummer that we can't reliably detect EF entities. @tsimbalar's ideas seemed to have some merit (detecting proxy types, etc).

It does seem that finding a better way to ensure we stop iterating at a reasonable point would solve all the things.

I can see why you might need to check for circular references and not just count nodes... How would you do that - a collection of unique hash codes? Its count becomes your node count, and if you ever encounter a non-unique node, you just stop iterating?
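Sketching what that might look like (nothing Serilog actually implements - and using reference identity rather than hash codes, since entities often override Equals/GetHashCode in ways that would confuse the cycle check):

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

static class BoundedGraphWalker
{
    const int MaxNodes = 1_000; // hypothetical node budget

    public static int CountNodes(object root)
    {
        // ReferenceEqualityComparer is .NET 5+; older targets would need a
        // small comparer built on RuntimeHelpers.GetHashCode.
        var seen = new HashSet<object>(ReferenceEqualityComparer.Instance);
        var pending = new Stack<object>();
        pending.Push(root);

        while (pending.Count > 0 && seen.Count < MaxNodes)
        {
            var node = pending.Pop();
            if (node is null || node is string || node.GetType().IsPrimitive) continue;
            if (!seen.Add(node)) continue; // revisited reference: cycle or shared node

            if (node is IEnumerable sequence)
            {
                foreach (var item in sequence) pending.Push(item);
                continue;
            }

            foreach (var prop in node.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
            {
                if (prop.GetIndexParameters().Length > 0) continue; // skip indexers
                try { pending.Push(prop.GetValue(node)); }
                catch { /* getters can throw; ignore for this sketch */ }
            }
        }

        return seen.Count; // bounded even for cyclic or enormous graphs
    }
}
```

The seen set answers both questions at once: its count is the node budget, and a failed Add is the "non-unique node" signal that stops traversal down that path.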

nblumhardt commented 6 years ago

๐Ÿ‘ something along those lines; it'd need substantial thought put into the design/implementation.

nblumhardt commented 6 years ago

Closing this, as I don't think we can address it in the Seq sink and we don't have a great proposal lined up for serilog/serilog - keeping an eye on it as things evolve, though :-)