zzzprojects / EntityFramework.DynamicFilters

Global filtering for Entity Framework.
https://entityframework-dynamicfilters.net/
MIT License
501 stars 86 forks source link

Why filters should come last in the model creation? #170

Open julealgon opened 3 years ago

julealgon commented 3 years ago

Description

The documentation on dynamic filters mention this:

Filters should always follow any other model configuration - including the call to the base.OnModelCreating() method. It is best to make the filter definitions the final step of OnModelCreating() to make sure that they are not in effect until the entire model is fully configured.

However, it is not clear to me what exactly are the consequences of having the filters defined earlier in the method.

Currently, we have an application with a very complex multi-tenant database structure and we are experiencing StackOverflowExceptions in a few cases randomly. I noticed that our own dbcontext does have the dynamic filters for multi-tenancy and soft deletion at the very beginning of OnModelCreating , (which goes against the recommendation) and wondered if this could have anything to do with the huge stack traces we are getting from EF. In some cases, plan compilation goes crazy into tree recursion for 4000+ calls, so something must be generating incredibly deep ASTs for it to visit.

Could you elaborate (and perhaps update the docs) on what exactly are the problems or behaviors that could come from having the filters not at the end of the method?

JonathanMagnan commented 3 years ago

Hello @julealgon ,

Unfortunately, I don't have this answer.

We are not the original author of this library, so some concepts are still missing from us.

I could throw you some reason, but could not be sure either at 100% if I'm right or now. The source is available, so you can search on your side but when I checked, I didn't find any real reason yet.

I don't think in your case even if registering it after that will make a big change.

If you could provide the stack trace (or a project that reproduces it will be even better) for when this error happens, we will be happy to look at it. I might be wrong but It could be a "false" StackOverflowException (when the stack is too depth, this error will be thrown even if this is not a real Stack Overflow). However, this is hard to investigate if we don't know which methods cause this issue.

Best Regards,

Jon


Performance Libraries context.BulkInsert(list, options => options.BatchSize = 1000); Entity Framework ExtensionsEntity Framework ClassicBulk OperationsDapper Plus

Runtime Evaluation Eval.Execute("x + y", new {x = 1, y = 2}); // return 3 C# Eval FunctionSQL Eval Function

julealgon commented 3 years ago

@JonathanMagnan thanks for your response.

Here is a sample stacktrace that we shared with the EF team here: https://github.com/dotnet/ef6/issues/1819

callStack.txt

It seems to get stuck on Plan Compilation phase, probably due to how deep the actual syntax tree becomes (that's the assumption). Our team is trying really hard to investigate this for a few months now without much success, so any help will be hugely appreciated.

I suspected that one possible explanation for it would be that due to the huge amount of multi-tenant models we have in our context, that the where conditions being added by DynamicFilter could be causing this issue, which is why I started looking into the documentation here again.

I might be wrong but It could be a "false" StackOverflowException (when the stack is too depth, this error will be thrown even if this is not a real Stack Overflow).

What do you mean by this? A StackOverflowException is supposedly only thrown due to the stack reaching the limit in the given thread. Can you give more details on this "false" SOE?

JonathanMagnan commented 3 years ago

Here is an example of what I call a "fake" stack overflow:

public static int FakeStackOverflow(int i)
{
    if (i == 100000)
    {
        return i;
    }

    return FakeStackOverflow(i + 1);
}

var x = FakeStackOverflow(0);

A Stack Overflow error will be throw but we can clearly see that everything is valid and the recursion will eventually end. Perhaps that's my wording which is bad. What I mean to say is sometimes a code will end to be solved but due to the stack limit, it cannot go deeper and throw the error.

JonathanMagnan commented 3 years ago

Thank for the additional information, I will assign this request to my developer to let him check if he could find something using your example

julealgon commented 3 years ago

A Stack Overflow error will be throw but we can clearly see that everything is valid and the recession will eventually end. Perhaps that's my wording which is bad. What I mean to say is sometimes a code will end to be solved but due to the stack limit, it cannot go deeper and throw the error.

Ah ok, I see what you are referring to. Yeah, there is a difference between "something that is very deep, but will eventually finish given enough stack size", and an "infinite call stack that is guaranteed to never finish".

Our case is always of the first: an extremely deep stack that is bounded by the original query, but that eventually finishes. However, even with a 1.5MB stack (we changed the default stack size to test) we are still observing some cases where it happens. We believe there must be something really wrong somewhere to cause such extremely long call stacks.

Unfortunately, we can't easily migrate away from full framework + EF6, so we are stuck trying to make it work there.

Thank for the additional information, I will assign this request to my developer to let him check if he could find something using your example

That's really great to hear @JonathanMagnan . Please do let us know if you find anything as it would be super helpful in diagnosing what the issue might be in our application.

JonathanMagnan commented 3 years ago

Hello @julealgon ,

My developer started to look at it but got some issue trying to reproduce it even with the code you provided on the EF6 issue.

Do you think you could provide him a runnable project that compiles and he could run? You can send it in private here info@zzzprojects.com if needed.

julealgon commented 3 years ago

@JonathanMagnan

My developer started to look at it but got some issue trying to reproduce it even with the code you provided on the EF6 issue.

I see. Yeah, the code I posted on EF's project is a sample oversimplification of our model. I lacks several dozen navigation properties models. I was hoping that by seeing the main model and the query, they'd be able to spot something that we might be missing.

Do you think you could provide him a runnable project that compiles and he could run? You can send it in private here info@zzzprojects.com if needed.

Unfortunately, I don't see how I'd be able to share a running code with you. Our model is extremely complex and several levels deep, so even if I attempt to trim it down either it would take too long, or it would become unrealistic again. I can't share the project as-is since it is a proprietary solution.

I was thinking about maybe investigating an approach using some sort of obfuscation library, but I've never used those. Do you have any suggestions on how we could make this work?

JonathanMagnan commented 3 years ago

Honestly, if you cannot share the code or reproduce it in a simple way, we will probably choose to close this issue.

You can sure obfuscate it but then, most classes will still be public so I don't really see the point here.

julealgon commented 3 years ago

...or reproduce it in a simple way

@JonathanMagnan , we can't even reproduce the issue consistently ourselves... the query sample posted in the EF project is just one out of many completely different queries that is randomly causing `StackOverflowExcetion's on our end.

My hope was that your team could share feedback based on the shape of the models/query and the filter being provided. For example, pretty much all of our models have the multi-tentant OrgId field, so the dynamic filter would apply to basically every class and every relation in our entire model. Do you think this could be the cause for these issues?

Even if the relationships are all virtual, will DynamicFilter apply the filtering logic to each and every relationship, or will it just apply it to the "visible" parts of the relationships? Most of our models have several dozen relationships to other models that in turn have several dozen relationships themselves.

This is why I was asking about the position of the filter declarations: would it matter in a complex scenario like this where the filters are defined (at the start of OnModelCreating vs at the end)?

I understand your stance on wanting to close the issue, but please understand our side here: we cannot share the whole project with you, as I said, it is a very complex proprietary project. This would be a lot easier if it was open source.

JonathanMagnan commented 3 years ago

If you cannot share the project, what about adding EF Dynamic Filters source directly in your project?

So you will be able to debug it within the source itself and find out more about what's happening if you can reproduce it from time to time by adding some log or directly via the debugger

so the dynamic filter would apply to basically every class and every relation in our entire model. Do you think this could be the cause for these issues?

I don't think it what cause the issue, but again I might be wrong since we are not able to reproduce it.

julealgon commented 3 years ago

If you cannot share the project, what about adding EF Dynamic Filters source directly in your project?

So you will be able to debug it within the source itself and find out more about what's happening if you can reproduce it from time to time by adding some log or directly via the debugger

That's a good idea. The problem is that the stack overflow doesn't happen inside DynamicFilter calls. It looks like the filters are added at some point, and then the query is executed on top of the modified query, and that's when it fails.

One thing we could do maybe is save the "before" and "after" AST and see how much bigger it gets after passing through the filters. Do you think there would be some sort of extension point we could leverage to do that without having to grab the whole source?