dotnet / EntityFramework.Docs

Documentation for Entity Framework Core and Entity Framework 6
https://docs.microsoft.com/ef/
Creative Commons Attribution 4.0 International
1.58k stars 1.94k forks source link

Improve nullability guidance for DbSets #2520

Closed roji closed 3 years ago

roji commented 3 years ago

Thanks @davidroth for suggesting this.

See https://github.com/dotnet/efcore/issues/21608

roji commented 3 years ago

Note that calling Set<T>() for each query (or rather, DbSet referencing) has some perf impact (e.g. there's an internal dictionary lookup). This can be mitigated by DbContext pooling, or by force-initializing to null! as before.

roji commented 3 years ago

One last trick would be to do the following:

class MyDbContext
{
        public DbSet<Order> Orders { get; }

        public SomeDbContext()
        {
            Orders ??= Set<Order>();
        }
}

Since the constructor initializes Orders, there's no nullability-related compiler warning. ~Since the DbContext base constructor always initializes Orders, Set will never actually be called.~ (the base constructor only initializes properties with setters...)

ajcvickers commented 3 years ago

@roji Not sure I like that. It's adding dead code that will never be called just to satisfy the compiler.

roji commented 3 years ago

I agree it isn't ideal, but there are a number of solutions here, everyone is free to use what they like... this is an unfortunate result of the somewhat odd practice of DbContext using reflection to initialize properties in sub-classes. The solution of having the property call Set() every time is probably OK for the general case.

ajcvickers commented 3 years ago

the somewhat odd practice of DbContext using reflection to initialize properties in sub-classes

Pure magic! I've always been surprised how people are happy to accept that that their DbSets are non-null without questioning how that could be the case.

roji commented 3 years ago

I admit I didn't question it either... until I did...

irontoby commented 3 years ago

I agree it isn't ideal, but there are a number of solutions here, everyone is free to use what they like... this is an unfortunate result of the somewhat odd practice of DbContext using reflection to initialize properties in sub-classes. The solution of having the property call Set() every time is probably OK for the general case.

But isn't this documentation misleading now? You're not specifying that there are "a number of solutions"; you're making a specific recommendation for one of those solutions.

I'd also suggest that the wording of your recommendation is misleading; it mentions "make your DbSet properties read-only and initialize them as follows", but that code doesn't initialize a read-only property. This will likely cause confusion since the difference between => Foo(); and { get; } = Foo(); (an actual initializer) is (IMO) already easy to miss.

roji commented 3 years ago

@irontoby good points, I've submitted #2577 to improve this - let me know what you think.

irontoby commented 3 years ago

@roji agree the updated text is clearer, thanks for listening. Here's hoping that C# 9 source generators will make this problem go away :)

davidroth commented 3 years ago

@irontoby

agree the updated text is clearer, thanks for listening. Here's hoping that C# 9 source generators will make this problem go away :)

I´m not sure what you are hoping for source generators to change regarding DbSet declarations on DbContext? With public DbSet<Order> Orders => Set<Order>(); syntax everything is fine today and its already a very natural and clean way of declaring.

irontoby commented 3 years ago

@davidroth

I´m not sure what you are hoping for source generators to change regarding DbSet declarations on DbContext? With public DbSet<Order> Orders => Set<Order>(); syntax everything is fine today and its already a very natural and clean way of declaring.

As mentioned above, calling the Set method on every property access adds a bit of overhead, and I feel like the public DbSet<Order> Orders => Set<Order>(); syntax adds extra noise just to satisfy the compiler.

With a source generator, the Orders ??= Set<Order>(); syntax could be injected into a default constructor in a partial class, and we can keep the cleaner DbSet<Order> Orders { get; set; } property syntax.

roji commented 3 years ago

@irontoby that looks like https://github.com/dotnet/EntityFramework.Docs/pull/2520#issuecomment-661855000, which you can manually do today. Generating these directives with a source generator and a partial class seems a bit overkill to me for this problem - DbSet properties aren't typically added/removed that frequently - but it's an interesting idea!

davidroth commented 3 years ago

@irontoby

As mentioned above, calling the Set method on every property access adds a bit of overhead

Note that you will have less overhead when using public DbSet<Order> => Set<Order>() instead of eager initializing all properties in the ctor. Usual usage of DbContext is instance per UOW/Web Request where usually you only access a small fraction of the available DbSets on each request. With the expression-bodied getter you lazy access only what you need. Either way, its by no way a really measurable difference, so not worth the discussion. Just wanted to clarify that its definitely not worse than the current approach (eagerly init all properties via reflection).

and I feel like the public DbSet<Order> Orders => Set<Order>(); syntax adds extra noise just to satisfy the compiler.

IMO this is not only to satisfy the compiler. The expression bodied getter approach is also semantically more correct, as you get a immutable get-only property instead of a mutable property, which is a nice side effect. For me this does not feel like noise. It feels like the "correct" way to declare it using that style.

With a source generator, the Orders ??= Set<Order>(); syntax could be injected into a default constructor in a partial class, and we can keep the cleaner DbSet<Order> Orders { get; set; } property syntax.

I´m with roji here, that this feels like a solution for a problem which does not exist. Adding new DbSets isn´t something that happens that often. Also there is very little difference between declaring public Set<Order> Orders { get; set; } over public DbSet<Order> Orders => Set<Order>();. When using a snippet, you even have the same typing effort. As mentioned above, the latter has the nice benefit of immutability ;-)

irontoby commented 3 years ago

@davidroth good points, and agree an immutable get-only property is better; I was looking at it from the standpoint that DbContext is going to do the reflection either way so if I could hide away a bit more magic to make the compiler happy then there's no need to worry about it.

ajcvickers commented 3 years ago

I still don't like the additional guidance. (To clarify, I mean the nullable field example, not the expression body example. The expression body syntax would probably be the default now if it had existed back when EF 4.1 was designed.) The cost of calling .Set is minimal and EF already caches and returns the same instance on subsequent calls. If we really want to avoid this overhead, then why not just set the Set properties in the constructor?

    public CommunitiesContext()
    {
        Communities = Set<Community>();
        People = Set<Person>();
    }

    public DbSet<Community> Communities { get; }
    public DbSet<Person> People { get; }

This is effectively just doing what EF is doing for you.

irontoby commented 3 years ago

@ajcvickers I agree now, based on the discussion above. My main qualm was with the term "initialize" which implied it was only set during construction. Thanks for clarifying everyone.