Query on owned entity produces overly complicated SQL

matteocontrini commented 5 years ago

When querying an entity and filtering on an owned entity the SQL query that is produced includes a LEFT JOIN that could be avoided.

Steps to reproduce

Entities:

class Order
{
    public int Id { get; set; }
    public string Title { get; set; }
    public Address Address { get; set; }
}

class Address
{
    public string Street { get; set; }
    public string City { get; set; }
}

Model configuration:

modelBuilder.Entity<Order>().OwnsOne(x => x.Address);

The database table that is created looks like this:

[image: screenshot of the Orders table]

A simple query like:

context.Orders.Where(x => x.Address == null).ToList();

Produces this SQL:

SELECT o."Id", o."Title", t."Id", t."Address_City", t."Address_Street"
      FROM "Orders" AS o
      LEFT JOIN (
          SELECT o0."Id", o0."Address_City", o0."Address_Street", o1."Id" AS "Id0"
          FROM "Orders" AS o0
          INNER JOIN "Orders" AS o1 ON o0."Id" = o1."Id"
          WHERE (o0."Address_Street" IS NOT NULL) OR (o0."Address_City" IS NOT NULL)
      ) AS t ON o."Id" = t."Id"
      WHERE (t."Id" IS NULL)

Which is overly complicated. The columns Address_City and Address_Street are available on the Orders table without any JOIN.

Same thing when querying a specific owned entity property:

context.Orders.Where(x => x.Address.City == "Rome").ToList();

SELECT o."Id", o."Title", t."Id", t."Address_City", t."Address_Street"
      FROM "Orders" AS o
      LEFT JOIN (
          SELECT o0."Id", o0."Address_City", o0."Address_Street", o1."Id" AS "Id0"
          FROM "Orders" AS o0
          INNER JOIN "Orders" AS o1 ON o0."Id" = o1."Id"
          WHERE (o0."Address_Street" IS NOT NULL) OR (o0."Address_City" IS NOT NULL)
      ) AS t ON o."Id" = t."Id"
      WHERE (t."Address_City" = 'Rome') AND (t."Address_City" IS NOT NULL)

Further technical details

Example project (PostgreSQL): EfCoreOwnedEntity.zip

EF Core version: 3.0.0
Database provider: Npgsql.EntityFrameworkCore.PostgreSQL 3.0.1
Target framework: .NET Core 3.0
Operating system: Windows 10 1903
IDE: Visual Studio 2019 16.3.2

ajcvickers commented 5 years ago

@smitpatel to investigate.

smitpatel commented 5 years ago

Legit generated SQL.

AndriySvyryd commented 5 years ago

@smitpatel I think in these cases we could get rid of the join since the outer filter is more restrictive. However we would need to first profile whether this would result in any measurable perf improvement besides just being a simpler query.

salaros commented 5 years ago

> Legit generated SQL.

@smitpatel Not if you are not using owned entities with table splitting.

For example we are using owned entity for audit-related information

[Owned]
public class AuditLog
{
    [Column(nameof(IsDeleted), Order = 990)]
    public bool IsDeleted { get; set; }

    [Column(nameof(CreatedTime), Order = 991)]
    public DateTime CreatedTime { get; set; }

    [Column(nameof(ModifiedTime), Order = 992)]
    public DateTime? ModifiedTime { get; set; }

    [Column(nameof(CreatedBy), Order = 993)]
    public string CreatedBy { get; set; }

    [Column(nameof(ModifiedBy), Order = 994)]
    public string ModifiedBy { get; set; }
}

We put this entity on many entities (e.g. manufacturers, products, product translations etc.), therefore our EF Core-generated queries are monstrous.

This simple expression

var mfgsWithProducts = dbContext
              .Set<Manufacturer>()
              .Include(m => m.Products)
              .ThenInclude(p => p.Translations)
              .ToList();

results in

SELECT ....
FROM [Manufacturers] AS [m]
LEFT JOIN (
    SELECT ....
    FROM [Manufacturers] AS [m0]
    INNER JOIN [Manufacturers] AS [m1] ON [m0].[Id] = [m1].[Id]
    WHERE [m0].[IsDeleted] IS NOT NULL AND ([m0].[CreatedTime] IS NOT NULL AND [m0].[CreatedBy] IS NOT NULL)
) AS [t] ON [m].[Id] = [t].[Id]
LEFT JOIN (
    SELECT ....
    FROM [Products] AS [p0]
    LEFT JOIN (
        SELECT ....
        FROM [Products] AS [p1]
        INNER JOIN [Products] AS [p2] ON [p1].[Id] = [p2].[Id]
        WHERE [p1].[IsDeleted] IS NOT NULL AND ([p1].[CreatedTime] IS NOT NULL AND [p1].[CreatedBy] IS NOT NULL)
    ) AS [t0] ON [p0].[Id] = [t0].[Id]
    LEFT JOIN (
        SELECT ....
        FROM [ProductTranslations] AS [p3]
        LEFT JOIN (
            SELECT ....
            FROM [ProductTranslations] AS [p4]
            INNER JOIN [ProductTranslations] AS [p5] ON [p4].[Id] = [p5].[Id]
            WHERE [p4].[IsDeleted] IS NOT NULL AND ([p4].[CreatedTime] IS NOT NULL AND [p4].[CreatedBy] IS NOT NULL)
        ) AS [t1] ON [p3].[Id] = [t1].[Id]
    ) AS [t2] ON [p0].[Id] = [t2].[ProductId]
) AS [t3] ON [m].[Id] = [t3].[ManufacturerId]
ORDER BY [m].[Id], [p].[ManufacturerId], [p].[Id], [t3].[Id], [t3].[Id1]

This is ridiculous: since we use owned entities without table splitting, we don't need to left join tables to themselves.

salaros commented 5 years ago

Any news?

ajcvickers commented 5 years ago

@salaros This issue is in the Backlog milestone. This means that it is not going to happen for the 3.1 release. We will re-assess the backlog following the 3.1 release and consider this item at that time. However, keep in mind that there are many other high priority features with which it will be competing for resources.

dahumadaatgmail commented 4 years ago

Any workaround? I make extensive use of owned entities, and the query with left joins on the same table causes timeouts because of its complexity. It could be a simple select *, but EF generates 10 left join queries on the same table. I tried to use FromSqlRaw, but the result is just another join.

salaros commented 4 years ago

> any workaround?

For now we have stopped using owned entities.

dahumadaatgmail commented 4 years ago

Because of my extensive use of owned entities, I can't stop using them. As a workaround I built a database view as a plain object and referenced the view from my context. So far this has been working (the real Order entity has more owned entities than shown):

Entities:

public class Order{
   public string Id{get;set;}
   public Person BillTo{get;set;}
   public Person InvoicedTo{get;set;}
}

[Owned]
public class Person {
  public string TaxID{get;set;}
  public string Name{get;set;}
  public string Address{get;set;}
}

this is the table:

create table Orders (
  Id varchar(100),
  BillTo_TaxID varchar(100),
  BillTo_Name varchar(100),
  BillTo_Address varchar(100),
  InvoicedTo_TaxID varchar(100),
  InvoicedTo_Name varchar(100),
  InvoicedTo_Address varchar(100)
)

Now, I created a view from my table:

create view VwOrders as select * from dbo.Orders;

And in my dbContext I added the view as an Entity:

public DbSet<VwOrder> VwOrders { get; set; }

And in the builder:

modelBuilder.Entity<VwOrder>(eb => {
                eb.HasNoKey();
                eb.ToView("VwOrders");
            });

The VwOrder class:

public class VwOrder {
    public string Id {get;set;}
    public string BillTo_TaxID {get;set;}
    public string BillTo_Name {get;set;}
    public string BillTo_Address {get;set;}
    public string InvoicedTo_TaxID {get;set;}
    public string InvoicedTo_Name {get;set;}
    public string InvoicedTo_Address {get;set;}
    public Order ToOrder(){
        var ret = new Order {
            Id = this.Id,
            BillTo = createEntity<Person>("BillTo"),
            InvoicedTo = createEntity<Person>("InvoicedTo")
        };
        return ret;
    }
}

And the method createEntity:

        private T createEntity<T>(string prefix) where T : new() {
            var datos = new T();
            var myprops = this.GetType().GetProperties().Where(x => x.CanRead).ToDictionary(x => x.Name);
            var props = datos.GetType().GetProperties().Where(x => x.CanWrite).ToArray();
            foreach(var prop in props) {
                try {
                    prop.SetValue(datos, myprops[prefix + "_" + prop.Name].GetValue(this));
                } catch(Exception ex) {
                    Console.WriteLine($"{ex}");
                }
            }
            return datos;
        }

To query, I just made something like:

var order = bd.VwOrders.Where(x=>x.Id == "xx").AsNoTracking().FirstOrDefault()?.ToOrder();

chapinmark commented 4 years ago

@salaros

Did you find that the nested joins hurt your query performance? Mine went from nearly negligible (around 1ms) with ef core 2.2 to 1.5 seconds with ef core 3.0 (200k rows in the table).

salaros commented 4 years ago

> Did you find that the nested joins hurt your query performance? Mine went from nearly negligible (around 1ms) with ef core 2.2 to 1.5 seconds with ef core 3.0 (200k rows in the table).

Yep, especially with orderby and global filters, but as they say:

> Legit generated SQL

I'm planning to create a PR ASAP with a fix for owned entities (without table splitting). Unfortunately EF Core's release policies are very strange, so there is no way to tell if it's gonna make it for 3.1. In general it seems like the main goal is to gradually kill this project.

smitpatel commented 4 years ago

> Legit generated SQL.

Just to be clear for everyone going off on a tangent: my comment says, in other words, "yes, we generated that SQL and we should fix it".

> so there is no way to tell if it's gonna make it for 3.1

This is true not just for EF Core but for any open source project on GitHub: look at the milestone of the issue tracking the item. The milestone tells you in which release it was or will be fixed. This issue is in the Backlog milestone, so it is not going to happen for 3.1. Release 3.1 is already finalized and going through its testing phase.

As for submitting a fix: this issue is not marked as "good-first-issue", so we believe it is a fairly complex issue. We still encourage you to work on it if you wish, but make sure to discuss the design of the fix with us first (by commenting in this issue). If you submit a PR directly and the fix is incorrect, we will not accept the contribution.

ZimM-LostPolygon commented 4 years ago

Inefficient SQL is one thing. But is there a reason for EF to generate non-nullable owned entity property columns as nullable? If I have something like this:


public class Device {
    [Key]
    public string Id { get; set; }

    [Required]
    public DeviceBasicStatistics BasicStatistics { get; set; }
}

public class DeviceBasicStatistics {
    // long is not nullable, yet the column is generated as nullable
    public long ReportCount { get; set; }
}

I would expect ReportCount to default to 0, and since 0 is not null, BasicStatistics will always be created. However, BasicStatistics_ReportCount is generated as nullable, and if the ReportCount is set to null for whatever reason, Device.BasicStatistics is not loaded and remains null, which breaks the expectations.

Is this a separate issue?

ajcvickers commented 4 years ago

@ZimM-LostPolygon See #12100

MorenoGentili commented 4 years ago

I think this should be marked as type-bug instead of type-enhancement. As more rows are added to a table, performance progressively degrades to the point where it becomes unusable. Users may not be fully aware of this problem; maybe Microsoft should issue an official statement discouraging the use of owned types in EF Core 3.0.

My model is identical to @matteocontrini's except for the fact it has 2 owned type properties in my entity class instead of just 1. Here's the query generated by EFCore. It's way too complicated: there are LEFT JOINs of subqueries with nested INNER JOINs.

SELECT "t"."Id", "t"."Author", "t"."Description", "t"."Email", "t"."ImagePath", "t"."Rating", "t"."Title", "t2"."Id", "t2"."CurrentPrice_Amount", "t2"."CurrentPrice_Currency", "t1"."Id", "t1"."FullPrice_Amount", "t1"."FullPrice_Currency"
FROM (
    SELECT "c"."Id", "c"."Author", "c"."Description", "c"."Email", "c"."ImagePath", "c"."Rating", "c"."Title"
    FROM "Courses" AS "c"
    WHERE ((@__model_Search_0 = '') AND @__model_Search_0 IS NOT NULL) OR (instr("c"."Title", @__model_Search_0) > 0)
    ORDER BY "c"."Rating" DESC
    LIMIT @__p_2 OFFSET @__p_1
) AS "t"
LEFT JOIN (
    SELECT "c0"."Id", "c0"."CurrentPrice_Amount", "c0"."CurrentPrice_Currency", "c1"."Id" AS "Id0"
    FROM "Courses" AS "c0"
    INNER JOIN "Courses" AS "c1" ON "c0"."Id" = "c1"."Id"
    WHERE "c0"."CurrentPrice_Currency" IS NOT NULL AND "c0"."CurrentPrice_Amount" IS NOT NULL
) AS "t0" ON "t"."Id" = "t0"."Id"
LEFT JOIN (
    SELECT "c2"."Id", "c2"."FullPrice_Amount", "c2"."FullPrice_Currency", "c3"."Id" AS "Id0"
    FROM "Courses" AS "c2"
    INNER JOIN "Courses" AS "c3" ON "c2"."Id" = "c3"."Id"
    WHERE "c2"."FullPrice_Currency" IS NOT NULL AND "c2"."FullPrice_Amount" IS NOT NULL
) AS "t1" ON "t"."Id" = "t1"."Id"
LEFT JOIN (
    SELECT "c4"."Id", "c4"."CurrentPrice_Amount", "c4"."CurrentPrice_Currency", "c5"."Id" AS "Id0"
    FROM "Courses" AS "c4"
    INNER JOIN "Courses" AS "c5" ON "c4"."Id" = "c5"."Id"
    WHERE "c4"."CurrentPrice_Currency" IS NOT NULL AND "c4"."CurrentPrice_Amount" IS NOT NULL
) AS "t2" ON "t"."Id" = "t2"."Id"
ORDER BY "t"."Rating" DESC

And here's a quick benchmark I performed. The blue line represents a SQL query I typed by hand and the orange line is the query generated by the LINQ provider. As you can see, performance starts degrading very fast as more rows are added to the table. I'm talking about just 2000 rows in a Sqlite database. All needed indexes are in place.

[image: benchmark chart]

VaclavElias commented 4 years ago

I am experiencing the same problem. I was happy to use owned entities until I realised that I had 4 left joins to the same table. I am not going to use them until this is fixed.

msneijders commented 4 years ago

If you use Nested owned types it gets a lot worse.

If you extend the model to:

    class Order
    {
        public int Id { get; set; }
        public string Title { get; set; }
        public Address Address { get; set; }
    }

    [Owned]
    class Address
    {
        public string Street { get; set; }
        public string City { get; set; }
        public PostalCode PostalCode { get; set; }
    }

    [Owned]
    class PostalCode
    {
        public string Area { get; set; }
        public string Zone { get; set; }
    }

then

context.Orders.ToList();

produces (postgresql):

SELECT o."Id", o."Title", t1."Id", t1."Address_City", t1."Address_Street", t5."Id", t5."Address_PostalCode_Area", t5."Address_PostalCode_Zone"
FROM "Order" AS o
LEFT JOIN (
    SELECT t0."Id", t0."Address_City", t0."Address_Street", o3."Id" AS "Id0"
    FROM (
        SELECT o0."Id", o0."Address_City", o0."Address_Street"
        FROM "Order" AS o0
        WHERE (o0."Address_Street" IS NOT NULL) OR (o0."Address_City" IS NOT NULL)
        UNION
        SELECT o1."Id", o1."Address_City", o1."Address_Street"
        FROM "Order" AS o1
        INNER JOIN (
            SELECT o2."Id", o2."Address_PostalCode_Area", o2."Address_PostalCode_Zone"
            FROM "Order" AS o2
            WHERE (o2."Address_PostalCode_Zone" IS NOT NULL) OR (o2."Address_PostalCode_Area" IS NOT NULL)
        ) AS t ON o1."Id" = t."Id"
    ) AS t0
    INNER JOIN "Order" AS o3 ON t0."Id" = o3."Id"
) AS t1 ON o."Id" = t1."Id"
LEFT JOIN (
    SELECT o4."Id", o4."Address_PostalCode_Area", o4."Address_PostalCode_Zone", t4."Id" AS "Id0", t4."Id0" AS "Id00"
    FROM "Order" AS o4
    INNER JOIN (
        SELECT t3."Id", t3."Address_City", t3."Address_Street", o8."Id" AS "Id0"
        FROM (
            SELECT o5."Id", o5."Address_City", o5."Address_Street"
            FROM "Order" AS o5
            WHERE (o5."Address_Street" IS NOT NULL) OR (o5."Address_City" IS NOT NULL)
            UNION
            SELECT o6."Id", o6."Address_City", o6."Address_Street"
            FROM "Order" AS o6
            INNER JOIN (
                SELECT o7."Id", o7."Address_PostalCode_Area", o7."Address_PostalCode_Zone"
                FROM "Order" AS o7
                WHERE (o7."Address_PostalCode_Zone" IS NOT NULL) OR (o7."Address_PostalCode_Area" IS NOT NULL)
            ) AS t2 ON o6."Id" = t2."Id"
        ) AS t3
        INNER JOIN "Order" AS o8 ON t3."Id" = o8."Id"
    ) AS t4 ON o4."Id" = t4."Id"
    WHERE (o4."Address_PostalCode_Zone" IS NOT NULL) OR (o4."Address_PostalCode_Area" IS NOT NULL)
) AS t5 ON t1."Id" = t5."Id"

Indication of performance problem (tested using different model, but equivalent), using a table with 40,000 records:

Efcore 3.1 query: 500 ms
Manual query: 100 ms

If you use a Where filter, the performance difference gets a lot bigger. A filter selecting only 2 records (using index) from the table:

Efcore 3.1 query: 280 ms
Manual query: 1 ms

This makes the owned entity with table splitting feature not useful in practice.

lvmajor commented 4 years ago

Damnit, I found out about Owned types and really thought it was great. I implemented it in a table where I have ~50 properties that are owned types, and the query generated for a simple context.EntityDbSet.FirstOrDefault() has over 330 lines of code and 57 LEFT JOIN! :( 👎

The manual query I would have written would probably have been a one-liner for this simple scenario... That's a real bummer and should be noted in the official documentation about Owned Entity Types.

Moreover, I just tested another small variation and it seems to mess up even more, if I use a variable as a param in a Where or a FirstOrDefault call like .FirstOrDefault(x => x.Id == variableParam)..., the resulting query still contains the 57 Left Join... AND the result set does contain only null values for the owned types... this is really bad lol

salaros commented 4 years ago

@os1r1s110 The strangest thing is that this "feature" made it into .NET Core 3.1, which is LTS, even though this bug was reported back in Oct 2019.

> Legit generated SQL

Yeah, sure, but only if owned entities live on a separate table.

gbrantzos commented 4 years ago

What is truly amazing is the fact that this issue is not considered to be a "bug" but an "enhancement", giving the impression that EF currently works as expected, in an LTS version. I would really like to hear from the team the criteria that qualify an issue as a bug.

lvmajor commented 4 years ago

To be fair, they have done a lot of work on EF Core and it's mostly really good. This one might have fallen through the cracks, and I just hope they will get a chance to revisit it to make this feature useful :)

I just saw that the main person involved is on vacation, so let's not hope for a quick response here.

I don't know who else could be looking at it ...

gbrantzos commented 4 years ago

@os1r1s110 I truly respect the work and effort put into EF Core, and I also accept the fact that there might be other issues with higher priority. I don't judge anybody; I am just curious about what qualifies as a bug and what does not. To me, it's crystal clear that this behaviour is not working as expected; it's definitely a bug.

davidames commented 4 years ago

This is the worst sort of bug - a latent bug that lulls you into a false sense of security that you are doing the right thing and everything is OK until BANG! Your queries are taking 2 minutes to execute and your system grinds to a halt.

ajcvickers commented 4 years ago

Discuss with @smitpatel and @AndriySvyryd

walik92 commented 4 years ago

I have the same problem: the query is overly complicated and, moreover, the result is incorrect. Steps to reproduce:

Entities:

public class CarDto : IDto
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public EngineDto Engine { get; set; }
    public Guid LanguageId { get; set; }
}

public class EngineDto
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

Model configuration:

modelBuilder.Entity<CarDto>().Property(q => q.Id).HasColumnName("CarId");
modelBuilder.Entity<CarDto>().Property(q => q.Name).HasColumnName("Name");
modelBuilder.Entity<CarDto>().Property(q => q.Description).HasColumnName("Description");

modelBuilder.Entity<CarDto>().OwnsOne(q => q.Engine, x =>
{
    x.Property(q => q.Id).HasColumnName("EngineId");
    x.Property(q => q.Name).HasColumnName("EngineName");
});

modelBuilder.Entity<CarDto>().ToView("vCars");

The database view looks like this:

The data looks like this:

The query in LINQ looks like this:

var data = _db.Set<CarDto>().Where(q => q.LanguageId == languageId).ToList();

This query produces SQL, which looks like this:

SELECT [v].[CarId], [v].[Description], [v].[LanguageId], [v].[Name], [t].[CarId], [t].[EngineId], [t].[EngineName]
FROM [vCars] AS [v]
LEFT JOIN (
    SELECT [v0].[CarId], [v0].[EngineId], [v0].[EngineName], [v1].[CarId] AS [CarId0]
    FROM [vCars] AS [v0]
    INNER JOIN [vCars] AS [v1] ON [v0].[CarId] = [v1].[CarId]
    WHERE [v0].[EngineId] IS NOT NULL
) AS [t] ON [v].[CarId] = [t].[CarId]
WHERE [v].[LanguageId] = @__languageId_0
-- @__languageId_0 uniqueidentifier = '9EA1AD19-2C42-4755-B837-701C39E41D37'

and result is:

There are eight rows; there should be only two, like this:

thijscrombeen commented 4 years ago

Any update on when this is going to be resolved?

ajcvickers commented 4 years ago

@thijscrombeen We're investigating options for releasing before 5.0.

ajcvickers commented 4 years ago

Update on this: we spent a long time figuring out if we could fix this in a patch release with sufficiently low risk. A complete fix is not looking very feasible, but we're still looking at tactical fixes for some cases.

Putting this in 3.1.x for that work. /cc @smitpatel

VaclavElias commented 4 years ago

How come it got broken? Is it actually going to be redone completely, since whatever was working before is not fixable anymore?

smitpatel commented 4 years ago

@VaclavElias - It is not broken. It is an altogether different thing. In previous versions of EF Core, owned entities were required, hence it generated simpler SQL. Due to #9005, they are now optional, so we need to add more checks in the SQL to make sure we get correct results back from the server, which makes it more complicated. In order to go back fully to the previous version's behavior, #12100 is required, and you would need to configure the model accordingly.

smitpatel commented 4 years ago

Filed #19932 for the fix which was added to the patch.

ajcvickers commented 4 years ago

Update on this issue

Background

Support for owned entities was introduced in EF Core 2.0. At that time we received significant feedback that forcing owned entities to be required was very limiting. Based on this we made owned relationships optional in EF Core 3.0.

What went wrong

The concept of what constitutes an "optional owned entity" is nuanced. (Or maybe it isn't, but I for one have had trouble getting my head around what it really means, both in terms of behavior and mapping.) Regardless, in adding the flexibility to allow owned entities to be optional, we didn't appropriately take into account the degraded queries produced in cases where the owned entity does not need to be optional.

So, while maybe the people asking for optional owned entities are happy, all the people who didn't need them to be optional are now seeing much worse queries.⁽¹⁾

What we should have done

We should not have replaced required owned entities with optional owned entities. We should have kept required owned entities and allowed optional owned entities to be configured. (We could also have changed the default as long as it was possible to get back to the old behavior.)

So why didn't we do this here? Because:

We didn't realize the degree to which the change would degrade queries and how impactful this would be. As described above, we misinterpreted feedback.
We thought that optional owned entities was an important feature to deliver. (This may still be true--since we did deliver it, we're not seeing feedback from anyone who would have said something if we hadn't done it.)
Supporting both across the stack is significantly more work than supporting one or the other.

So, in retrospect, we should have punted optional owned entities for 3.0 and tackled in a future release when we could at the same time keep supporting the old behavior.

What we are doing now

We have been investigating what we can do to improve these queries in a 3.1.x patch release. For example, see #19932.

Unfortunately the interaction between the model shape and the query pipeline is not making it easy to find tactical fixes that are suitable for patching. We will continue to pursue this, but it won't fix all the queries that were degraded by this change.

We're scheduling support for both optional and required owned entities in 5.0 for November. See #12100. We realize that November is a long time to wait. We are working hard to ensure our daily builds and previews are high-quality, so this may be an option for some people.

Looking forward

This kind of retrospective analysis is part of our ongoing development process. We made mistakes here, but as always we're learning and will feed these lessons into future design and planning. As always, we welcome constructive feedback on anything here. If you have other ideas for things we could do here, then please let us know!

Footnotes

⁽¹⁾ I usually refer to this as a "grass is greener" scenario. That is, given a bunch of people using something, those who are unhappy generally make a lot more noise than those who are happy. If all these people say, "we need it this way instead!" then it can start to seem, psychologically, that we made the wrong choice. So because everybody is saying that the grass is greener over there, we believe this and go with it.

But this is obviously flawed. Because the people happy with the current grass aren't being vocal about it. So it's not really everybody at all.

In cases like this we sometimes only really realize that we made a mistake when we've already jumped the fence and now everybody is saying, "Hey! The grass back where we came from was much better!"

lvmajor commented 4 years ago

@ajcvickers Thanks for the clarification/update.

I would have a question for you though. As I understand it, you seem to say it's one or the other (non optional owned entities with good queries, or optional owned entities with the bad queries....) Is that really mutually exclusive?

> We thought that optional owned entities was an important feature to deliver. (This may still be true--since we did deliver it, we're not seeing feedback from anyone who would have said something if we hadn't done it.)

I for example, am one of the people who think that optional owned entities are an important feature (really important IMO) as I want to use it extensively.

A simple example from my case is saving test results for a given entity, where not all tests are mandatory (depending on whether the user has selected them or not). Instead of saving the generated report as a PDF, I save it in a table with owned entities (one for each test result, which embeds some other metadata about the test and result). That wouldn't be possible at all if all owned entities were required. Now I wonder, though, why it isn't possible to have optional owned entities AND generate reasonable queries. In my use case, if I wrote the query manually, I could easily select all the owned entities with a one-liner, SELECT * FROM <table> where Id = x, without requiring complex join queries. But if I rely on EF Core to generate it, it creates a really suboptimal query, as already mentioned... Wouldn't there be a way to generate these simple queries?

TBN: I am super grateful for all the work that's been done in EF Core and I really don't want this to be taken as a "complaint" or anything. I'm legitimately trying to understand what prevents the query from being as simple as mentioned above when using a LINQ query like the following: _dbContext.TestResults.Where(x => x.Id == <id>).FirstOrDefault()

Thanks in advance!

ajcvickers commented 4 years ago

@os1r1s110 I'll talk to the team and check on the technical details. You may be correct, but I expect if that's the case, it still requires significant changes to the model/query pipeline equivalent to supporting both. (We will, of course, strive for good queries for both cases.) This is also complicated by the difference between owned entities sharing a table (table splitting) and owned entities mapped to their own table.

(I'm very much acting as a manager here. :-) @AndriySvyryd and @smitpatel understand the complications much better than I do.)

ErikEJ commented 4 years ago

Does "overly complicated" also mean bad performing?

salaros commented 4 years ago

> Does "overly complicated" also mean bad performing?

This thread contains several benchmarks showing how query performance degrades with each new .Include(). In my case I had the same owned type on almost all my entities (an audit log for tracking who created or last modified those entities, and when). EF Core-generated queries were so slow on .NET Core 3.1 LTS that I had to completely change the data model I had been using successfully since .NET Core 2.0 through all the updates.

davidames commented 4 years ago

Thank you for the additional context @ajcvickers - this is obviously a complicated issue for your team and I'm sure everyone here appreciates the effort you are putting in.

I'm sure there is a really good reason why you can't do this, but the naive solution would be to push the optional processing up into C# land?

E.g., this is similar to what you are generating at the moment:

SELECT t1.a, t2.owned_b, t2.owned_c
FROM SomeTable t1
LEFT JOIN (
    SELECT t.id, t.owned_b, t.owned_c
    FROM SomeTable t
    WHERE t.owned_b is not null and t.owned_c is not null
) t2 on t1.id = t2.id

I think the whole point of that complexity in the query is to say "if any property is null in the db, make the entire owned object null".

I would think you could achieve the same thing with

SELECT t1.a, t1.owned_b, t1.owned_c
FROM SomeTable t1

and some C# filtering.
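For illustration, the C# side of that idea could be sketched roughly as below. This is a minimal, hypothetical sketch and not how EF Core actually materializes owned entities: MaterializeOwned and the tuple shapes are made-up names, and the null rule is simply copied from the WHERE clause in the example above (the owned object exists only when both owned columns are non-null).

```csharp
#nullable enable
using System;

// Hypothetical rows, shaped like the result of the flat
// "SELECT t1.a, t1.owned_b, t1.owned_c FROM SomeTable t1" query.
var rows = new (int A, string? OwnedB, string? OwnedC)[]
{
    (1, "x", "y"),    // both owned columns set
    (2, null, null),  // no owned data stored
};

foreach (var row in rows)
{
    var owned = MaterializeOwned(row.OwnedB, row.OwnedC);
    Console.WriteLine(owned.HasValue
        ? $"{row.A}: owned = ({owned.Value.B}, {owned.Value.C})"
        : $"{row.A}: owned = null");
}

// Assumed rule, mirroring the subquery's WHERE clause: if either owned column
// is NULL, the whole owned object is treated as absent (null).
static (string B, string C)? MaterializeOwned(string? ownedB, string? ownedC)
{
    if (ownedB is null || ownedC is null)
    {
        return null;
    }
    return (ownedB, ownedC);
}
```

This only covers materialization on the client; filters on the owned object (like the Where examples earlier in the thread) would still have to be translated to SQL somehow.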

I don't mean to tell your team how to "suck eggs" - obviously you know a lot more about this problem space than we do, but maybe sharing why this is so difficult would help us understand, as I'm sure I'm not the only one wondering this.

julielerman commented 4 years ago

I'm talking with a team that has SAME modeling as Dec 24 @msneijders (https://github.com/dotnet/efcore/issues/18299#issuecomment-568738570,) multiple levels of owned entities...and a simple query generating same monstrous SQL. After looking this over with them I agree with their solution to just use Dapper for the horribly performing queries. They do want to use and stay with EF Core, so maybe someday they'll be able to switch back. This also HUGELY impacts my conversation with devs all over the world about DDD & EF Core so I look forward to the fix. Finally, please add this to the breaking changes page in the docs.

brenwebber commented 4 years ago

@salaros

> Did you find that the nested joins hurt your query performance? Mine went from nearly negligible (around 1ms) with ef core 2.2 to 1.5 seconds with ef core 3.0 (200k rows in the table).

Me too. Started getting timeouts on entities with many owned entities.

brenwebber commented 4 years ago

We are in the middle of a large DDD project using EF Core. We have recently upgraded to 3 and all our developers are complaining about the SQL generated from this issue. It has a massive impact.

Optional owned types are a nice-to-have for us, and I was excited for this feature, but at least that limitation had workarounds. The queries generated now for our required owned types (the majority) are not great. Creating views is a desperate solution and certainly not logical for everything, so we are going to forge on and hope/pray that there is a fix soon.

julielerman commented 4 years ago

@brenwebber This is from a conversation I had with a client just last week, and this is their plan: if your EF calls are not already isolated/encapsulated and you have time/resources, refactor to isolate queries in their own repos/classes (always good to have that stuff separate from biz logic anyway). Find the queries that are causing pain with the change, separate them out further, then switch them all to Dapper. (Not saying "all queries for the app should be in one class", just standard separation of concerns to break things apart.) At some point, when this is sorted out, it will be easier to switch those back to EF...if you want to.

When I'd suggested EF Core/Raw SQL/Views, they said that they'd done some comparisons and found Dapper, in many cases, to provide better perf that was significant enough to stick with Dapper. Also guessing you are doing views mapped to entities, not keyless entities, because those have limitations around owned entities. I need to write a blog post but haven't had a chance yet. Hope this is useful and not stating the obvious.

brenwebber commented 4 years ago

I am still holding out for a fix in 3.1.3. We still have some time before go live and the UAT performance (under low loads) is still fine.

I will re-evaluate in a month or 2 and if there is still no resolution, creating some Dapper based query services sounds like a better work around. Thank you @julielerman, I appreciate the advice.

smitpatel commented 4 years ago

@brenwebber - This is not going to be fully resolved in 3.1 release. Too risky change to put in a patch. cc: @ajcvickers

brayannluiz commented 4 years ago

I'm having the same issue as our fellow friends here. I'm moving most of my queries to Dapper. To be honest, once I finish, I'm pretty sure I'm not gonna change it back to EF. It's too much work.

julielerman commented 4 years ago

FWIW, I haven't personally compared perf of EF Core/Views/Raw SQL to Dapper. I played with it back with EF6 but not since. I'm definitely curious and I know that team I talked with was going to do some comparisons for their own queries.

rafaelfgx commented 4 years ago

This problem should not be treated as an improvement, but treated as a bug, because performance is an essential concern in an ORM. The simplest example highlights this bug, then indicates that the feature was developed, but has not been tested. Until resolved, Owned Types must be a feature NOT recommended in the official documentation.

davidames commented 4 years ago

In addition to @rafaelfgx's comments on performance being an essential concern for an ORM, we are not talking about it being 20% slower or even 500% slower - we are talking about a bug which can easily cause an entire system to crash.

brenwebber commented 4 years ago

@smitpatel I would be very happy with a "partially" resolved / workaround / opt-in solution (using decorators or a linq extension etc.). We would prefer to stick with a single ORM as far as possible, and we still have some time before this becomes a huge issue for us, but more queries are being written every day so we are kind of banking on a solution in the next 6 months.

gbrantzos commented 4 years ago

It's truly amazing! After 5 months, 21 participants in this conversation, graphs that prove the performance degradation, and too many comments to count, this issue is still not considered a bug! I don't care if the grass is green or not, but I was under the impression that producing a valid and decent query to fetch data is not an "enhancement".

Maybe I'm wrong...

ShenZZ commented 4 years ago

My projects have been upgraded from 2.2 to 3.1, which is embarrassing: now the question is whether to publish or wait 7 months for .NET 5!
