dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.75k stars 3.18k forks source link

Option to turn off automatic setting of navigation properties #11564

Open dgxhubbard opened 6 years ago

dgxhubbard commented 6 years ago

From loading related data EF Core page is the following tip:

Entity Framework Core will automatically fix-up navigation properties to any other entities that were previously loaded into the context instance. So even if you don't explicitly include the data for a navigation property, the property may still be populated if some or all of the related entities were previously loaded.

I would like an option to turn this property off so that if I choose not to load a navigation property it is not loaded for me. The cost of transporting data that is not needed is slowing response time. If there are 70 gages and each gage has a status then that status has 70 gages. There is a reference looping option mentioned it the loading related data article which I did try but the status object still contained the list of gages which I verified through postman.

Below is an example showing the loading of data and clearing the navigation properties that are automatically "fixed up" by EF Core. I am going to need this code in all of my gets in order to drastically reduce the JSON payload size,.

When we use an Include we know the data that is needed for an object. "Fixing up" is something that is not needed. Please remove this feature or please give us a way to disable it.

Code to load data:

try
{
  using ( var context = new MyContext() )
  {
    res =
        ( from c in
            context.Gages.
                 Include ( "Status" )
             select c ).ToList ();

    // code to clear nav property loaded by EF Core
                res.ForEach ( 
                    g =>
                    {
                        if ( g != null && g.Status != null && g.Status.Gages != null )
                            g.Status.Gages.Clear ();
                    } );

  }

}
catch ( Exception ex )
{
    throw ex;
}

Entities

[Table("Gages", Schema = "MYSCHEMA" )]
class Gage
{
#region Properties

    [Key]
    [System.Runtime.Serialization.DataMember]

    public virtual int Gage_RID
    { get; set; }

    [Required]
    [StringLength(255)]
    [System.Runtime.Serialization.DataMember]

    public virtual string Gage_ID
    { get; set; }

    [System.Runtime.Serialization.DataMember]

    public virtual int Status_RID
    { get; set; }

    #endregion

    #region Navigation Properties

    [System.Runtime.Serialization.DataMember]
    public virtual Status Status
    { get; set; }

    #endregion

}

[Table("Status", Schema = "MYSCHEMA" )]
public class Status
{
    #region Properties

    [Key]
    [DatabaseGenerated ( DatabaseGeneratedOption.Identity )]
    [System.Runtime.Serialization.DataMember]

    public virtual int Status_RID
    { get; set; }

    [Required]
    [StringLength(255)]
    [System.Runtime.Serialization.DataMember]

    public virtual string StatusName
    { get; set; }

    #endregion

    #region Navigation Properties

    [System.Runtime.Serialization.DataMember]

    public virtual ICollection<Gage> Gages
    { get; set; }

    #endregion
}

public partial class MyContext : DbContext
{
    public virtual DbSet<Gage> Gages
    { get; set; }

    public virtual DbSet<Status> Status
    { get; set; }

}
smitpatel commented 6 years ago

Or you could just use ViewModels...

dgxhubbard commented 6 years ago

We do use view model in both wpf and angular. what would a view model do to reduce the size of the data that if being set by ef core

dgxhubbard commented 6 years ago

In the wpf case you have references sitting around so they don't take up that much space. but sending the data over json increases the size drastically

smitpatel commented 6 years ago
var res =
( from c in
context.Gages
select new GageModel {
    .... // Assign scalar properties
    Status = c.Status
} ).ToList();

No references will be fixed up.

dgxhubbard commented 6 years ago

you referenced a view model rather that the getting of data. status is an object that is a nav property that has not been retrieved by ef so would this not set a null to status

dgxhubbard commented 6 years ago

we have our scalar props, these would be retrieved by the select, but if status had not been queried by an include then it would be null.

smitpatel commented 6 years ago

If you are referencing c.Status in your DTO then it will be populated by EF Core without needing any include.

dgxhubbard commented 6 years ago

ok but wouldn't ef core still fix up the Gages nav property on Status, getting back to the same point?

smitpatel commented 6 years ago

Where is the Gage being loaded in the query? Query is creating objects of GageModel & Status only.

dgxhubbard commented 6 years ago

I just tried your suggestion and looked at the json in postman and it works. But I would still want to have the option to turn this off ""Fixing things up". Its not a desirable feature for us. Yes this is a legitimate work around, but in our case Gage and Status are the DTO. So the objects have been created by the EF query why recreate more objects?

dgxhubbard commented 6 years ago

I have worked with EF prior to EF Core and maybe it is me being used to explicitly saying this is what I want using multiple includes and "." notation such as .Include( "mynav1.sub1" ).

ajcvickers commented 6 years ago

@dgxhubbard Few things to note here:

All that being said, we will discuss this in triage and consider if there if a different mode for Include would be useful.

Tarig0 commented 6 years ago

Isn't there a way to ignore a property when serializing to json?

ajcvickers commented 6 years ago

@Tarig0 Of course: see https://github.com/aspnet/Mvc/issues/4160

dgxhubbard commented 6 years ago

I know in EF 6 that the properties that are not included are not loaded, and EF6 honors what should be and not be loaded. There is no request for loading the Gages property of Status, but the property is being loaded. There are cases when these properties will be loaded so turning it off by ignoring in json is not the answer.

When serializing to return to front end we want a list of gages with their status. Suppose there are a 100 gages returned to the front end, and each has a status, with the behavior or EF Core the size of the payload will be increased by 100 fold.

If include is honored this would be a great solution. I proposed a boolean to turn off the fixup if there if include code cannot be made to behave like EF 6.

The doc says:

Entity Framework Core will automatically fix-up navigation properties to any other entities that were previously loaded into the context instance. So even if you don't explicitly include the data for a navigation property, the property may still be populated if some or all of the related entities were previously loaded.

My interprtation of this is the first query, for gages, the list returned would be just the gages and their status (no Gages would be set on the status). However, on the second query and every query after, for gages, the Gages property will be "fixed up". So, except on the first query, the json payload in the 100 gages case will always be 100 times larger than the original request.

ajcvickers commented 6 years ago

@dgxhubbard EF6 and EF Core behave in the same way here. Also, you mention a first and a second query, but your code above only shows a single query.

dgxhubbard commented 6 years ago

Code is part of a web api, so there will be multiple queries executed against it. On the first query the connection will store the gages for the next request. Then every request after it will Fixup the object gragh so that Gages of Status will be filled with items. I will double check my EF6 code.

dgxhubbard commented 6 years ago

I am sorry you are right it does have this behavior. I would not have thought that because of serialization over the wire

I will go back to my original comment on that. In one case I have a WPF app that makes calls against the database, using EF6. In this case there are references to the gages set in the Gages propety and these are in memory, so the size increases but not as much as the serialization in json. Using json, the gages are sent out over the wire so they are not sent as a reference but as a gage and its properties, so the json payload will be roughly increased in size by the number of gages that are returned.

I added the code mentioned in the article

            .AddJsonOptions (
                options =>
                {
                    options.SerializerSettings.ReferenceLoopHandling = 
                            Newtonsoft.Json.ReferenceLoopHandling.Ignore;
                    options.SerializerSettings.ContractResolver = new DefaultContractResolver ();
                } );

and checked the result in postman, and each gage, in Gages property of Status, the properties of the gage are listed.

dgxhubbard commented 6 years ago

Also in the mock code here I have only listed minimal properties there are more than properties on Gage and Status.

ajcvickers commented 6 years ago

@dgxhubbard In the code above, a new context is created for each query, so the context will not be tracking anything each time the query executes.

Tarig0 commented 6 years ago

Unless you're setting the context as a singleton.

ajcvickers commented 6 years ago

@Tarig0 The code above uses "new".

Tarig0 commented 6 years ago

Ah yes lost the context

dgxhubbard commented 6 years ago

Tracking anything? I don't understand. A new context is created, but doesn't the context use results that have already been cached? If that is where you are going

dgxhubbard commented 6 years ago

I do know the Gages property on Status is filled in on the first query

ajcvickers commented 6 years ago

@dgxhubbard Nope.

dgxhubbard commented 6 years ago

My test with the EF 6 code and EF core both showed the same thing. The Gages property is set. This test was start the web api project and hitting with one request from postman. There were no other requests. If there is a problem with what I am doing or interpreting the result please correct me. Sorry this uses full code and there is json in the txt file.

Gages.txt

ajcvickers commented 6 years ago

@dgxhubbard context.Gages.Include("Status") means for every Gage, also load the Status relationship. The relationship consists of one foreign key property and two navigation properties--Gage.Status and its inverse Status.Gages. When loading the relationship both navigation properties are set.

dgxhubbard commented 6 years ago

We are running EF Core 2.1 and according to the doc for lazy loading if you call UseLazyLoadingProxies, which we do, then any virtual nav property will be lazy loaded. If I am wrong please correct me. My example above is pseudo and leaves out the important face that we are calling UseLazyLoadingProxies. I apologize for that.

dgxhubbard commented 6 years ago

Here is our factory method to create a context.

public static MyContext Create ( bool useLogger = false )
        {

            var optionsBuilder = new DbContextOptionsBuilder<GtContext> ();
            if ( optionsBuilder == null )
                throw new NullReferenceException ( "Failed to create db context options builder" );

           optionsBuilder.UseSqlServer ( MsSqlConnectionString );
            optionsBuilder.UseLazyLoadingProxies ();

            if ( useLogger )
                optionsBuilder.UseLoggerFactory ( DbContextLoggingExtensions.LoggerFactory );

            var context = new MyContext ( optionsBuilder.Options );
            if ( context == null )
                throw new NullReferenceException ( "Failed to create context" );

            // disable change tracking
            context.ChangeTracker.AutoDetectChangesEnabled = false;
            context.ChangeTracker.QueryTrackingBehavior = QueryTrackingBehavior.NoTracking;

            return context;
        }
ajcvickers commented 6 years ago

@dgxhubbard I don't follow how lazy loading is relevant to this. I would suggest that if the repro code you posted does not demonstrate the issue you are seeing, then it would be very useful to post a full repro that does what your product code does so that we can reproduce and understand the issue.

dgxhubbard commented 6 years ago

Why wouldn't lazy loading be relevant here. Gages is a navigation property and it is virtual. If lazy loading is enabled then Gages would not be loaded.

EF Core will then enable lazy-loading for any navigation property that can be overridden--that is, it must be virtual and on a class that can be inherited from. For example, in the following entities, the Post.Blog and Blog.Posts navigation properties will be lazy-loaded.

Lazy loading is delaying the loading of related data, until you specifically request it. So is that not relevant here? Here we don't want Gages property loaded it is virtual hence it should not be loaded. Am I wrong?

ajcvickers commented 6 years ago

@dgxhubbard Lazy loading doesn't work like that. It never prevents or delays a relationship being loaded when that relationship is specified to be eager loaded using Include.

dgxhubbard commented 6 years ago

But the include is only for the Status property, not Status.Gages. Refer to Include

Here the remarks indicate that the property must to dot referenced for it to be included:

To include a reference and then a reference one level down: query.Include(e => e.Level1Reference.Level2Reference).

Is my interpretation correct here?

ajcvickers commented 6 years ago

@dgxhubbard Status and Gages are the navigation properties for a single relationship. Include tells EF to load the relationship, which involves populating both navigation properties.

Maybe it would be easier to think about it from the relational side. Each Gage row in the database has one FK value that indicates it is related to a Status with that PK value. So, this means that that Status row and that Gage row are related. It follows that when entities are created from these rows that the two entities are related. This is reflected in the object graph by setting the Gage.Status property to reference the related objects in one direction. and adding the Gage to the Status.Gages collection to relate the objects in the other direction. These are both representations of the same relationship.

dgxhubbard commented 6 years ago

Thank you very much for the time and effort to explain this. The end result of this I will need to clear the Gages property in the web api code in order not to have it set. We want to send back, what we consider to be the primary properties of the Gage. I wish there were some extension method that said this is the nav property to load and don't follow the nav propeties of the item I am referencing.

Tarig0 commented 6 years ago

Seems to me like changing the json serialize to a breadth first approach would smooth out that json file. But there is no duplication of information there. Another approach is limiting the levels you serialize, but that's just as difficult to maintain as clearing the lists.

I also see that you are also loading the Supplier reference which is what really starts making the json returned a mess.

dgxhubbard commented 6 years ago

yes it does. supplier is just like status with a gages list. The gages list is there to tell ef core there is a relationship, that's it. that's why I would like tighter control of what get loaded. With either an option to say please don't load nav properties in associated items like status and supplier or a new extension method. Something that does not break code for people who depend on the loading of associated properties and something for those who don't want them loaded

ajcvickers commented 6 years ago

@dgxhubbard If you don't want the Gages property, then remove it. EF doesn't need it.

dgxhubbard commented 6 years ago

How then does ef know the one to many relationship. We use attributes to tell ef about the relationships.

ajcvickers commented 6 years ago

@dgxhubbard Just remove it--that relationship looks like it will be one-to-many by convention. If you want to explicitly configure it, then something like:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder
        .Entity<Gage>()
        .HasOne(e => e.Status)
        .WithMany();
}
dgxhubbard commented 6 years ago

I looked on this reference. I thought I had to have both ends of the relationship. That is convention 4 in the article. So the convention can be just on Gage (convention 1 in the article). That's great thank you.

Configure one to many relationships

dgxhubbard commented 6 years ago

There will be cases when we need the Gages that have a Status. So removing the Gages is a good work around for limiting the payload returned. However, for a case where you need the link on both ends then it is not ideal. So its a tradeoff in design. Smaller payload versus functionallity for rare cases.

ajcvickers commented 6 years ago

Discussed in triage and decided to move this to the backlog as a general idea for providing more flexibility in deciding which navigation properties should be populated. We need to be careful that we don't further confuse things when tracking is being used, but for no-tracking queries it is likely relatively safe to only set one side of the relationship and not the other.

Notes for others reading this thread:

Tarig0 commented 6 years ago

@dgxhubbard

There will be cases when we need the Gages that have a Status.

If the goal is to get the list of gauges with the status X then the following should be fine.

using ( var context = new MyContext() )
  {
    res =
        ( from c in
            context.Gages.Include(c=>c.Status) // include only needed if you require info about the status outside of the Status Name
where c.Status.StatusName == "Active"
             select c ).ToList ();
}

I typically use the ling version of this so forgive any syntax issues but that should give you a list of Gages based on their status.

This way you don't need the inverse property.

dgxhubbard commented 6 years ago

Thank you. That is going to be one of our workarounds. For now we are going to just clear nav lists we do not need. When we delete a status then all the gages that have that status will need to be pulled, and since we are pulling the status anyway leaving Gages alone meets our needs better (since it is one pull from the database).

habefvd01 commented 6 years ago

I also think it is better there is a way to disable the "Fix-up" properties. I meet the situation that sometimes I need get both parent entity and associated child entities data which I just explicitly use .include method is OK. but sometimes I only need get parent entity data, but even if I didn't use .include method,just use something like _db.parentEntity.FirstOrDefault(...) it also automatically get the child entities data together.I have to do extra work to exclude the child entities.

fabricejumarie commented 6 years ago

@dgxhubbard thank a lot about your comment. I had the same issue on my application, my repository included unexpected navigation properties in the result of my request and as I have circular references in my data model, I retrieved several Mb of data instead of a couple of Kb. The solution is to either do not track entities in previous requests (AsNoTracking) or singletonized your dbcontext if you use repository pattern. Do you know if the issue is solved in the last version of EF Core 2.1.3?

eltiare commented 6 years ago

I, too, would like for a feature like this. Having to manually go through and remove preloaded relations that I have not requested would save a lot of headaches for serialization.

codism15 commented 5 years ago

I am using EF Core 2.1.4 and I am having this problem. The following code will recreate the issue:

var result = db.Child
  .Include(c => c.Master)
  .FirstOrDefault();
return Json(result);

If both child and master has the corresponding navigation property, the result will choke the serialization with exception:

Newtonsoft.Json.JsonSerializationException: Self referencing loop detected with type...

If you put a break point and inspect the result before it is returned, you can see both child and master's navigation property are populated with references to each other. What's worth noticing is that instead of having all of its children, master entity only has the child loaded in this query, which ultimately defies the purpose of auto fixing as it is giving an incomplete view of the data.

We really need a way to disable this feature described in loading the related data

Entity Framework Core will automatically fix-up navigation properties to any other entities that were previously loaded into the context instance. So even if you don't explicitly include the data for a navigation property, the property may still be populated if some or all of the related entities were previously loaded.