Open aradalvand opened 4 years ago
i will track this down in the new implementation for 11.
It should just work, because we prjoect like this:
{
books {
name
author {
name
}
}
}
.Select(x=> new BookDto
{
Name = x.Name,
Author = new AuthorDto
{
Name = x.Author.Name,
}
})
@PascalSenn Ok, great to hear. Then we'll have to wait for v11 before we can deploy our app so that this issue is fixed until then. Thanks.
It should just work, because we prjoect like this:
Well, as I mentioned here, running the code that you said that Hot Chocolate would generate from the GraphQL query:
queryableDto.Select(bDto => new
{
Name = bDto.Name,
Author = new
{
Name = bDto.Author.Name,
}
}).ToList();
would NOT generate the SQL that Hot Chocolate causes to be generated. I talked about this in the original comment and also here. So I think that's probably not exactly how the projection works in Hot Chocolate. And I feel like this might be a little mistake or something somewhere, it's probably not that big of a deal on your side, I don't know, I'm just guessing! But it is, however, a very big deal for our side! Since all the data of a related entity gets fetched unnecessarily when you request even just one column. So, I'd appreciate it if you look into this in the coming days! Thank you so much.
Does this query generate the desired result?
queryableDto.Select(x=> new BookDto
{
Name = x.Name,
Author = new AuthorDto
{
Name = x.Author.Name,
}
}).ToList();
@PascalSenn Yes, that query would generate the following SQL:
SELECT [b].[Name], [a].[Name]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
which is perfect. It only retrieves the name of the book and the name of the author, as requested.
However, sending this GraphQL query to our Hot Chocolate server:
{
books {
name
author {
name
}
}
}
would generate this SQL instead:
SELECT [b].[Name], [a].[Id], [a].[Name], [a].[Age]
FROM [Books] AS [b]
INNER JOIN [Authors] AS [a] ON [b].[AuthorId] = [a].[Id]
which as you can see is unnecessarily fetching the Id
and Age
of the author, while they were not queried.
I can also create a public GitHub repo and push this project into it so you can clone it and look into it and into the generated queries yourself. A super-simple project with less than 15 classes in total.
Yes a repro would be great
Are there any custom resolvers on author?
@PascalSenn Sure! I'll do it and let you know. And no, nothing, it's a just class with a handful of properties. I've added nothing HotChocolate-specific to it at all. I'll create the repo shortly ;)
@PascalSenn Here you go, sir: https://github.com/AradAral/AspNetCoreGraphQL
Edit: I added some seed data for the database so if you want to clone the project you can just adjust the connection string and then run one initial migration, as you know, and then you can use the GraphQL endpoint and inspect the generated SQL queries. I also added a MyController
, which manually does the querying; you can check the query that that generates too.
Thanks in advance.
@PascalSenn: :)
@AradAral cool, the repro is definitely helpful. Checked the configuration didn't see anything obvious wrong with it.
@PascalSenn I just noticed something else which I think might be important:
I removed the UseSelection
middleware from the GetBooks
method, but it still gave me the author data if I queried it, which normally wouldn't happen unless you use the UseSelection
middleware.
Do you think that can tell us anything about the issue?
Did you also remove it from the resolver declared in the QueryType?
It is probably always requested in this case, because Select statement that maps from object to dto projects it already
Yes, I actually removed the whole QueryType
class and it's now just the Query
class, without the UseSelection
attribute on top of the resolver.
Now that the UseSelection
middleware is not present irrespective of what field I query everything (including the author data) is retrieved.
@PascalSenn
I think I just found out what's causing the problem:
By building an expression visitor and inspecting the expression that Hot Chocolate passes to the Select
method, I just realized that it (the expression Hot Chocolate generates) looks like this:
e => new BookDto()
{
Name = e.Name,
Author = e.Author == null ? default(AuthorDto) : new AuthorDto() { Name = e.Author.Name }
}
The important bit is where it's checking whether Author
is null or not (e.Author == null ? ...
), that, as it turns out, is exactly what's causing the whole author object to be fetched from the database. This works properly with EF Core domain classes, in other words, if e.Author
was an entity class, but not with DTO classes like this.
So, now, what you do think can be done about this? Does Hot Chocolate allow us to disable null checks so that it leaves it up to me to check for nulls, or perhaps do the null check differently so as to avoid this, or really anything that could solve this problem somehow? Thank you in advance!
Based on the analysis above, I'll close this issue.
For anyone curious, I eventually ended up writing a custom expression visitor that replaces any occurrence of e.Author == null
that Hot Chocolate generates to e.Author.Id == null
, so as to avoid the aforementioned over-fetching problem.
@PascalSenn Hi! I didn't know that, what change did you make to fix this? I thought there was nothing you could do about it.
Sorry i messed something up. This could still be an issue
we just fixed UseProjections to do this:
Author = e.Author == null ? default(AuthorDto) : new AuthorDto() { Name = e.Author.Name }
I need to check what it does with reprojection
@PascalSenn
But isn't Author = e.Author == null ? default(AuthorDto)...
what HC already did in the previous versions? What has changed exactly?
Yes, but UseProjection did something different. It returned null values. I was thinking this issue was related to that, then i read more into it :) sorry for the messup
@PascalSenn Aha! Okay, got it! No problem ;)
It looks like there's still an issue on deeply nested entities.
[UseProjection]
public IQueryable<BookDto> GetBooks([Service]AppDbContext dbContext)
{
return dbContext.Books.Select(book => new BookDto
{
Name = book.Name,
...
Author = new AuthorDto
{
Id = book.Author.Id,
Name = book.Author.Name,
Address = new AddressDto
{
Street = book.Author.Address.Street
}
}
});
}
where the Address
is still joined when we only query like so
{
books {
name
author {
name
}
}
}
@gojanpaolo Hi!
It's actually not just about deeply nested objects. Whenever you have something like [Something] = new [Something]Dto { }
in your projection that you then return in your resolvers, even one field getting requested by the GraphQL client on that [Something]Dto
causes the whole object to be retrieved from the database. I explained why that happens here.
The reason why in your case Address
is being retrieved is because, again, you're querying Author.Name
and consequently the whole Author
object is getting retrieved, which means everything inside of it is getting retrieved, including every nested object, like Address
in your case.
I was expecting the projection to behave similar to
var dto =
dbContext.Books.Select(book => new BookDto
{
Name = book.Name,
...
Author = new AuthorDto
{
Id = book.Author.Id,
Name = book.Author.Name,
Address = new AddressDto
{
Street = book.Author.Address.Street
}
}
});
var result =
dto.Select(bookDto => new
{
Name = bookDto.Name,
Author = new
{
Name = bookDto.Author.Name
}
})
.ToList();
which EF correctly translates to the SQL query I expected, i.e. no join on the Address
.
I believe this is something we can fix either in hotchocolate/UseProjection
itself, or in my graphql layer code.
Im also experiencing overfetching:
the query
users { id }
works fine and produces the following SQL
SELECT [u].[Id]
FROM [User] AS [u]
whereas this query
users {
id
author { name }
}
unecessary loads more data:
SELECT [u].[Id], [a].[Id], [a].[Name], [a].[UserId], [b].[Id], [b].[AuthorId], [b].[Title]
FROM [User] AS [u]
LEFT JOIN [Author] AS [a] ON [u].[Id] = [a].[UserId]
LEFT JOIN [Book] AS [b] ON [a].[Id] = [b].[AuthorId]
ORDER BY [u].[Id], [a].[Id], [b].[Id]
[UseProjection]
public IQueryable<User> Users([Service] DataContext ctx)
{
return ctx.User
.Include(v => v.Author)
.ThenInclude(v => v.Books);
}
Using efcore 5 and hot chocolate 11
I see this issue was closed alongside a load of AutoMapper tickets, but this doesn't seem related to AutoMapper? I'm having the same issue here using projections, where all fields are requested in the SQL query, despite only requesting one field of a sub-object
@brickscrap can you share your resolver and query?
Ignore me - despite trying to figure it out for hours before commenting here, 10 minutes after posting I found the problem. I had UseProjection enabled on the parent object, and this seemed to be causing issues. Removed it, and it's all working as expected now.
Sorry for bothering you
I am getting the same issue where it is fetching all the columns in child instead of what is selected.
query (im using serial execution)
public class Query
{
[UseProjection]
public IQueryable<PersonDto> GetPersons([Service] SomeDbContext dbContext)
{
return dbContext.Persons.Select(person => new PersonDto
{
Id = person.Id,
FirstName = person.FirstName,
LastName = person.LastName,
Guardian = person.Guardian != null ? new GuardianDto
{
Id = person.Guardian.Id,
FirstName = person.Guardian.FirstName,
LastName = person.Guardian.LastName
} : default
});
}
}
dtos (objects)
public class PersonDto
{
public int Id { get; set; }
public string FirstName { get; set; } = null!;
public string LastName { get; set; } = null!;
public GuardianDto? Guardian { get; set; } = null!;
}
public class GuardianDto
{
public int Id { get; set; }
public string FirstName { get; set; } = null!;
public string LastName { get; set; } = null!;
}
selections
query {
persons {
id
firstName
guardian {
id
firstName
}
}
}
sql
SELECT "p"."Id", "p"."FirstName", "g"."Id" IS NOT NULL, "g"."Id", "g"."FirstName", "g"."LastName"
FROM "Persons" AS "p"
LEFT JOIN "Guardians" AS "g" ON "p"."GuardianId" = "g"."Id"
here's a repo if you need the fullest context: https://github.com/nollidnosnhoj/testchocolate
EDIT: another thing interesting is when i did this with a collection of children dtos, it projected correctly.
For anyone still having this problem, I explained the reason why it's happening in this comment.
I think since this pattern of having "intermediary projections" (like DTOs) is rather common when using Hot Chocolate along with Entity Framework, it would make sense if Hot Chocolate provided a configuration setting or something on its projection engine to allow the developer to choose how the existence or non-existence of a single related entity is determined in the projection.
So let me reopen this issue:
The reason why this check is there, is to ensure correct nullability in the projection. If this check is not there the projection will fail if you run the query in memory. What format would work for you? just enforcing the format?
I think since this pattern of having "intermediary projections" (like DTOs) is rather common
It is actually not. Most of the time you do not need these intermediary objects. Normally the Types are your DTO. If you do 1:1 mappings you essentially write a DTO for nothing. The type control the visibility of fields and such. A clean Domain layer leads in most cases to a clean GraphQL schema.
@PascalSenn
The reason why this check is there, is to ensure correct nullability in the projection. If this check is not there the projection will fail if you run the query in memory.
Sure, I understand the reason for that check. I'm not saying it should be removed.
It is actually not. Most of the time you do not need these intermediary objects. Normally the Types are your DTO.
I beg to differ, Pascal. The cleanness of the underlying entity classes has nothing to do with whether or not you would need intermediary objects. Consider the following entity structure:
public class Product
{
public int Id { get; set; }
public string Title { get; set; }
public ICollection<ProductRating> Ratings { get; set; }
}
public class ProductRating
{
public int ProductId { get; set; }
public Product Product { get; set; }
public byte Rating { get; set; }
}
In this case, you might want the type Product
in your GraphQL schema to have a field called averageRating
. You don't have such a field on your entity classes and you shouldn't, as this is a "computed" field.
This is where intermediary projections would be necessary:
dbContext.Products.Select(p => new ProductDto
{
AverageRating = p.Ratings.Average(r => r.Rating),
});
Apart from computed fields, there's also a lot of low-level details in your entity structure that you would most likely want to hide from the user of your GraphQL API. An example of that would be foreign key properties like ProductId
, etc. You don't want these to appear in your GraphQL schema.
So, wouldn't you agree with me on this?
Back to the overfetching issue as a result of the null check in the projection: What I've personally done is I've written an expression visitor that replaces every occurrence of those null checks, with null a check on the ID of the object. So it turns the following (which is what HC generates):
Author = e.Author == null ? default(AuthorDto) : new AuthorDto() { ... }
into:
Author = e.Author.Id == null ? default(AuthorDto) : new AuthorDto() { ... }
With this, the overfetching issue is sovled and EF Core generates exactly the right SQL.
Everything you mentioned about computed fields and ignoring sensitive members is already solved by Hot Chocolate's type system. You can easily add new computed fields using Type Extensions and hide members from your entities using various APIs in Hot Chocolate. This does not require intermediary objects that need to be involved in the projection.
In this case, you might want the type Product in your GraphQL schema to have a field called averageRating. You don't have such a field on your entity classes and you shouldn't, as this is a "computed" field An example of that would be foreign key properties like ProductId, etc. You don't want these to appear in your GraphQL schema.
Exactly, computes fields and ignores. This is what types can do too.
// Annotation Based example, but same thing works also in code or schema first.
[ExtendObjectType(typeof(Product), , IgnoreFields = new []{nameof(Product.ProductId)})]
public class ProductExtensions {
public double GetAveragePrice(....) =>...
}
I left the service injection up to your imagination (...). You can either have a domain method that calculates the rating, you can data load the ratings, you can do both of the above and many other things.
Don't get me wrong, I think that there are some cases where a projection could be beneficial. Though what I see most often with DTO's is that it ends up in mostly 1:1 mappings and often a resolver could do the job when there is some field that you need.
Author = e.Author.Id == null ? default(AuthorDto) : new AuthorDto() { ... }
The problem here is that we do not know the key. The key could also be e.Author.SomeForeignKeyId
.
@michaelstaib Thoughts on new syntax to identify the key?
Everything you mentioned about computed fields and ignoring sensitive members is already solved by Hot Chocolate's type system. You can easily add new computed fields using Type Extensions and hide members from your entities using various APIs in Hot Chocolate. This does not require intermediary objects that need to be involved in the projection.
@tobias-tengler Regarding ignores, you're right. But computed fields, not really. Because with type extensions, the new fields that you write are not added to the projection expression, which means they would result in additional SQL roundtrips, a significant disadvantage, obviously, especially if you have a relatively large number of computed fields. With DTOs and intermediary projections, however, you won't have this problem.
There's also the issue of developer ergonomics, I would argue it's cleaner to have classes that directly correspond to your resulting GraphQL types. Again, with DTOs, you'll get this benefit as well. But with type extensions and ignores, this becomes a bit more complex to look at and harder to immediately parse in your head. But this point, of course, can be subjective.
Tell me if I'm missing something.
The problem here is that we do not know the key. The key could also be e.Author.SomeForeignKeyId
@PascalSenn I think a simple method on IObjectFieldDescriptor
would do:
descriptor.Field(o => o.Id)
.Key();
Whenever the key is specified like this, Hot Chocolate projection engine would do the null checks on this property, entirely solving this problem. What do you think?
descriptor.Field(o => o.Id).Key();
This would be the easiest solution indeed, but it might be bad in the bigger pictues.
Key()
is a pretty strong term, that could be used for many different use cases (stitching, execution engine etc.)When we identity keys on types, we can only do this in respect to the bigger picture. There are a lot of usecases where we need keys, so this API is not a easy decision.
You have to add the key for projections to work properly
Well, yes, of course this is necessary only when you have intermediary projections. I think a note in the docs in the projections section pointing this out would suffice.
What happens if you select two keys
Maybe an exception? Unless you want to support composite keys, in which case I'm not sure.
When we identity keys on types, we can only do this in respect to the bigger picture. There are a lot of usecases where we need keys, so this API is not a easy decision.
Sure, currently we don't have this key thing as far as I know, so I think we could start with this particular use case and then we could also gradually start using it for other things (like stitching and stuff, as you mentioned).
...which means they would result in additional SQL roundtrips, a significant disadvantage, obviously, especially if you have a relatively large number of computed fields.
I wanna share an other case I'm using a lot in my projects.
public class Product
{
public int Id { get; set; }
public int BrandId { get; set; }
public string Name { get; set; }
public Brand Brand { get; set; }
}
public class Brand
{
public int Id { get; set; }
public int ProductId { get; set; }
public string Name { get; set; }
public ICollection<Product> Products { get; set; }
}
public class Query
{
[UseProjection]
public async Task<IQueryable<BrandDto>> GetBrands(
[Service] ApplicationDbContext dbContext,
[Service] IProductFilterService productFilterService,
ProductFiltersInputModel productFilters)
{
var productFilterExpression = await productFilterService.BuildProductFilterExpressionAsync(productFilters with { BrandIds = null });
return dbContext.Brands.Select(brand => new BrandDto
{
Id = brand.Id,
Name = brand.Name,
TotalProductCount = dbContext.Products.Count(product => product.BrandId == brand.Id)
FilteredProductCount = dbContext.Products.Where(product => product.BrandId == brand.Id).Count(productFilterExpression),
});
}
Of Course this approach isn't that bad but a way to configure this in my ObjectType would be awesome.
public class BrandType : ObjectType<Brand>
{
protected override void Configure(IObjectTypeDescriptor<Brand> descriptor)
{
descriptor.Field("FilteredProductCount")
.Type<IntType>()
.Argument("productFilters", x => x.Type<ProductFiltersInputModel>())
.Computed(async context =>
{
var productFilters = x.ArgumentValue<ProductFiltersInputModel>("productFilters");
var productFilterService = context.Service<IProductFilterService>();
var dbContext = context.Service<ApplicationDbContext>();
var productFilterExpression = await productFilterService.BuildProductFilterExpressionAsync(productFilters with { BrandIds = null });
LambdaExpression selector = (Brand brand) => dbContext.Products.Where(product => product.BrandId == brand.Id).Count(productFilterExpression);
return selector;
});
}
}
With this approach we could fetch this computed fields from other ObjectType's.
@PascalSenn first of, thank you and the team for creating an awesome library, I am really enjoying it so far.
Secondly, I did a bit of an investigation of the original bug report and I think I have nailed down the issue.
The issue begins in this line of code as others have already pointed out.
However, removal of "NotNullAndAlso" is not enough to resolve this issue, unfortunately the issue is in this line of code as well because it is just taking the whole property, without considering what parts of the property are selected.
Basically, in the example of the original bug report what hotchocolate is doing is this:
Author = x.Author
While it should actually be doing something like this (hopefully?):
Author = new AuthorDto { Name = x.Author.Name }
Here is a screenshot of the somewhat expected code change to support this:
--
Obviously this is a naive solution (it does not take into consideration nested properties?), but this does resolve the issue of overselecting the columns from the database.
Do you think some way of "opting into" this feature would be the way to go (at least for level 1 properties)? Is this something that you and the team are planning to fix at all?
Please note that this issue is quite larger than it actually seems at first.
Thankfully in the example above, only the "Name" column was overselected, however it is highly possible for more complex applications to have something like SomeProp = x.Nav1.Nav2.Nav3.Nav4.SomeColumn
which would result in 4 extra joins, even when you are not asking for the SomeProp
in your query. Not to mention the subselects, computed stuff, TVF and all those things that could be binding to the SomeProp
. I believe you can see how undesirable this is.
I am looking forward to the discussion and getting this thing resolved!
@kresimirlesic a few lines above you see the memberinit variable, this is the variable that does the projection for the properties 'x.Author = new Author { Name = x.Author.Name }'
@PascalSenn My apologies, you are completely right. I have missed that part while I was trying to figure out what is going on :)
So to continue on the discussion....
Do you think opting out of the NotNullAndAlso would be a good and simple alternative to the Key() API which was discussed above or do you already have a plan on how to resolve this in the near future?
@PascalSenn are there any plans to resolve this issue?
Once you remove the NotNullAndAlso everything works as expected with regards to EF.
It would be really great if this was supported out of the box as this is really the behavior that most of the people would expect.
So removing NotNullAndAlso
does not fix the issue:
SELECT "b"."Title", "a"."Id", "a"."Name"
FROM "Books" AS "b"
INNER JOIN "Authors" AS "a" ON "b"."AuthorId" = "a"."Id"
this is the pull request: https://github.com/ChilliCream/hotchocolate/pull/4970 We are open for suggestions
@kresimirlesic
While it should actually be doing something like this (hopefully?): Author = new AuthorDto { Name = x.Author.Name }
I'm not sure how that would solve this issue? I thought we established the fact that a null check is necessary.
The only way this problem could be reliably solved is by switching the null check from the entity itself to its key. As I explained in this comment. So the expression would look like:
Author = b.Author?.Id == null ? null : new AuthorDto { ... }
which, when run in memory, would work with without raising any exception, and also when given to EF Core, would yield the correct SQL.
Tell me if I'm missing something.
@PascalSenn Do you still believe adding a .Key()
to IObjectFieldDescriptor
— or something similar — won't be feasible?
Correct me if I'm wrong but I find it a neat solution.
@AradAlvand
I'm not sure how that would solve this issue? I thought we established the fact that a null check is necessary.
Null check is not necessary for EF, only if the query gets executed in the memory for whatever reason, then we would get the exception.
Since Pascal said that the .Key() API might be bad in the bigger picture, I just wanted to provide a quick and working alternative to this problem while the team figures our if they want to go with the .Key() API or something else.
This will resolve the inner join out of the box, but for the left join the queryable will probably need to be tweaked a bit for it to not throw an exception. At least it will not be overselecting, which I think is a lot bigger problem.
I agree with you, the Key() API or something in that way is probably the only 100% working solution.
@kresimirlesic Okay, that makes sense. But apart from whether or not this workaround would actually work or not — because @PascalSenn said in this comment that it's apparently not working? — I still think it's better if we go with a solution that covers 100% of cases, which is the Key()
API or something similar. Because removing the null check would be problematic even for EF in cases where the relationship is optional and not required. In our book/author example, imagine if we had some books that weren't linked to an author, we would need the null check in those cases.
Don't you agree?
@PascalSenn Don't you think we could add the .Key()
thing right now to solve this issue and then add further functionality on top of it in the future if need be? Would there be anything wrong with that, Pascal?
Like what I had said in this comment:
Sure, currently we don't have this key thing as far as I know, so I think we could start with this particular use case and then we could also gradually start using it for other things (like stitching and stuff, as you mentioned).
@AradAlvand The approach @kresimirlesic suggested is indeed working. @kresimirlesic found an issue in my pull request. Very well spotted @kresimirlesic!.
@AradAlvand
@PascalSenn Don't you think we could add the .Key() thing right now to solve this issue and then add further functionality on top of it in the future if need be? Would there be anything wrong with that, Pascal?
Because it doesnt solve the issue?
Author = e.Author.Id == null ? default(AuthorDto) : new AuthorDto() { ... }
This will never be true e.Author.Id == null
. Unless the ID of author is null and i doubt that this is the intend
Following https://github.com/ChilliCream/hotchocolate/issues/2365#issuecomment-697727508 I have
Book
andAuthor
domain classes +BookDto
andAuthorDto
classes, I want to expose anIQueryable<BookDto>
for thebooks
field on myquery
root type. So here is theGetBooks
method on theQuery
class: It simply projectsBook
intoBookDto
and returns anIQueryable<BookDto>
so that Hot Chocolate could then do its magic on it.Now, when you run the following query:
You would probably expect the following SQL to ultimately get generated:
However, instead, Hot Chocolate causes this one to get generated by EF Core:
Which retrieves all the fields on the author (including
Age
andId
in this case), even though the query requested only theName
.At first, I thought this must be an EF Core issue, but then I executed the following code:
And I realized it works as expected, and generates the following SQL which only selects the requested column (author name in this case):
Therefore, this must be related to Hot Chocolate, I believe. This is really a serious problem for situations like mine when we don't want to expose
IQueryable<TheEntityFrameworkModelClass>
but rather a different type.I'm using Hot Chocolate version 10.5.2 And I've tried this with both EF Core version 3.1.8 and 5.0.0 RC.