Cache subqueries - Githubissues

vsopko commented 5 years ago

It might be a cool, unique feature to populate the cache while enumerating related subqueries, like this:

var query = context.DataSetA.Select(a => new
{
    prop1 = a.prop1,
    prop2 = context.DataSetB.Where(b => b.Id == a.Id).Cacheable().Select(b => new
    {
        b.prop2
    })
});

VahidN commented 5 years ago

Defining a sub-query here is meaningless, because the second level cache is disconnected from the context: it reads the result from the cache, not from the context and its related database. When you use the second level cache, EF's query parser can't translate this part to SQL and produce a single SQL statement for the outer and inner queries, because in the end it is an in-memory result that is completely disconnected from the context (look at it as LINQ to Objects, not LINQ to Entities). So convert your sub-query to an independent query and then use its result (LINQ to Objects).
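The suggestion above (run the inner query on its own, then use its result with LINQ to Objects) might look roughly like the following sketch. It uses plain in-memory collections so that it runs without a database. `RowA`, `RowB`, `Combined` and `SubqueryDemo` are hypothetical names made up for illustration; in a real application `rowsA` would presumably come from something like `context.DataSetA.ToList()` and `rowsB` from a separately executed, cached query over `context.DataSetB`.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row types standing in for the DataSetA / DataSetB entities.
public record RowA(int Id, string Prop1);
public record RowB(int Id, string Prop2);
public record Combined(string Prop1, IReadOnlyList<string> Prop2);

public static class SubqueryDemo
{
    // Both inputs are already-materialized results of two independent queries,
    // so everything in this method is LINQ to Objects.
    public static List<Combined> Combine(IEnumerable<RowA> rowsA, IEnumerable<RowB> rowsB)
    {
        // Group the B rows by Id once; a missing key yields an empty sequence.
        var bById = rowsB.ToLookup(b => b.Id);

        return rowsA
            .Select(a => new Combined(a.Prop1, bById[a.Id].Select(b => b.Prop2).ToList()))
            .ToList();
    }
}
```

Building the lookup once avoids rescanning the cached B rows for every A row, which is the in-memory counterpart of the correlated sub-query in the original question.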

vsopko commented 5 years ago

@VahidN, I understand that you simply take the resulting SQL and compute a hash from it, but it would be great to be able to cache joined or sub-queried entities or collections. Right now there are no caching solutions for applications more complex than a brick, where some complex data changes frequently and some does not. For example, real estate, where each entity has an address. It makes no sense to cache all entities together with their relations, because frequent cache invalidation defeats the purpose of the cache, and it also makes no sense to cache billions of addresses, because only a small part of them is requested often.

I also think it is possible to obtain the required SQL in order to hash it. The only thing needed is to modify the Cacheable extension logic to integrate tightly with EF Core. EF Core has enough places to hook into and get everything you need to produce a result: QueryCompiler, where you can modify expressions, and SelectExpression and QuerySqlGeneratorFactory, where you can get all the required data while visiting the result expression for left-joined and inner-joined tables. @SteffenMangold, what do you think about this? I see that you quickly launched some improvements for this library and studied the issue of EF Core SQL generation (waiting for integration with Redis with your implementation).

VahidN commented 5 years ago

This is not just about hashing. Second level caching works like this:

1st call: read from the database -> LINQ to Entities
2nd call: read from the cache -> LINQ to Objects
...
nth call: read from the cache -> LINQ to Objects
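The call sequence above can be illustrated with a minimal read-through cache. This is only a sketch of the general idea, not the library's actual implementation: `QueryResultCache`, `GetOrLoad` and `DatabaseReads` are hypothetical names, and `queryKey` stands in for whatever hash the library derives from the query.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for a second level cache.
public sealed class QueryResultCache
{
    private readonly Dictionary<string, object> _entries = new();

    // Counts how often the loader (the "database") was actually invoked.
    public int DatabaseReads { get; private set; }

    public List<T> GetOrLoad<T>(string queryKey, Func<List<T>> loadFromDatabase)
    {
        // 2nd..nth call: the materialized list is returned from memory,
        // so anything done with it afterwards is LINQ to Objects.
        if (_entries.TryGetValue(queryKey, out var cached))
            return (List<T>)cached;

        // 1st call: execute the real query and remember its result.
        DatabaseReads++;
        var rows = loadFromDatabase();
        _entries[queryKey] = rows;
        return rows;
    }
}
```

Calling `GetOrLoad` twice with the same key invokes the loader only once; the second call hands back the stored list, which is why the cached part cannot take part in SQL translation.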

You can combine LINQ to Objects with LINQ to Entities in EF Core; this is called client evaluation, and in this case it fetches the whole records and then applies the client's logic. Why? Because the SQL Server provider has no insight into how this method is implemented, so it cannot translate it into SQL. If your query returns 1000 rows from the database, your processor will spend a long time running your caching method 1000 times. So you can't efficiently combine these two as outer and inner queries here.

SteffenMangold commented 5 years ago

@vsopko "there is no sense to cache billions of addresses, because often requested is only small part of them" There is a mechanism for this: DbSet.Local. You can already load all Addresses into Local and then query from the local store.

Your initial query is simply not possible, because it would require telling the database the complete cached result and joining on it. The query you wrote is simple sub-select syntax, written in LINQ.

Because of the problems with ToSql I recently created a different caching solution that works closer to the internal EF Core logic (I think). You can find it here.

VahidN / EFSecondLevelCache.Core

Cache subqueries #30